Hello,
I wonder if the number of computations required to compute a derivative structure can be reduced. Say the derivative structure of f to order d=2 is desired: f(g1(x1), g2(x2), ... gn(xn)) where x1,x2 ... xn are subsets of variables that are not necessarily disjoint. For each gi(xi), O(n^2) variables need to be computed while I only need [xi]^2 different variables. (Where [xi] stands for the cardinality of the set xi) In the extreme case say x1...xn are single variables (sets of cardinality 1): To compute the derivative structure to order 2 of: f(sin(x1)+cos(x2)+... tan(xn)) O(n^2) values for each of the n components needs to be computed (or at least memory allocated for that much): so that is a total of over O(n^3) values. If there were a way of computing only derivatives of relevant variables, each term of f() would require only 2 calculations or a total of 2n calculations. So, is there a way to compute the derivative of each component separately and "join" them together? When the derivative structure is sparse, this should reduce computation required from O(n^3) to O(n) ... quite a large saving when n is large. Any comments on this? ... or errors in this line of though? Thanks, Ajo. |
Hi Ajo,
Le 29/08/2013 19:05, Ajo Fod a écrit : > Hello, > > I wonder if the number of computations required to compute a derivative > structure can be reduced. Say the derivative structure of f to order d=2 is > desired: > f(g1(x1), g2(x2), ... gn(xn)) where x1,x2 ... xn are subsets of variables > that are not necessarily disjoint. > > For each gi(xi), O(n^2) variables need to be computed while I only need > [xi]^2 different variables. (Where [xi] stands for the cardinality of the > set xi) > > In the extreme case say x1...xn are single variables (sets of cardinality > 1): > To compute the derivative structure to order 2 of: > f(sin(x1)+cos(x2)+... tan(xn)) > O(n^2) values for each of the n components needs to be computed (or at > least memory allocated for that much): > so that is a total of over O(n^3) values. > > If there were a way of computing only derivatives of relevant variables, > each term of f() would require only 2 calculations or a total of 2n > calculations. > > So, is there a way to compute the derivative of each component separately > and "join" them together? When the derivative structure is sparse, this > should reduce computation required from O(n^3) to O(n) ... quite a large > saving when n is large. It is possible, but implies doing all the "join" work by yourself. The first step is to compute each gi function using DerivativeStructure instances configured to one parameter only (regardless of i) and second order: DerivativeStructure x1 = new DerivativeStructure(1, 2, 0, x1Value); DerivativeStructure g1x1 = g1.value(x1); DerivativeStructure x2 = new DerivativeStructure(1, 2, 0, x2Value); DerivativeStructure g2x2 = g2.value(x2); ... DerivativeStructure xn = new DerivativeStructure(1, 2, 0, xnValue); DerivativeStructure gnxn = gn.value(xn); As each gi.value method is called, it gets the value for one variable only and is not aware there are other instances lying around. It only computes gi, dgi/dxi, d^2gi/dxi^2, so three numbers each. The number of computations here remains linear in the number of variables. Now comes the tricky and manual part: computing f. This method must be aware of the full variable set for which you want partial derivatives. You can do that by "expanding" your DerivativeStructure instances to add 0 derivatives. In the case the xi all have cardinality 1 and you want xi to be mapped to index i-1 in the n-cardinality full set, you can use something similar to: public DerivativeStructure mapGToFullSet(final int parameters, final int index, final DerivativeStructure g) { final int order = g.getOrder(); final DSCompiler compiler = DSCompiler.getCompiler(parameters, index); final double[] derivatives = new double[compiler.getSize()]; final int[] orders = new int[order + 1]; // copy d^k g / dx^k at its dedicated index in the expanded data for (int k = 0; k <= order; ++k) { order[index] = k; derivatives[compiler.getPartialDerivativeIndex(orders)] = g.getPartialDarivatives(k); order[index] = O; } return new DerivativeStructure(parameters, order, derivatives); } So you would do DerivativeStructure g1 = mapGToFullSet(n, 0, g1x1); DerivativeStructure g2 = mapGToFullSet(n, 1, g2x2); ... DerivativeStructure gn = mapGToFullSet(n, n - 1, gnxn); And finally DerivativeStructure fx1xn = f.value(g1, g2, ..., gn); The mapping part allocates n full derivatives arrays, but only initialize very few elements in them, letting most of the elements set to 0. It also don't compute anything, it is simply data copying. It is probably O(n^2) for order two derivatives, but with a small coefficient. The f.value() part is a full-blown computation of all derivatives. It may reamin costly as is. In the general case you would save computation on the gi, not on f. If f is really a univariate function (like in your f(sin(x1)+cos(x2)+... tan(xn)) case), it may be possible to save a little more. You would have to compute first y = sin(x1)+cos(x2)+... tan(xn) with all derivatives, then build an intermediate single variable yTmp and apply f to it (i.e. the internal computation in f would only consider a single variable), and then combine the derivatives as follows: DerivativeStructure y = ...; // full sin(x1)+cos(x2)+... DerivativeStructure yTmp = new DerivativeStructure(1, 2, 0, y.getValue()); DerivativeStructure fTmp = f.value(yTmp); // only one variable, fast! DerivativeStructure z = y.compose(fTmp.getAllDerivatives()); return z; Note that if the xi variables are not cardianlity 1 but rather a subset of a larger set (say x1 is (a, b), x2 is (a, c), ... xn is (a, b, c)), then the mapping function becomes more complicated. It should use the compilers of both th initial reduced set and the complete set (in the example above our compiler instance was only for the complete set), and it should also create an orders derivatives array for the reducced set (in the example above, we directly used g.getPartialDarivatives(k) because in the special case of one parameter only, it is guaranteed that the index and the derivation order match). The mapping function should also use some indirection table to know for each variable x1, x2 ... what is the reduced set that was used to compute the derivatives (in the example above, we simply used an index argument telling the index of the single variable in the full set). I don't know if the most general mapping function would really be interesting or not. If you feel like contributing such a function, I would be happy to review it. best regards, Luc > > Any comments on this? ... or errors in this line of though? > > Thanks, > Ajo. > --------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email] |
Thanks for this analysis. In my case, I require the more generic version
with overlapping variable sets for component functions. Also, I need to worry about multiplication of component functions too. I'll think about the most efficient way to do this. Cheers, Ajo. On Fri, Aug 30, 2013 at 12:21 AM, Luc Maisonobe <[hidden email]>wrote: > Hi Ajo, > > Le 29/08/2013 19:05, Ajo Fod a écrit : > > Hello, > > > > I wonder if the number of computations required to compute a derivative > > structure can be reduced. Say the derivative structure of f to order d=2 > is > > desired: > > f(g1(x1), g2(x2), ... gn(xn)) where x1,x2 ... xn are subsets of variables > > that are not necessarily disjoint. > > > > For each gi(xi), O(n^2) variables need to be computed while I only need > > [xi]^2 different variables. (Where [xi] stands for the cardinality of the > > set xi) > > > > In the extreme case say x1...xn are single variables (sets of cardinality > > 1): > > To compute the derivative structure to order 2 of: > > f(sin(x1)+cos(x2)+... tan(xn)) > > O(n^2) values for each of the n components needs to be computed (or at > > least memory allocated for that much): > > so that is a total of over O(n^3) values. > > > > If there were a way of computing only derivatives of relevant variables, > > each term of f() would require only 2 calculations or a total of 2n > > calculations. > > > > So, is there a way to compute the derivative of each component separately > > and "join" them together? When the derivative structure is sparse, this > > should reduce computation required from O(n^3) to O(n) ... quite a large > > saving when n is large. > > It is possible, but implies doing all the "join" work by yourself. > > The first step is to compute each gi function using DerivativeStructure > instances configured to one parameter only (regardless of i) and second > order: > > DerivativeStructure x1 = new DerivativeStructure(1, 2, 0, x1Value); > DerivativeStructure g1x1 = g1.value(x1); > DerivativeStructure x2 = new DerivativeStructure(1, 2, 0, x2Value); > DerivativeStructure g2x2 = g2.value(x2); > ... > DerivativeStructure xn = new DerivativeStructure(1, 2, 0, xnValue); > DerivativeStructure gnxn = gn.value(xn); > > As each gi.value method is called, it gets the value for one variable > only and is not aware there are other instances lying around. It only > computes gi, dgi/dxi, d^2gi/dxi^2, so three numbers each. The number of > computations here remains linear in the number of variables. > > Now comes the tricky and manual part: computing f. This method must be > aware of the full variable set for which you want partial derivatives. > You can do that by "expanding" your DerivativeStructure instances to add > 0 derivatives. In the case the xi all have cardinality 1 and you want xi > to be mapped to index i-1 in the n-cardinality full set, you can use > something similar to: > > public DerivativeStructure mapGToFullSet(final int parameters, > final int index, > final DerivativeStructure g) { > final int order = g.getOrder(); > final DSCompiler compiler = > DSCompiler.getCompiler(parameters, index); > final double[] derivatives = new double[compiler.getSize()]; > final int[] orders = new int[order + 1]; > > // copy d^k g / dx^k at its dedicated index in the expanded data > for (int k = 0; k <= order; ++k) { > order[index] = k; > derivatives[compiler.getPartialDerivativeIndex(orders)] = > g.getPartialDarivatives(k); > order[index] = O; > } > > return new DerivativeStructure(parameters, order, derivatives); > > } > > So you would do > > DerivativeStructure g1 = mapGToFullSet(n, 0, g1x1); > DerivativeStructure g2 = mapGToFullSet(n, 1, g2x2); > ... > DerivativeStructure gn = mapGToFullSet(n, n - 1, gnxn); > > And finally > > DerivativeStructure fx1xn = f.value(g1, g2, ..., gn); > > The mapping part allocates n full derivatives arrays, but only > initialize very few elements in them, letting most of the elements set > to 0. It also don't compute anything, it is simply data copying. It is > probably O(n^2) for order two derivatives, but with a small coefficient. > > The f.value() part is a full-blown computation of all derivatives. It > may reamin costly as is. In the general case you would save computation > on the gi, not on f. If f is really a univariate function (like in your > f(sin(x1)+cos(x2)+... tan(xn)) case), it may be possible to save a > little more. You would have to compute first y = sin(x1)+cos(x2)+... > tan(xn) with all derivatives, then build an intermediate single variable > yTmp and apply f to it (i.e. the internal computation in f would only > consider a single variable), and then combine the derivatives as follows: > > DerivativeStructure y = ...; // full sin(x1)+cos(x2)+... > DerivativeStructure yTmp = > new DerivativeStructure(1, 2, 0, y.getValue()); > DerivativeStructure fTmp = f.value(yTmp); // only one variable, fast! > DerivativeStructure z = y.compose(fTmp.getAllDerivatives()); > return z; > > > Note that if the xi variables are not cardianlity 1 but rather a subset > of a larger set (say x1 is (a, b), x2 is (a, c), ... xn is (a, b, c)), > then the mapping function becomes more complicated. It should use the > compilers of both th initial reduced set and the complete set (in the > example above our compiler instance was only for the complete set), and > it should also create an orders derivatives array for the reducced set > (in the example above, we directly used g.getPartialDarivatives(k) > because in the special case of one parameter only, it is guaranteed that > the index and the derivation order match). The mapping function should > also use some indirection table to know for each variable x1, x2 ... > what is the reduced set that was used to compute the derivatives (in the > example above, we simply used an index argument telling the index of the > single variable in the full set). I don't know if the most general > mapping function would really be interesting or not. If you feel like > contributing such a function, I would be happy to review it. > > best regards, > Luc > > > > > Any comments on this? ... or errors in this line of though? > > > > Thanks, > > Ajo. > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [hidden email] > For additional commands, e-mail: [hidden email] > > |
Free forum by Nabble | Edit this page |