Seems like the DerivativeCompiler returns NaN.
IMHO it should return 0.

Is this worthy of an issue?

Thanks,
-Ajo
Hi!
IMHO, 0^x should equal 0. But please note that 0^0 is not defined, so it should be NaN.

---
TIA,
Rodion

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]
On Fri, 23 Aug 2013 07:17:35 -0700, Ajo Fod wrote:
> Seems like the DerivativeCompiler returns NaN.
>
> IMHO it should return 0.

What should be 0? And why?

> Is this worthy of an issue?

As is, no.

Gilles
Try this and I'm happy to explain if necessary:
import org.apache.commons.math3.analysis.differentiation.DerivativeStructure;

public class Derivative {

    public static void main(final String[] args) {
        // x: free parameter 0 (of 1), derivation order 1, value 1.0
        DerivativeStructure dsA = new DerivativeStructure(1, 1, 0, 1d);
        System.out.println("Derivative of constant^x wrt x");
        for (int a = -3; a < 3; a++) {
            // the base is built as a constant (it does not depend on x)
            final DerivativeStructure a_ds = new DerivativeStructure(1, 1, a);
            // a^x through the general pow(DerivativeStructure) method
            final DerivativeStructure out = a_ds.pow(dsA);
            System.out.format("Derivative@%d=%f\n", a,
                              out.getPartialDerivative(new int[]{1}));
        }
    }
}
Search the archives for discussion of 0^0.
Phil
Hi Ajo,
On 23/08/2013 17:48, Ajo Fod wrote:
> Try this and I'm happy to explain if necessary:
>
> public class Derivative {
>
>     public static void main(final String[] args) {
>         DerivativeStructure dsA = new DerivativeStructure(1, 1, 0, 1d);
>         System.out.println("Derivative of constant^x wrt x");
>         for (int a = -3; a < 3; a++) {

We have chosen the classical definition, which implies that c^r is not
defined for real r and negative c.

Our implementation is based on the decomposition c^r = exp(r * ln(c)),
so the NaN comes from the logarithm when c <= 0.

Note also that, as explained in the documentation here:
<http://commons.apache.org/proper/commons-math/userguide/analysis.html#a4.7_Differentiation>,
there are no concepts of "constants" and "variables" in this framework,
so we cannot draw a line between c^r seen as a univariate function of r,
as a univariate function of c, as a bivariate function of c and r, or
even as a pentavariate function of p1, p2, p3, p4, p5 with both c and r
being computed elsewhere from p1...p5. So we don't make special cases,
for example for c = 0.

Does this explanation make sense to you?

best regards,
Luc
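The decomposition c^r = exp(r * ln(c)) mentioned above can be tried with plain doubles. What follows is a minimal sketch under stated assumptions, not Commons Math code: the class and method names are hypothetical, and the derivative is written out by hand as ln(c) * c^r instead of going through DerivativeStructure.

```java
public class PowViaExpLog {

    /** c^r computed as exp(r * ln(c)); NaN whenever ln(c) is NaN (c < 0). */
    static double powViaExpLog(final double c, final double r) {
        return Math.exp(r * Math.log(c));
    }

    /** Derivative with respect to r of exp(r * ln(c)), i.e. ln(c) * c^r. */
    static double dPowDr(final double c, final double r) {
        return Math.log(c) * powViaExpLog(c, r);
    }

    public static void main(final String[] args) {
        // one negative, one zero and one positive base, all at r = 1
        for (final double c : new double[] {-2.0, 0.0, 2.0}) {
            System.out.format("c=%.1f  c^1=%f  d/dr=%f%n",
                              c, powViaExpLog(c, 1.0), dPowDr(c, 1.0));
        }
    }
}
```

For c = -2 the logarithm is NaN, so both the value and the derivative are NaN. For c = 0 the logarithm is -Infinity: the value exp(-Infinity) is 0, but the derivative becomes -Infinity * 0, which is NaN in floating-point arithmetic.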
Hello,
This shows one way of interpreting the derivative for strictly positive numbers.

// EPS is not defined in the original post; a small finite-difference
// step such as 1e-6 is assumed here.
private static final double EPS = 1e-6;

public static void main(final String[] args) {
    final double x = 1d;
    DerivativeStructure dsA = new DerivativeStructure(1, 1, 0, x);
    System.out.println("Derivative of |a|^x wrt x");
    for (int p = 10; p < 21; p++) {
        // a = 2^-p for p < 20, then exactly 0 on the last iteration
        double a;
        if (p < 20) {
            a = 1d / Math.pow(2d, p);
        } else {
            a = 0d;
        }
        final DerivativeStructure a_ds = new DerivativeStructure(1, 1, a);
        final DerivativeStructure out = a_ds.pow(dsA);
        // forward finite difference of a^x with respect to x
        final double calc = (Math.pow(a, x + EPS) - Math.pow(a, x)) / EPS;
        System.out.format("Derivative@%f=%f %f\n", a, calc,
                          out.getPartialDerivative(new int[]{1}));
    }
}

At this point I'm explicitly substituting the rule that derivative(|a|^x) = 0 for |a| = 0.

Thanks,
Ajo.
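The finite-difference comparison above can also be run without Commons Math. The sketch below is only an illustration: the class name, the helper forwardDiff and the step 1e-6 are assumptions (the post does not say what EPS is), and the reference value a^x * ln(a) is the textbook derivative for a > 0.

```java
public class ForwardDifference {

    /** Forward finite difference of a^x with respect to x, using step eps. */
    static double forwardDiff(final double a, final double x, final double eps) {
        return (Math.pow(a, x + eps) - Math.pow(a, x)) / eps;
    }

    public static void main(final String[] args) {
        final double eps = 1e-6; // assumed step; the post does not define EPS
        final double x = 1d;
        for (int p = 10; p < 21; p++) {
            // same sequence as in the post: 2^-10 ... 2^-19, then exactly 0
            final double a = (p < 20) ? 1d / Math.pow(2d, p) : 0d;
            // the analytic derivative a^x * ln(a) is only evaluated for a > 0
            final double exact = (a > 0) ? Math.pow(a, x) * Math.log(a) : Double.NaN;
            System.out.format("a=%e  finite difference=%e  a^x*ln(a)=%e%n",
                              a, forwardDiff(a, x, eps), exact);
        }
    }
}
```

At x = 1 the finite difference shrinks towards 0 together with a, and it is exactly 0 when a = 0, since both Math.pow(0, 1 + eps) and Math.pow(0, 1) are 0.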
On 23/08/2013 19:20, Ajo Fod wrote:
> Hello,

Hi Ajo,

> This shows one way of interpreting the derivative for strictly +ve numbers.
>
> At this point I'm explicitly substituting the rule that derivative(|a|^x) =
> 0 for |a|=0.

Yes, but this fails for x = 0, as the limit of the finite difference is
-infinity and not 0.

You can build your own function which explicitly assumes a is constant
and takes care of special values as follows:

public static DerivativeStructure aToX(final double a,
                                       final DerivativeStructure x) {
    // ln(a), forced to -infinity at the a = 0, x = 0 singularity
    final double lnA = (a == 0 && x.getValue() == 0) ?
                       Double.NEGATIVE_INFINITY :
                       FastMath.log(a);
    // function[i] holds the i-th derivative of a^x: ln(a)^i * a^x
    final double[] function = new double[1 + x.getOrder()];
    function[0] = FastMath.pow(a, x.getValue());
    for (int i = 1; i < function.length; ++i) {
        function[i] = lnA * function[i - 1];
    }
    // chain rule: combine with the derivatives already carried by x
    return x.compose(function);
}

This will work and provides derivatives to any order for almost any
values of a and x, including a=0, x=1 as in your example, and it also
behaves slightly better for a=0, x=0. However, it still has an important
drawback: it won't compute the n-th order derivative correctly for a=0,
x=0 and n > 1. It will provide NaN for these higher order derivatives
instead of +/-infinity according to the parity of n.

This is a known problem that we already encountered when dealing with
rootN. Here is an extract of a comment in the test case
testRootNSingularity, where a similar NaN appears instead of +/-infinity.
The dsZero instance in the comment is simply the x parameter of the
function, as a DerivativeStructure with value 0.0 and depending on
itself (dsZero = new DerivativeStructure(1, maxOrder, 0, 0.0)):

// the following checks shows a LIMITATION of the current implementation
// we have no way to tell dsZero is a pure linear variable x = 0
// we only say: "dsZero is a structure with value = 0.0,
// first derivative = 1.0, second and higher derivatives = 0.0".
// Function composition rule for second derivatives is:
//     d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x)
// when function f is the nth root and x = 0 we have:
//     f(0) = 0, f'(0) = +infinity, f''(0) = -infinity (and higher
//     derivatives keep switching between +infinity and -infinity)
// so given that in our case dsZero represents g, we have g(x) = 0,
// g'(x) = 1 and g''(x) = 0
// applying the composition rules gives:
//     d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x)
//                     = -infinity * 1^2 + +infinity * 0
//                     = -infinity + NaN
//                     = NaN
// if we knew dsZero is really the x variable and not the identity
// function applied to x, we would not have computed f'(g(x)) * g''(x)
// and we would have found that the result was -infinity and not NaN

Hope this helps
Luc
On 24/08/2013 11:24, Luc Maisonobe wrote:
> This will work and provides derivatives to any order for almost any
> values of a and x, including a=0, x=1 as in your example, but also
> slightly better for a=0, x=0. However, it still has an important
> drawback: it won't compute the n-th order derivative correctly for a=0,
> x=0 and n > 1. It will provide NaN for these higher order derivatives
> instead of +/-infinity according to parity of n.

I have added a similar function to the DerivativeStructure class (with
some errors above corrected). The main interesting property of this
function is that it is more accurate than converting a to a
DerivativeStructure and using the general x^y function. It does its best
to handle the special case but, as written above, this does NOT work for
general combinations (i.e. more than one variable or more than one
order). As soon as there is a combination, the derivative will involve
something like df/dx * dg/dy, and as infinities and zeros are everywhere,
NaN appears immediately for these partial derivatives. This cannot be
avoided.

If you stay away from the singularity, the function behaves correctly.

best regards,
Luc
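The message does not say which errors were corrected. One thing that can be observed in the aToX function as posted earlier is that a = 0 with x > 0 gives lnA = -infinity and function[0] = 0, so function[1] = -infinity * 0 = NaN. The sketch below shows one possible way to special-case that situation. It is an assumption made for illustration, not the code that was committed: the class and method names are hypothetical, it uses plain doubles, and it only produces the derivatives of a^x at x, leaving out the compose step.

```java
import java.util.Arrays;

public class ConstantBasePow {

    /**
     * Returns {f(x), f'(x), ..., f^(order)(x)} for f(x) = a^x with constant a.
     * The a == 0, x > 0 branch is an assumption of this sketch.
     */
    static double[] derivatives(final double a, final double x, final int order) {
        final double[] d = new double[order + 1];
        d[0] = Math.pow(a, x);
        if (a == 0 && x > 0) {
            // 0^x is identically 0 for x > 0, so every derivative stays at 0
            return d;
        }
        // general recurrence: the i-th derivative is ln(a) times the previous one
        final double lnA = Math.log(a);
        for (int i = 1; i < d.length; i++) {
            d[i] = lnA * d[i - 1];
        }
        return d;
    }

    public static void main(final String[] args) {
        System.out.println(Arrays.toString(derivatives(0.0, 1.0, 2)));
        System.out.println(Arrays.toString(derivatives(0.0, 0.0, 3)));
        System.out.println(Arrays.toString(derivatives(2.0, 1.0, 1)));
    }
}
```

With a = 0 and x = 1 all entries are 0. With a = 0 and x = 0 the value is 1 and the derivatives alternate between -Infinity and +Infinity, matching the "+/-infinity according to parity of n" described in the thread. Combining such infinite entries with the derivatives of x through the chain rule is where the NaN discussed above comes back.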
Are you saying you patched the code? Can you provide the link?
-Ajo
Ajo Fod <[hidden email]> wrote:
> Are you saying patched the code? Can you provide the link?

I committed it in the development version. You just have to update your
checked out copy from either the official Apache subversion repository
or the git mirror we talked about in a previous thread.

The new method is a static one called pow, taking a and x as arguments
and returning a^x. It is not to be confused with the non-static methods
that take only the power as an argument (either int, double or
DerivativeStructure) and use the instance as the base to raise to that
power.

Best regards,
Luc
With regard to what is happening in DSCompiler.pow():

IMHO, when a == 0 and x >= 0 the function is well behaved, because log|a| -> Inf more slowly than a^x -> 0. I got to this by simulation. One could probably get to something more conclusive using L'Hôpital's rule: http://en.wikipedia.org/wiki/L%27H%C3%B4pital%27s_rule. There is a result there about the behavior of x log(x) as x -> 0+.

So, I propose this:

    if (a == 0) {
        if (operand[operandOffset] >= 0) {
            for (int i = 0; i < function.length; ++i) {
                function[i] = 0;
            }
        } else {
            for (int i = 0; i < function.length; ++i) {
                function[i] = Double.NaN;
            }
        }
    } else {

in place of:

    if (a == 0) {
        if (operand[operandOffset] == 0) {
            function[0] = 1;
            double infinity = Double.POSITIVE_INFINITY;
            for (int i = 1; i < function.length; ++i) {
                infinity = -infinity;
                function[i] = infinity;
            }
        }
    } else {

PS: I think you made a change to DSCompiler.pow too. If so, what happens when a=0 & x!=0 in that function? |
On a side note: given a DerivativeStructure ds, wouldn't it be nice to generate a constant derivative structure with something like:

    ds.getConstant(double value);

Currently I'm doing something like:

    new DerivativeStructure(length, order, value);

... which seems more verbose than necessary when I already have the order and length information in the existing ds all around.

Cheers,
Ajo. |
In reply to this post by Ajo Fod
On 26/08/2013 17:23, Ajo Fod wrote:
> With regards to what is happening in DsCompiler.pow():
> IMHO, when a==0 and x>=0 the function is well behaved because log|a| -> Inf
> slower than a^x -> 0. I got to this by simulation.
> One could probably get to something more conclusive using L'Hopital rule :
> http://en.wikipedia.org/wiki/L%27H%C3%B4pital%27s_rule.
> There is one about xlog(x) behavior as x->0+.
>
> So, I propose this:
>
> if (a == 0) {
>     if (operand[operandOffset] >= 0) {
>         for (int i = 0; i < function.length; ++i) {
>             function[i] = 0;

No. The limit value when x->0+ is 1, not 0. This can be seen by drawing the graphs of a^x for x between 0 and 2, for example, and several values of a, say a=3, a=1, a=0.5, a=0.1, a=0.01, a=0.001.

When a > 1, the graph starts at y=1 with a positive slope and increases exponentially to infinity. When a = 1, the graph is the horizontal line y=1. When 0 < a < 1, the graph starts at y=1 with a negative slope and decreases, asymptotically approaching the line y=0. The smaller a is, the more negative the initial slope is. For very small values of a, the curve dives very quickly from its initial value y=1.

The nth derivative of a^x can be computed analytically as ln(a)^n a^x, so the initial slope at x=0 is simply ln(a): positive for a > 1, zero for a = 1, negative for 0 < a < 1, with a limit at -infinity when a -> 0+. The limit curve corresponding to a = 0 is therefore a singular function with f(0) = 1 and f(x) = 0 for all x > 0. The fact that f(0) = 1 and not 0 is consistent with the derivative being negative infinity: by definition the derivative is the limit of [f(0+h) - f(0)] / h when h->0+, and here the finite difference is -1/h.

>         }
>     }else{
>         for (int i = 0; i < function.length; ++i) {
>             function[i] = Double.NaN;
>         }

This alternative case is a good improvement, thanks for it. I forgot to handle negative cases properly. I have therefore changed the code (committed as r1517788) with this improvement, together with several test cases.
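The limit argument above can be checked numerically. The following is only a minimal sketch using plain java.lang.Math rather than the Commons Math classes; the class name PowLimitSketch and its two helper methods are hypothetical names introduced for illustration and are not part of the library. It evaluates the analytic n-th derivative ln(a)^n a^x and the one-sided finite difference at x=0.

```java
final class PowLimitSketch {

    /** n-th derivative of x -> a^x, i.e. ln(a)^n * a^x, for a constant base a >= 0. */
    static double nthDerivative(final double a, final double x, final int n) {
        return Math.pow(Math.log(a), n) * Math.pow(a, x);
    }

    /** One-sided finite difference [f(0 + h) - f(0)] / h for f(x) = a^x. */
    static double forwardDifferenceAtZero(final double a, final double h) {
        return (Math.pow(a, h) - Math.pow(a, 0.0)) / h;
    }

    public static void main(final String[] args) {
        // The initial slope ln(a) becomes more and more negative as a -> 0+.
        for (final double a : new double[] {3.0, 1.0, 0.5, 0.1, 0.01, 0.001}) {
            System.out.format("a=%g slope at x=0: %g%n", a, nthDerivative(a, 0.0, 1));
        }
        // For a = 0 the finite difference is -1/h, which diverges as h -> 0+.
        for (final double h : new double[] {0.5, 0.25, 0.125}) {
            System.out.format("h=%g difference: %g%n", h, forwardDifferenceAtZero(0.0, h));
        }
    }
}
```

With a = 0 and x = 0, Math.log returns negative infinity and Math.pow(0.0, 0.0) returns 1, so the sketch gives -infinity for the first derivative and +infinity for the second, which matches the alternating signs in the special-case branch quoted in this thread. Away from x = 0 with a = 0 the same formula multiplies an infinity by zero and yields NaN, so it says nothing about that case.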
>     }
> } else {
>
> in place of :
>
> if (a == 0) {
>     if (operand[operandOffset] == 0) {
>         function[0] = 1;
>         double infinity = Double.POSITIVE_INFINITY;
>         for (int i = 1; i < function.length; ++i) {
>             infinity = -infinity;
>             function[i] = infinity;
>         }
>     }
> } else {
>
> PS: I think you made a change to DSCompiler.pow too. If so, what happens
> when a=0 & x!=0 in that function?

No, I didn't change the other signatures of the pow function. So the value should be OK (i.e. 1) but all derivatives, including the first one, should be NaN. What the new function brings is a correct negative infinity first derivative at the singularity point, better accuracy for non-singular points, and possibly faster computation.

best regards,
Luc

> On Mon, Aug 26, 2013 at 12:38 AM, Luc Maisonobe <[hidden email]> wrote:
>
>> Ajo Fod <[hidden email]> wrote:
>>> Are you saying patched the code? Can you provide the link?
>>
>> I committed it in the development version. You just have to update your
>> checked out copy from either the official
>> Apache subversion repository or the git mirror we talked about in a
>> previous thread.
>>
>> The new method is a static one called pow and taking a and x as arguments
>> and returning a^x. Not to
>> be confused with the non-static methods that take only the power as
>> argument (either int, double or
>> DerivativeStructure) and use the instance as the base to apply power on.
>>
>> Best regards,
>> Luc
>>
>>> -Ajo
>>>
>>> On Sun, Aug 25, 2013 at 1:20 PM, Luc Maisonobe <[hidden email]>
>>> wrote:
>>>
>>>> On 24/08/2013 11:24, Luc Maisonobe wrote:
>>>>> On 23/08/2013 19:20, Ajo Fod wrote:
>>>>>> Hello,
>>>>>
>>>>> Hi Ajo,
>>>>>
>>>>>> This shows one way of interpreting the derivative for strictly +ve
>>>> numbers.
>>>>>>
>>>>>> public static void main(final String[] args) {
>>>>>>     final double x = 1d;
>>>>>>     DerivativeStructure dsA = new DerivativeStructure(1, 1, 0, x);
>>>>>>     System.out.println("Derivative of |a|^x wrt x");
>>>>>>     for (int p = 10; p < 21; p++) {
>>>>>>         double a;
>>>>>>         if (p < 20) {
>>>>>>             a = 1d / Math.pow(2d, p);
>>>>>>         } else {
>>>>>>             a = 0d;
>>>>>>         }
>>>>>>         final DerivativeStructure a_ds = new DerivativeStructure(1, 1, a);
>>>>>>         final DerivativeStructure out = a_ds.pow(dsA);
>>>>>>         final double calc = (Math.pow(a, x + EPS) - Math.pow(a, x)) / EPS;
>>>>>>         System.out.format("Derivative@%f=%f %f\n", a, calc,
>>>>>>                           out.getPartialDerivative(new int[]{1}));
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> At this point I'm explicitly substituting the rule that
>>>>>> derivative(|a|^x) = 0 for |a|=0.
>>>>>
>>>>> Yes, but this fails for x = 0, as the limit of the finite difference is
>>>>> -infinity and not 0.
>>>>>
>>>>> You can build your own function which explicitly assumes a is constant
>>>>> and takes care of special values as follows:
>>>>>
>>>>> public static DerivativeStructure aToX(final double a,
>>>>>                                        final DerivativeStructure x) {
>>>>>     final double lnA = (a == 0 && x.getValue() == 0) ?
>>>>>                        Double.NEGATIVE_INFINITY :
>>>>>                        FastMath.log(a);
>>>>>     final double[] function = new double[1 + x.getOrder()];
>>>>>     function[0] = FastMath.pow(a, x.getValue());
>>>>>     for (int i = 1; i < function.length; ++i) {
>>>>>         function[i] = lnA * function[i - 1];
>>>>>     }
>>>>>     return x.compose(function);
>>>>> }
>>>>>
>>>>> This will work and provides derivatives to any order for almost any
>>>>> values of a and x, including a=0, x=1 as in your example, but also
>>>>> slightly better for a=0, x=0. However, it still has an important
>>>>> drawback: it won't compute the n-th order derivative correctly for a=0,
>>>>> x=0 and n > 1. It will provide NaN for these higher order derivatives
>>>>> instead of +/-infinity according to parity of n.
>>>>
>>>> I have added a similar function to the DerivativeStructure class (with
>>>> some errors above corrected). The main interesting property of this
>>>> function is that it is more accurate than converting a to a
>>>> DerivativeStructure and using the general x^y function. It does its best
>>>> to handle the special case, but as written above, this does NOT work for
>>>> general combination (i.e. more than one variable or more than one
>>>> order). As soon as there is a combination, the derivative will involve
>>>> something like df/dx * dg/dy and as infinities and zeros are everywhere,
>>>> NaN appears immediately for these partial derivatives. This cannot be
>>>> avoided.
>>>>
>>>> If you stay away from the singularity, the function behaves correctly.
>>>>
>>>> best regards,
>>>> Luc
>>>>
>>>>>
>>>>> This is a known problem that we already encountered when dealing with
>>>>> rootN. Here is an extract of a comment in the test case
>>>>> testRootNSingularity, where similar NaN appears instead of +/- infinity.
>>>>> The dsZero instance in the comment is simply the x parameter of the
>>>>> function, as a DerivativeStructure with value 0.0 and depending on
>>>>> itself (dsZero = new DerivativeStructure(1, maxOrder, 0, 0.0)):
>>>>>
>>>>> // the following checks shows a LIMITATION of the current implementation
>>>>> // we have no way to tell dsZero is a pure linear variable x = 0
>>>>> // we only say: "dsZero is a structure with value = 0.0,
>>>>> // first derivative = 1.0, second and higher derivatives = 0.0".
>>>>> // Function composition rule for second derivatives is:
>>>>> // d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x)
>>>>> // when function f is the nth root and x = 0 we have:
>>>>> // f(0) = 0, f'(0) = +infinity, f''(0) = -infinity (and higher
>>>>> // derivatives keep switching between +infinity and -infinity)
>>>>> // so given that in our case dsZero represents g, we have g(x) = 0,
>>>>> // g'(x) = 1 and g''(x) = 0
>>>>> // applying the composition rules gives:
>>>>> // d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x)
>>>>> //                 = -infinity * 1^2 + +infinity * 0
>>>>> //                 = -infinity + NaN
>>>>> //                 = NaN
>>>>> // if we knew dsZero is really the x variable and not the identity
>>>>> // function applied to x, we would not have computed f'(g(x)) * g''(x)
>>>>> // and we would have found that the result was -infinity and not NaN
>>>>>
>>>>> Hope this helps
>>>>> Luc
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Ajo.
>>>>>>
>>>>>> On Fri, Aug 23, 2013 at 9:39 AM, Luc Maisonobe <[hidden email]> wrote:
>>>>>>
>>>>>>> Hi Ajo,
>>>>>>>
>>>>>>> On 23/08/2013 17:48, Ajo Fod wrote:
>>>>>>>> Try this and I'm happy to explain if necessary:
>>>>>>>>
>>>>>>>> public class Derivative {
>>>>>>>>
>>>>>>>>     public static void main(final String[] args) {
>>>>>>>>         DerivativeStructure dsA = new DerivativeStructure(1, 1, 0, 1d);
>>>>>>>>         System.out.println("Derivative of constant^x wrt x");
>>>>>>>>         for (int a = -3; a < 3; a++) {
>>>>>>>
>>>>>>> We have chosen the classical definition which implies c^r is not
>>>>>>> defined for real r and negative c.
>>>>>>>
>>>>>>> Our implementation is based on the decomposition c^r = exp(r * ln(c)),
>>>>>>> so the NaN comes from the logarithm when c <= 0.
>>>>>>>
>>>>>>> Note also that as explained in the documentation here:
>>>>>>> <http://commons.apache.org/proper/commons-math/userguide/analysis.html#a4.7_Differentiation>,
>>>>>>> there are no concepts of "constants" and "variables" in this framework,
>>>>>>> so we cannot draw a line between c^r as seen as a univariate function
>>>>>>> of r, or as a univariate function of c, or as a bivariate function of
>>>>>>> c and r, or even as a pentavariate function of p1, p2, p3, p4, p5 with
>>>>>>> both c and r being computed elsewhere from p1...p5. So we don't make
>>>>>>> special cases for the case c = 0 for example.
>>>>>>>
>>>>>>> Does this explanation make sense to you?
>>>>>>>
>>>>>>> best regards,
>>>>>>> Luc
>>>>>>>
>>>>>>>>             final DerivativeStructure a_ds = new DerivativeStructure(1, 1, a);
>>>>>>>>             final DerivativeStructure out = a_ds.pow(dsA);
>>>>>>>>             System.out.format("Derivative@%d=%f\n", a,
>>>>>>>>                               out.getPartialDerivative(new int[]{1}));
>>>>>>>>         }
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> On Fri, Aug 23, 2013 at 7:59 AM, Gilles <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>> On Fri, 23 Aug 2013 07:17:35 -0700, Ajo Fod wrote:
>>>>>>>>>
>>>>>>>>>> Seems like the DerivativeCompiler returns NaN.
>>>>>>>>>>
>>>>>>>>>> IMHO it should return 0.
>>>>>>>>>
>>>>>>>>> What should be 0? And Why?
>>>>>>>>>
>>>>>>>>>> Is this worthy of an issue?
>>>>>>>>>
>>>>>>>>> As is, no.
>>>>>>>>>
>>>>>>>>> Gilles
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> -Ajo
|
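The NaN described in the quoted testRootNSingularity comment follows directly from IEEE 754 arithmetic. As a minimal sketch, assuming only standard Java floating-point behaviour (the class and method names below are hypothetical and are not part of Commons Math), the second-order composition rule can be evaluated by hand:

```java
/** Hypothetical illustration; not taken from Commons Math. */
class CompositionAtSingularity {

    /**
     * Second-derivative composition rule:
     * d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x).
     */
    static double secondDerivative(final double f1, final double f2,
                                   final double g1, final double g2) {
        return f2 * g1 * g1 + f1 * g2;
    }

    public static void main(final String[] args) {
        // f is the nth root at 0: f'(0) = +infinity, f''(0) = -infinity
        final double f1 = Double.POSITIVE_INFINITY;
        final double f2 = Double.NEGATIVE_INFINITY;
        // g is the plain variable x: g'(x) = 1, g''(x) = 0
        final double general = secondDerivative(f1, f2, 1.0, 0.0);
        // the term +infinity * 0 is NaN, so the whole sum is NaN
        System.out.println("general rule:    " + general);
        // if g were known to be a pure variable, the second term could be skipped
        System.out.println("skipping g'' term: " + (f2 * 1.0 * 1.0));
    }
}
```

With finite inputs the same rule behaves normally; only the product of an infinite f' with a zero g'' produces the NaN.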
In reply to this post by Ajo Fod
On 26/08/2013 22:37, Ajo Fod wrote:
> On a side note: given a derivative structure ds, wouldn't it be nice to
> generate a constant derivative structure with something like:
>
> ds.getConstant(double value);
>
> Currently I'm doing something like:
> new DerivativeStructure(length, order, value);
> ... which seems more verbose than necessary when I have order and length
> information in existing ds all around.

Good idea. I have committed it as r1517789, simply naming the function
createConstant instead of getConstant.

Thanks for the suggestion.
Luc

>
> Cheers,
> Ajo.
>
>
> On Mon, Aug 26, 2013 at 8:23 AM, Ajo Fod <[hidden email]> wrote:
>
>> With regards to what is happening in DSCompiler.pow():
>> IMHO, when a==0 and x>=0 the function is well behaved because log|a| ->
>> Inf slower than a^x -> 0. I got to this by simulation.
>> One could probably get to something more conclusive using L'Hopital's rule:
>> http://en.wikipedia.org/wiki/L%27H%C3%B4pital%27s_rule.
>> There is one about xlog(x) behavior as x->0+.
>>
>> So, I propose this:
>>
>> if (a == 0) {
>>     if (operand[operandOffset] >= 0) {
>>         for (int i = 0; i < function.length; ++i) {
>>             function[i] = 0;
>>         }
>>     } else {
>>         for (int i = 0; i < function.length; ++i) {
>>             function[i] = Double.NaN;
>>         }
>>     }
>> } else {
>>
>> in place of:
>>
>> if (a == 0) {
>>     if (operand[operandOffset] == 0) {
>>         function[0] = 1;
>>         double infinity = Double.POSITIVE_INFINITY;
>>         for (int i = 1; i < function.length; ++i) {
>>             infinity = -infinity;
>>             function[i] = infinity;
>>         }
>>     }
>> } else {
>>
>> PS: I think you made a change to DSCompiler.pow too. If so, what happens
>> when a=0 & x!=0 in that function?
>>
>> On Mon, Aug 26, 2013 at 12:38 AM, Luc Maisonobe <[hidden email]> wrote:
>>
>>> Ajo Fod <[hidden email]> wrote:
>>>> Are you saying you patched the code? Can you provide the link?
>>>
>>> I committed it in the development version. You just have to update your
>>> checked out copy from either the official Apache subversion repository
>>> or the git mirror we talked about in a previous thread.
>>>
>>> The new method is a static one called pow, taking a and x as arguments
>>> and returning a^x. Not to be confused with the non-static methods that
>>> take only the power as argument (either int, double or
>>> DerivativeStructure) and use the instance as the base to apply power on.
>>>
>>> Best regards,
>>> Luc
|
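The special cases debated in the quoted proposal can be tried out without the library. The following is a minimal sketch, assuming a constant base a and a plain double x; `ConstantBasePow` and `derivatives` are hypothetical names, and this is not the code committed to Commons Math. It keeps the alternating infinities at a=0, x=0, returns NaN for a=0 with negative x, and leaves zeros for a=0 with positive x:

```java
import java.util.Arrays;

/** Hypothetical standalone helper; not the Commons Math implementation. */
class ConstantBasePow {

    /**
     * Returns {f(x), f'(x), ..., f^(order)(x)} for f(x) = a^x with constant a,
     * using f^(n)(x) = ln(a)^n * a^x away from the singularity.
     */
    static double[] derivatives(final double a, final double x, final int order) {
        final double[] function = new double[1 + order];
        if (a == 0) {
            if (x == 0) {
                // convention 0^0 = 1; derivatives alternate -infinity, +infinity, ...
                function[0] = 1;
                double infinity = Double.POSITIVE_INFINITY;
                for (int i = 1; i < function.length; ++i) {
                    infinity = -infinity;
                    function[i] = infinity;
                }
            } else if (x < 0) {
                // 0^x is not defined for negative x
                Arrays.fill(function, Double.NaN);
            }
            // x > 0: the array keeps its default zeros
        } else {
            // for negative a, Math.log returns NaN, so all derivatives are NaN
            function[0] = Math.pow(a, x);
            final double lnA = Math.log(a);
            for (int i = 1; i < function.length; ++i) {
                function[i] = lnA * function[i - 1];
            }
        }
        return function;
    }

    public static void main(final String[] args) {
        System.out.println(Arrays.toString(derivatives(0.0, 0.0, 3)));
        System.out.println(Arrays.toString(derivatives(0.0, 1.0, 3)));
        System.out.println(Arrays.toString(derivatives(0.0, -1.0, 3)));
        System.out.println(Arrays.toString(derivatives(2.0, 1.0, 2)));
    }
}
```

In the library itself, an array of this kind is what gets passed to compose, as in the aToX example earlier in the thread; the sketch stops before that step so that it runs with the JDK alone.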
In reply to this post by Luc Maisonobe
Thanks for the constant structure.
> No. The limit value when x->0+ is 1, not 0.

I agree with this. I was just going for the derivatives = 0.

> The nth derivative of a^x can be computed analytically as ln(a)^n a^x,
> so the initial slope at x=0 is simply ln(a), positive for a > 1, zero
> for a = 1, negative for 0 < a < 1 with a limit at -infinity when a -> 0+.

Let's think about this for a sec:

Derivative of |a|^x wrt x at x=2.0 for various values of a
Derivative@0.031250=-0.003384
Derivative@0.015625=-0.001015
Derivative@0.007813=-0.000296
Derivative@0.003906=-0.000085
Derivative@0.001953=-0.000024
... tends to 0

Derivative of |a|^x wrt x at x=0.5 for various values of a
Derivative@0.031250=-0.612555
Derivative@0.007813=-0.428759
Derivative@0.001953=-0.275612
Derivative@0.000488=-0.168418
Derivative@0.000122=-0.099513
Derivative@0.000031=-0.057407
Derivative@0.000008=-0.032528
Derivative@0.000002=-0.018176
... tends to 0 when a->0

The code I used for the printouts is:

    static final double EPS = 0.0001d;

    public static void main(final String[] args) {
        final double x = 0.5d;
        int from = 5;
        int to = 20;
        System.out.println("Derivative of |a|^x wrt x at x=" + x);
        for (int p = from; p < to; p += 2) {
            double a = Math.pow(2d, -p);
            final double calc = (Math.pow(a, x + EPS) - Math.pow(a, x)) / EPS;
            System.out.format("Derivative@%f=%f \n", a, calc);
        }
    }

As for the x=0 case:
1^0 = 1
0.5^0 = 1
0.0001^0 = 1
0^0 is technically undefined, but 1 is a good definition:
http://www.math.hmc.edu/funfacts/ffiles/10005.3-5.shtml

... so, a good value for the differential d(a^x)/dx in the limit x->0 and
a->0 is 0.

As mentioned earlier, I think the cause for this is that log|a| -> infinity
slower than |a|^x -> 0 as |a|->0.

Cheers,
Ajo.

> The limit curve corresponding to a = 0 is therefore a singular function
> with f(0) = 1 and f(x) = 0 for all x > 0. The fact f(0) = 1 and not 0 is
> consistent with the derivative being negative infinity, as by definition
> the derivative is the limit of [f(0+h) - f(0)] / h when h->0+, as the
> finite difference is -1/h.
>
> >         }
> >     } else {
> >         for (int i = 0; i < function.length; ++i) {
> >             function[i] = Double.NaN;
> >         }
>
> This alternative case is a good improvement, thanks for it. I forgot to
> handle negative cases properly. I have therefore changed the code
> (committed as r1517788) with this improvement, together with several
> test cases.
>
> >     }
> > } else {
> >
> > in place of:
> >
> > if (a == 0) {
> >     if (operand[operandOffset] == 0) {
> >         function[0] = 1;
> >         double infinity = Double.POSITIVE_INFINITY;
> >         for (int i = 1; i < function.length; ++i) {
> >             infinity = -infinity;
> >             function[i] = infinity;
> >         }
> >     }
> > } else {
> >
> > PS: I think you made a change to DSCompiler.pow too. If so, what happens
> > when a=0 & x!=0 in that function?
>
> No, I didn't change the other signatures of the pow function. So the
> value should be OK (i.e. 1) but all derivatives, including the first
> one, should be NaN. What the new function brings is a correct negative
> infinity first derivative at the singularity point, better accuracy for
> non-singular points, and possibly faster computation.
>
> best regards,
> Luc
|
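The two viewpoints in this exchange can be compared side by side. As a minimal sketch using only `java.lang.Math` (the class `SlopeCheck` and its methods are hypothetical names introduced for illustration), the forward finite difference from the message above is printed next to the analytic slope ln(a) a^x, once at x = 0.5 and once at x = 0:

```java
/** Hypothetical illustration comparing finite differences with ln(a) * a^x. */
class SlopeCheck {

    static final double EPS = 1.0e-4;

    /** Forward finite difference of a^x with respect to x. */
    static double finiteDifference(final double a, final double x) {
        return (Math.pow(a, x + EPS) - Math.pow(a, x)) / EPS;
    }

    /** Analytic first derivative ln(a) * a^x, valid for a > 0. */
    static double analytic(final double a, final double x) {
        return Math.log(a) * Math.pow(a, x);
    }

    public static void main(final String[] args) {
        for (final double x : new double[] {0.5, 0.0}) {
            System.out.println("Derivative of a^x wrt x at x=" + x);
            for (int p = 5; p < 40; p += 5) {
                final double a = Math.pow(2d, -p);
                System.out.format("a=2^-%d finite=%f analytic=%f%n",
                                  p, finiteDifference(a, x), analytic(a, x));
            }
        }
    }
}
```

At x = 0.5 both columns shrink towards 0 as a decreases, which matches the printouts above. At x = 0 the analytic slope is ln(a) and keeps growing in magnitude as a decreases.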
Hi Ajo,
On 27/08/2013 16:44, Ajo Fod wrote:
> Thanks for the constant structure.
>
>> No. The limit value when x->0+ is 1, not 0.
>
> I agree with this. I was just going for the derivatives = 0.
>
>> The nth derivative of a^x can be computed analytically as ln(a)^n a^x,
>> so the initial slope at x=0 is simply ln(a), positive for a > 1, zero
>> for a = 1, negative for 0 < a < 1 with a limit at -infinity when a -> 0+.
>
> Let's think about this for a sec:
> Derivative of |a|^x wrt x at x=2.0 for various values of a
> Derivative@0.031250=-0.003384
> Derivative@0.015625=-0.001015
> Derivative@0.007813=-0.000296
> Derivative@0.003906=-0.000085
> Derivative@0.001953=-0.000024
> ... tends to 0

Yes, because 2.0 > 0.

>
> Derivative of |a|^x wrt x at x=0.5 for various values of a
> Derivative@0.031250=-0.612555
> Derivative@0.007813=-0.428759
> Derivative@0.001953=-0.275612
> Derivative@0.000488=-0.168418
> Derivative@0.000122=-0.099513
> Derivative@0.000031=-0.057407
> Derivative@0.000008=-0.032528
> Derivative@0.000002=-0.018176
> ... tends to 0 when a->0

Yes, because 0.5 > 0.

>
> The code I used for the printouts is:
>     static final double EPS = 0.0001d;
>
>     public static void main(final String[] args) {
>         final double x = 0.5d;
>         int from = 5;
>         int to = 20;
>         System.out.println("Derivative of |a|^x wrt x at x=" + x);
>         for (int p = from; p < to; p += 2) {
>             double a = Math.pow(2d, -p);
>             final double calc = (Math.pow(a, x + EPS) - Math.pow(a, x)) / EPS;
>             System.out.format("Derivative@%f=%f \n", a, calc);
>         }
>     }
>
> As for the x=0 case:
> 1^0 = 1
> 0.5^0 = 1
> 0.0001^0 = 1
> 0^0 is technically undefined, but 1 is a good definition:
> http://www.math.hmc.edu/funfacts/ffiles/10005.3-5.shtml

Yes.

> ... so, a good value for the differential d(a^x)/dx in the limit x->0 and
> a->0 is 0.

I don't agree. What you wrote in the lines above is another way to say
what I wrote in my previous message: the value at x=0 is always y=1, and
the value for x > 0 tends to 0 as a->0+. So the function always starts at
1 and dives more and more steeply as a becomes smaller, and the derivative
at 0 becomes more and more negative, down to -infinity, *not* 0.

The function is ill-behaved and the fact the derivative is infinite is
consistent with this ill-behaviour.

The definition of the derivative is:

  f'(x) = lim (f(x+h) - f(x))/h   when h -> 0+

When f(x) = 0^x, and assuming 0^0 = 1 as you have agreed above, this gives:

  f'(0) = lim (0^(0+h) - 0^0)/h
        = lim (0 - 1)/h
        = -infinity

which is exactly the same result as computing for a nonzero a and then
reducing it: d(a^x)/dx = ln(a) a^x, which equals ln(a) when x=0 and
diverges to -infinity when a converges to 0.

>
> As mentioned earlier, I think the cause for this is that log|a| -> infinity
> slower than |a|^x -> 0 as |a|->0.

But a^x does *not* converge to 0 for x = 0! a^0 is always 1 (rigorously)
regardless of the value of a as long as it is not 0, and then when we
change a we can also consider that the limit is 1 when a -> 0. This
convention is well accepted. This convention is implemented in the Java
standard Math.pow function, and we followed this trend.

This is the reason why the function becomes more and more steep as a
becomes smaller. At the end, it is a discontinuous function (and hence
should not be differentiable, or it is differentiable only if we use
extended real numbers with infinity added).

This is the heart of the ill-behaviour of 0^0. We want to compute it as a
limit value for a^b when both parameters converge to 0, but we get a
different result if we first hold a fixed and converge b to 0, and later
reduce a down to zero, than when we do the opposite (your approach). In
the first case we get 1, in the second case we get 0.

Let's put it another way: if we consider that the derivative f'(0) should
be 0, then the value f(0) should also be considered equal to zero. This
would mean that as soon as we get a tiny non-zero a (say the smallest
number that can be represented as a double), f(0) would jump from 0 to 1
instantly, and f'(0) would jump from 0 to a very negative value instantly.
So we would have at a = 0 an initial zero derivative, then a jump to a
very negative derivative as a leaves 0, then the derivative would become
less and less negative as a increases up to 1, at a=1 the derivative would
again be 0, then the derivative would continue to increase and become
positive as a grows larger than 1 (all these derivatives are computed at
x=0, and as written previously, they are simply equal to log(a)).

To summarize, the two choices are:

 1) - first considering a fixed a, strictly positive,
    - then looking globally at the function a^x for all values x>=0,
    - then reducing a, noting that all functions start at the same
      point x=0, y=1 and the derivatives become more and more negative
      as the function becomes more and more ill-behaved

 2) - first considering a fixed x, strictly positive,
    - then reducing a and identifying that the limit value is 0 for
      every such x,
    - then building a function by packing all the x>0, which is very
      smooth as it is identically 0 for all x>0
    - finally adding the limit value at x=0, which in this case would
      be 0 (and the derivative would also be 0).

It seems well accepted to consider that the value of 0^0 should be set to
1, and as a consequence the corresponding derivative with respect to x
should be set to -infinity. I fully agree it is not a perfect solution, it
is an arbitrary choice. However, this choice is consistent with all the
implementations of the pow function I have seen (i.e. 0^0 set to 1 instead
of 0).

Your approach is not wrong, it is as valid as the other one. It is simply
not the common choice. I would say an even better choice would have been
to say 0^0 *is not* defined and even the value should be set to NaN (not
even speaking of the derivative).

Does this seem acceptable to you?

best regards,
Luc

>
> Cheers,
> Ajo.
|
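The two behaviours discussed in the message above can be reproduced with nothing but java.lang.Math. The following is only a sketch for illustration: the class and method names (PowSlope, analyticSlope, forwardDifference) are made up here and are not part of Commons Math, and the closed form ln(a) * a^x is evaluated naively, so the special values come straight from IEEE 754 arithmetic rather than from any special-case handling.

```java
// Illustrative sketch only: plain java.lang.Math, not the Commons Math code.
final class PowSlope {

    /** Closed form d(a^x)/dx = ln(a) * a^x, evaluated naively. */
    static double analyticSlope(final double a, final double x) {
        return Math.log(a) * Math.pow(a, x);
    }

    /** One-sided finite difference (f(x+h) - f(x)) / h for f(x) = a^x. */
    static double forwardDifference(final double a, final double x, final double h) {
        return (Math.pow(a, x + h) - Math.pow(a, x)) / h;
    }

    public static void main(final String[] args) {
        // Shrinking a: the slope at x=0 is ln(a) and keeps growing in magnitude,
        // while the slope at any fixed x > 0 shrinks towards 0.
        for (int p = 10; p <= 1000; p *= 10) {
            final double a = Math.pow(2d, -p);
            System.out.format("a=2^-%d  slope at x=0: %f  slope at x=0.5: %e%n",
                              p, analyticSlope(a, 0d), analyticSlope(a, 0.5d));
        }
        // Exactly a=0: ln(0) is -Infinity and Math.pow(0, 0) is 1,
        // so the product is -Infinity at x=0.
        System.out.println("a=0, x=0   : " + analyticSlope(0d, 0d));
        // For x > 0 the naive product is -Infinity * 0, which is NaN.
        System.out.println("a=0, x=0.5 : " + analyticSlope(0d, 0.5d));
        // The finite difference at a=0, x=0 is (0 - 1) / h = -1/h.
        System.out.println("a=0, x=0, h=1e-4 difference: "
                           + forwardDifference(0d, 0d, 1e-4));
    }
}
```

The NaN in the a=0, x>0 case is the naive product -infinity * 0, which is presumably what prompted the original question; the -infinity at x=0 is the value argued for above.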
To define things precisely:
y = f(a,x) = |a|^x

Can we agree that:
df(a,x)/dx -> 0 when a->0 and x > 0   [NOTE: x > 0]

If this is acceptable, we get the very useful property that df(a,x)/dx is
defined and continuous for all a provided x > 0, because we use the modulus
of a in the function definition. In optimization, with this patch at |a|=0,
I can set an optimizer to search the whole real line without worrying about
a=0; otherwise I have to look out for a=0 explicitly. It seems unnecessary
to add a constraint to make |a|>0. I already have a constraint for x > 0.

Cheers,
Ajo.
|
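The property described in the message above (a derivative of |a|^x with respect to x that is defined for every real a, as long as x > 0) can be sketched as follows. This is a hypothetical helper written for illustration: the name dAbsPowDx and the choice to reject x <= 0 are assumptions, and it is not the patch discussed in the thread nor any Commons Math API.

```java
// Hypothetical helper for illustration; it is not part of Commons Math.
final class AbsPowDerivative {

    /** d(|a|^x)/dx for x > 0, with the value at a = 0 patched to its limit, 0. */
    static double dAbsPowDx(final double a, final double x) {
        if (!(x > 0d)) {
            // x = 0 is the singular point debated in the thread; it is left out here.
            throw new IllegalArgumentException("this sketch only covers x > 0");
        }
        final double m = Math.abs(a);
        if (m == 0d) {
            return 0d; // limit of ln|a| * |a|^x when a -> 0 with x > 0 fixed
        }
        return Math.log(m) * Math.pow(m, x);
    }

    public static void main(final String[] args) {
        final double x = 0.5d;
        final double[] samples = {-1e-3, -1e-9, 0d, 1e-9, 1e-3};
        for (final double a : samples) {
            // Values on both sides of a = 0 shrink towards the patched value 0.
            System.out.format("a=%e  d|a|^x/dx=%e%n", a, dAbsPowDx(a, x));
        }
    }
}
```

With such a helper an optimizer could scan a over the whole real line without a separate |a| > 0 constraint, which is the convenience being argued for. The x > 0 restriction is essential: nothing here says what should happen at x = 0.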
Hi Ajo,
On 28/08/2013 16:56, Ajo Fod wrote:
> To define things precisely:
> y = f(a,x) = |a|^x
>
> Can we agree that:
> df(a,x)/dx -> 0 when a->0 and x > 0 :[ NOTE: x > 0]

Yes, of course, it is perfectly true.

> If this is acceptable, we get this very useful property that df(a,x)/dx is
> defined and continuous for all a provided x>0 because we use the modulus of
> a in the function definition.

Yes, as long as we don't have x = 0, we remain in a smooth, infinitely differentiable domain.

> In optimization, with this patch at |a|=0, I
> can set an optimizer to search the whole real line without worrying about
> a=0 otherwise I've to look out for a=0 explicitly. It seems unnecessary to
> add a constraint to make |a|>0. I already have a constraint for x >0.

I don't understand what you mean here. If you already know that x > 0, then you don't have to worry about a=0 or a>0, since in this case both approaches lead to the same result.

If you look at the graph of df(a,x)/dx for a few values of a, you will see that we have:

  lim a->0+ df(a,x)/dx = 0          for x > 0
  lim a->0+ df(a,x)/dx = -infinity  for x = 0

and this even though df(a,x)/dx = ln(a) a^x is, for each fixed a > 0, a continuous and infinitely differentiable function of x. The limit of a family of continuous, infinitely differentiable functions may be a discontinuous function. It is a counter-intuitive result, I agree, but there are many other examples of such strange behaviour in mathematics (if I remember well, Fourier transforms of step functions exhibit the same problem, backward).

If you have x>0, you are already on the safe side of the singularity, so this is where I lose track of your argument and don't understand how the singular point x=0 bothers you.

best regards,
Luc
|
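The two limits Luc states in the reply above can be checked numerically. The following is a minimal sketch, not code from Commons Math: the class name PowLimitCheck and the helper dPowDx are hypothetical, and it only evaluates the analytic formula d(a^x)/dx = ln(a) a^x with java.lang.Math for a shrinking, strictly positive a.

```java
final class PowLimitCheck {

    /** Analytic derivative of a^x with respect to x, valid for a > 0 (hypothetical helper). */
    static double dPowDx(final double a, final double x) {
        return Math.log(a) * Math.pow(a, x);
    }

    public static void main(final String[] args) {
        // x > 0: the derivative should shrink towards 0 as a -> 0+
        // x = 0: the derivative equals ln(a) and should drift towards -infinity
        for (final double x : new double[] {2.0, 0.5, 0.0}) {
            System.out.println("x = " + x);
            for (int p = 10; p <= 1000; p *= 10) {
                final double a = Math.pow(2.0, -p);
                System.out.format("  a = 2^-%d  d(a^x)/dx = %e%n", p, dPowDx(a, x));
            }
        }
    }
}
```

For x = 0 the divergence is only logarithmic in a, so the printed values grow in magnitude slowly; the point is that they never turn back towards 0, unlike the x > 0 rows.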
It's a=0 that bothers me. x > 0 in my case.
In the code I use, the DerivativeStructure evaluates to NaN for a=0 when x > 0. I think we agree that in this condition the derivative should evaluate to 0. Perhaps I wrote something that misled you on this detail.

-Ajo
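The behaviour being asked for here (a derivative of 0 at a = 0 whenever x > 0) can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use DerivativeStructure, the class and method names (AbsPowDerivative, dAbsPowDx) are hypothetical, and it covers only the first derivative of |a|^x with respect to x on the x > 0 side of the singularity.

```java
final class AbsPowDerivative {

    /**
     * First derivative of f(a, x) = |a|^x with respect to x, for x > 0 only.
     * At a = 0 it returns 0, the limit of ln|a| * |a|^x as a -> 0.
     */
    static double dAbsPowDx(final double a, final double x) {
        if (!(x > 0)) {
            throw new IllegalArgumentException("sketch only covers x > 0, got " + x);
        }
        final double absA = Math.abs(a);
        if (absA == 0) {
            // the plain formula would give -Infinity * 0 = NaN here,
            // so substitute the limit value instead
            return 0.0;
        }
        return Math.log(absA) * Math.pow(absA, x);
    }

    public static void main(final String[] args) {
        final double x = 0.5;
        for (final double a : new double[] {-0.25, -1.0e-6, 0.0, 1.0e-6, 0.25}) {
            System.out.format("a = %e  d|a|^x/dx = %e%n", a, dAbsPowDx(a, x));
        }
        // evaluating the unguarded formula at a = 0 shows one way a NaN can arise
        System.out.println("unguarded at a = 0: " + Math.log(0.0) * Math.pow(0.0, x));
    }
}
```

The guard deliberately rejects x = 0, since at that point the limit is -infinity rather than 0, as discussed earlier in the thread.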
> > If you have x>0, you are already on the safe side of the singularity, so > this is were I lose your tracks and don't understand how the singular > point x=0 bothers you. > > best regards, > Luc > > > > > Cheers, > > Ajo. > > > > > > > > On Tue, Aug 27, 2013 at 1:49 PM, Luc Maisonobe <[hidden email] > >wrote: > > > >> Hi Ajo, > >> > >> Le 27/08/2013 16:44, Ajo Fod a écrit : > >>> Thanks for the constant structure. > >>> > >>> No. The limit value when x->0+ is 1, not O. > >>> > >>> I agree with this. I was just going for the derivatives = 0. > >>> > >>> > >>>> The nth derivative of a^x can be computed analytically as ln(a)^n a^x, > >>>> so the initial slope at x=0 is simply ln(a), positive for a > 1, zero > >>>> for a = 1, negative for 0 < a < 1 with a limit at -inifnity when a -> > >> 0+. > >>>> > >>> > >>> Lets think about this for a sec: > >>> Derivative of |a|^x wrt x at x=2.0 for various values of a > >>> Derivative@0.031250=-0.003384 > >>> Derivative@0.015625=-0.001015 > >>> Derivative@0.007813=-0.000296 > >>> Derivative@0.003906=-0.000085 > >>> Derivative@0.001953=-0.000024 > >>> ... tends to 0 > >> > >> yes, because 2.0 > 0. > >> > >>> > >>> Derivative of |a|^x wrt x at x=0.5 for various values of a > >>> Derivative@0.031250=-0.612555 > >>> Derivative@0.007813=-0.428759 > >>> Derivative@0.001953=-0.275612 > >>> Derivative@0.000488=-0.168418 > >>> Derivative@0.000122=-0.099513 > >>> Derivative@0.000031=-0.057407 > >>> Derivative@0.000008=-0.032528 > >>> Derivative@0.000002=-0.018176 > >>> ... tends to 0 when a->0 > >> > >> yes because 0.5 > 0. 
> >> > >>> > >>> The code I used for the print outs is: > >>> static final double EPS = 0.0001d; > >>> > >>> public static void main(final String[] args) { > >>> final double x = 0.5d; > >>> int from = 5; > >>> int to = 20; > >>> System.out.println("Derivative of |a|^x wrt x at x=" + x); > >>> for (int p = from; p < to; p+=2) { > >>> double a = Math.pow(2d, -p); > >>> final double calc = (Math.pow(a, x + EPS) - Math.pow(a, > x)) / > >>> EPS; > >>> System.out.format("Derivative@%f=%f \n", a, calc); > >>> } > >>> } > >>> > >>> As for the x=0 case: > >>> 1^0 = 1 > >>> 0.5^0 = 1 > >>> 0.0001^0 = 1 > >>> 0^0 is technically undefined, but 1 is a good definition: > >>> http://www.math.hmc.edu/funfacts/ffiles/10005.3-5.shtml > >> > >> Yes. > >> > >>> ... so, a good value for the differential of da^x/dx limit x->0 and > >> a->0 = > >>> 0 > >> > >> I don't agree. What you wrote in the lines above is another way to say > >> what I wrote in my previous message: the value at x=0 is always y=1, and > >> the value for x > 0 tends to 0 as a->0+. > >> > >> So the function always starts at 1 and dives more and more steeply as a > >> becomes smaller, and the derivative at 0 becomes more and more negative, > >> up to -infinity, *not* 0. > >> > >> The function is ill-behaved and the fact the derivative is infinite is > >> consistent with this ill-behaviour. > >> > >> The definition of the derivative is : > >> > >> f'(x) = lim (f(x+h) - f(x))/h when h -> 0+ > >> > >> when f(x) = 0^x and assuming 0^0 = 1 as you have agreed above, this > gives: > >> > >> f'(0) = lim (0^(0+h) - 0^0)/h = lim (0 - 1)/h = -infinity > >> > >> which is exactly the same result as computing for a non-null a and then > >> reducing it: d(a^x)/dx = ln(a) a^x = ln(a) when x=0, diverges to > >> -infinity when a converges to 0. > >> > >>> > >>> > >>> As mentioned earlier, I think the cause for this is that log|a| -> > >> infinity > >>> slower than |a|^x -> 0 as |a|->0 . 
> >> > >> But a^x does *not* converge to 0 for x = 0! a^0 is always 1 (rigorously) > >> regardless of the value of a as long as it is not 0, and then when we > >> change a we can also consider the limit is 1 when a-> 0. This convention > >> is well accepted. This convention is implemented in the Java standard > >> Math.pow function, and we followed this trend. This is the reason why > >> the functions becomes more and more steep as a becomes smaller. At the > >> end, it is a discontinuous function (and hence should not be > >> differentiable, or it is differentiable only if we use extended real > >> numbers with infinity added). > >> > >> This is the heart of the ill-behaviour of 0^0. We want to compute it as > >> a limit value for a^b when both parameters converge to 0, but we get a > >> different result if we first set a fixed and converge b to 0, and later > >> reduce a down to zero (your approach), and when we do the opposite. In > >> one case we get 0, in the other case we get 1. > >> > >> Lets put it another way: > >> If we consider the derivative f'(0) should be 0, then the value f(0) > >> should also be considered equal to zero. This would mean as soon as we > >> get a tiny non-zero a (say the smallest number that can be represented > >> as a double), then f(0) would jump from 0 to 1 instantly, and f'(0) > >> would jump from 0 to -infinity instantly. So we would have at a = 0 an > >> initial null derivative, then a jump to a very negative derivative as a > >> leaves 0, then the derivative would become less and less negative as a > >> increase up to 1, at a=1 the derivative would again be 0, then the > >> derivative would continue to increase and becode positive as a grows > >> larger than 1 (all these derivatives are computed at x=0, and as written > >> previously, they are simply equal to log(a)). 
> >> > >> To summarize, the two choices are: > >> 1) - first considering a fixed a, strictly positive, > >> - then looking globally at the function a^x for all values x>=0, > >> - then reducing a, noting that all functions start at the same > >> point x=0, y=1 and the derivatives become more and more negative > >> as the function becomes more and more ill-behaved > >> 2) - first considering a fixed x, strictly positive, > >> - then reducing a and identifying the limit values is 0 for all a, > >> - then building a function by packing all the x>0, which is very > >> smooth as it is identically 0 for all x>0 > >> - finally adding the limit value at x=0, which in this case would > >> be 0 (and the derivative would also be 0). > >> > >> it seems well accepted to consider the value of 0^0 should be set to 1, > >> and as a consequence the corresponding derivative with respect to x > >> should be set to -infinity. > >> > >> I fully agree it is not a perfect solution, it is an arbitrary choice. > >> However, this choice is consistent with what all implementations of the > >> pow function I have seen (i.e. 0^0 set to 1 instead of 0). > >> > >> Your approach is not wrong, it is as valid as the other one. It is > >> simply not the common choice. > >> > >> I would say an even better choice would have been to say 0^0 *is not* > >> defined and even the value should be set to NaN (not even speaking of > >> the derivative). > >> > >> Does this seem acceptable to you? > >> > >> best regards, > >> Luc > >> > >>> > >>> Cheers, > >>> Ajo. > >>> > >>> > >>>> The limit curve corresponding to a = 0 is therefore a singular > function > >>>> with f(0) = 1 and f(x) = 0 for all x > 0. The fact f(0) = 1 and not 0 > is > >>>> consistent with the derivative being negative infinity, as by > definition > >>>> the derivative is the limit of [f(0+h) - f(0)] / h when h->0+, as the > >>>> finite difference is -1/h. 
> >>>> > >>>>> } > >>>>> }else{ > >>>>> for (int i = 0; i < function.length; ++i) { > >>>>> function[i] = Double.NaN; > >>>>> } > >>>> > >>>> This alternative case is a good improvement, thanks for it. I forgot > to > >>>> handle negative cases properly. I have therefore changed the code > >>>> (committed as r1517788) with this improvement, together with several > >>>> test cases. > >>>> > >>>>> } > >>>>> } else { > >>>>> > >>>>> > >>>>> in place of : > >>>>> > >>>>> if (a == 0) { > >>>>> if (operand[operandOffset] == 0) { > >>>>> function[0] = 1; > >>>>> double infinity = Double.POSITIVE_INFINITY; > >>>>> for (int i = 1; i < function.length; ++i) { > >>>>> infinity = -infinity; > >>>>> function[i] = infinity; > >>>>> } > >>>>> } > >>>>> } else { > >>>>> > >>>>> > >>>>> PS: I think you made a change to DSCompiler.pow too. If so, what > >> happens > >>>>> when a=0 & x!=0 in that function? > >>>> > >>>> No, I didn't change the other signatures of the pow function. So the > >>>> value should be OK (i.e. 1) but all derivatives, including the first > >>>> one, should be NaN. What the new function brings is a correct negetive > >>>> infinity first derivative at singularity point, better accuracy for > >>>> non-singular points, and possibly faster computation. > >>>> > >>>> best regards, > >>>> Luc > >>>> > >>>>> > >>>>> > >>>>> On Mon, Aug 26, 2013 at 12:38 AM, Luc Maisonobe <[hidden email]> > >>>> wrote: > >>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> Ajo Fod <[hidden email]> a écrit : > >>>>>>> Are you saying patched the code? Can you provide the link? > >>>>>> > >>>>>> I committed it in the development version. You just have to update > >> your > >>>>>> checked out copy from either the official > >>>>>> Apache subversion repository or the git mirror we talked about in a > >>>>>> previous thread. > >>>>>> > >>>>>> The new method is a static one called pow and taking a and x as > >>>> arguments > >>>>>> and returning a^x. 
Not to > >>>>>> Be confused with the non-static methods that take only the power as > >>>>>> argument (either int, double or > >>>>>> DerivativeStructure) and use the instance as the base to apply power > >> on. > >>>>>> > >>>>>> Best regards, > >>>>>> Luc > >>>>>> > >>>>>>> > >>>>>>> -Ajo > >>>>>>> > >>>>>>> > >>>>>>> On Sun, Aug 25, 2013 at 1:20 PM, Luc Maisonobe <[hidden email] > > > >>>>>>> wrote: > >>>>>>> > >>>>>>>> Le 24/08/2013 11:24, Luc Maisonobe a écrit : > >>>>>>>>> Le 23/08/2013 19:20, Ajo Fod a écrit : > >>>>>>>>>> Hello, > >>>>>>>>> > >>>>>>>>> Hi Ajo, > >>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> This shows one way of interpreting the derivative for strictly > +ve > >>>>>>>> numbers. > >>>>>>>>>> > >>>>>>>>>> public static void main(final String[] args) { > >>>>>>>>>> final double x = 1d; > >>>>>>>>>> DerivativeStructure dsA = new DerivativeStructure(1, 1, > 0, > >>>>>>> x); > >>>>>>>>>> System.out.println("Derivative of |a|^x wrt x"); > >>>>>>>>>> for (int p = 10; p < 21; p++) { > >>>>>>>>>> double a; > >>>>>>>>>> if (p < 20) { > >>>>>>>>>> a = 1d / Math.pow(2d, p); > >>>>>>>>>> } else { > >>>>>>>>>> a = 0d; > >>>>>>>>>> } > >>>>>>>>>> final DerivativeStructure a_ds = new > >>>>>>> DerivativeStructure(1, > >>>>>>>> 1, > >>>>>>>>>> a); > >>>>>>>>>> final DerivativeStructure out = a_ds.pow(dsA); > >>>>>>>>>> final double calc = (Math.pow(a, x + EPS) - > >>>>>>> Math.pow(a, x)) > >>>>>>>> / > >>>>>>>>>> EPS; > >>>>>>>>>> System.out.format("Derivative@%f=%f %f\n", a, > calc, > >>>>>>>>>> out.getPartialDerivative(new int[]{1})); > >>>>>>>>>> } > >>>>>>>>>> } > >>>>>>>>>> > >>>>>>>>>> At this point I"m explicitly substituting the rule that > >>>>>>>> derivative(|a|^x) = > >>>>>>>>>> 0 for |a|=0. > >>>>>>>>> > >>>>>>>>> Yes, but this fails for x = 0, as the limit of the finite > >>>>>>> difference is > >>>>>>>>> -infinity and not 0. 
> >>>>>>>>>
> >>>>>>>>> You can build your own function which explicitly assumes a is constant
> >>>>>>>>> and takes care of special values as follows:
> >>>>>>>>>
> >>>>>>>>> public static DerivativeStructure aToX(final double a,
> >>>>>>>>>                                        final DerivativeStructure x) {
> >>>>>>>>>     final double lnA = (a == 0 && x.getValue() == 0) ?
> >>>>>>>>>                        Double.NEGATIVE_INFINITY :
> >>>>>>>>>                        FastMath.log(a);
> >>>>>>>>>     final double[] function = new double[1 + x.getOrder()];
> >>>>>>>>>     function[0] = FastMath.pow(a, x.getValue());
> >>>>>>>>>     for (int i = 1; i < function.length; ++i) {
> >>>>>>>>>         function[i] = lnA * function[i - 1];
> >>>>>>>>>     }
> >>>>>>>>>     return x.compose(function);
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> This will work and provides derivatives to any order for almost any
> >>>>>>>>> values of a and x, including a=0, x=1 as in your example, but also
> >>>>>>>>> slightly better for a=0, x=0. However, it still has an important
> >>>>>>>>> drawback: it won't compute the n-th order derivative correctly for a=0,
> >>>>>>>>> x=0 and n > 1. It will provide NaN for these higher order derivatives
> >>>>>>>>> instead of +/-infinity according to parity of n.
> >>>>>>>>
> >>>>>>>> I have added a similar function to the DerivativeStructure class (with
> >>>>>>>> some errors above corrected). The main interesting property of this
> >>>>>>>> function is that it is more accurate than converting a to a
> >>>>>>>> DerivativeStructure and using the general x^y function. It does its best
> >>>>>>>> to handle the special case, but as written above, this does NOT work for
> >>>>>>>> general combinations (i.e. more than one variable or more than one
> >>>>>>>> order). As soon as there is a combination, the derivative will involve
> >>>>>>>> something like df/dx * dg/dy, and as infinities and zeros are everywhere,
> >>>>>>>> NaN appears immediately for these partial derivatives.
> >>>>>>>> This cannot be avoided.
> >>>>>>>>
> >>>>>>>> If you stay away from the singularity, the function behaves correctly.
> >>>>>>>>
> >>>>>>>> best regards,
> >>>>>>>> Luc
> >>>>>>>>
> >>>>>>>>> This is a known problem that we already encountered when dealing with
> >>>>>>>>> rootN. Here is an extract of a comment in the test case
> >>>>>>>>> testRootNSingularity, where similar NaN appears instead of +/- infinity.
> >>>>>>>>> The dsZero instance in the comment is simply the x parameter of the
> >>>>>>>>> function, as a DerivativeStructure with value 0.0 and depending on
> >>>>>>>>> itself (dsZero = new DerivativeStructure(1, maxOrder, 0, 0.0)):
> >>>>>>>>>
> >>>>>>>>> // the following checks shows a LIMITATION of the current implementation
> >>>>>>>>> // we have no way to tell dsZero is a pure linear variable x = 0
> >>>>>>>>> // we only say: "dsZero is a structure with value = 0.0,
> >>>>>>>>> // first derivative = 1.0, second and higher derivatives = 0.0".
> >>>>>>>>> // Function composition rule for second derivatives is:
> >>>>>>>>> // d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x)
> >>>>>>>>> // when function f is the nth root and x = 0 we have:
> >>>>>>>>> // f(0) = 0, f'(0) = +infinity, f''(0) = -infinity (and higher
> >>>>>>>>> // derivatives keep switching between +infinity and -infinity)
> >>>>>>>>> // so given that in our case dsZero represents g, we have g(x) = 0,
> >>>>>>>>> // g'(x) = 1 and g''(x) = 0
> >>>>>>>>> // applying the composition rules gives:
> >>>>>>>>> // d2[f(g(x))]/dx2 = f''(g(x)) * [g'(x)]^2 + f'(g(x)) * g''(x)
> >>>>>>>>> //                 = -infinity * 1^2 + +infinity * 0
> >>>>>>>>> //                 = -infinity + NaN
> >>>>>>>>> //                 = NaN
> >>>>>>>>> // if we knew dsZero is really the x variable and not the identity
> >>>>>>>>> // function applied to x, we would not have computed f'(g(x)) * g''(x)
> >>>>>>>>> // and we would have found that the result was -infinity and not NaN
> >>>>>>>>>
> >>>>>>>>> Hope this helps
> >>>>>>>>> Luc
> >>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Ajo.
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Aug 23, 2013 at 9:39 AM, Luc Maisonobe <[hidden email]> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Ajo,
> >>>>>>>>>>>
> >>>>>>>>>>> On 23/08/2013 17:48, Ajo Fod wrote:
> >>>>>>>>>>>> Try this and I'm happy to explain if necessary:
> >>>>>>>>>>>>
> >>>>>>>>>>>> public class Derivative {
> >>>>>>>>>>>>
> >>>>>>>>>>>>     public static void main(final String[] args) {
> >>>>>>>>>>>>         DerivativeStructure dsA = new DerivativeStructure(1, 1, 0, 1d);
> >>>>>>>>>>>>         System.out.println("Derivative of constant^x wrt x");
> >>>>>>>>>>>>         for (int a = -3; a < 3; a++) {
> >>>>>>>>>>>
> >>>>>>>>>>> We have chosen the classical definition, which implies c^r is not
> >>>>>>>>>>> defined for real r and negative c.
> >>>>>>>>>>>
> >>>>>>>>>>> Our implementation is based on the decomposition c^r = exp(r * ln(c)),
> >>>>>>>>>>> so the NaN comes from the logarithm when c <= 0.
> >>>>>>>>>>>
> >>>>>>>>>>> Note also that as explained in the documentation here:
> >>>>>>>>>>> <http://commons.apache.org/proper/commons-math/userguide/analysis.html#a4.7_Differentiation>,
> >>>>>>>>>>> there are no concepts of "constants" and "variables" in this framework,
> >>>>>>>>>>> so we cannot draw a line between c^r as seen as a univariate function
> >>>>>>>>>>> of r, or as a univariate function of c, or as a bivariate function of
> >>>>>>>>>>> c and r, or even as a pentavariate function of p1, p2, p3, p4, p5 with
> >>>>>>>>>>> both c and r being computed elsewhere from p1...p5. So we don't make
> >>>>>>>>>>> special cases for the case c = 0 for example.
> >>>>>>>>>>>
> >>>>>>>>>>> Does this explanation make sense to you?
> >>>>>>>>>>>
> >>>>>>>>>>> best regards,
> >>>>>>>>>>> Luc
> >>>>>>>>>>>
> >>>>>>>>>>>>             final DerivativeStructure a_ds = new DerivativeStructure(1, 1, a);
> >>>>>>>>>>>>             final DerivativeStructure out = a_ds.pow(dsA);
> >>>>>>>>>>>>             System.out.format("Derivative@%d=%f\n", a,
> >>>>>>>>>>>>                               out.getPartialDerivative(new int[]{1}));
> >>>>>>>>>>>>         }
> >>>>>>>>>>>>     }
> >>>>>>>>>>>> }
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, Aug 23, 2013 at 7:59 AM, Gilles <[hidden email]> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On Fri, 23 Aug 2013 07:17:35 -0700, Ajo Fod wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Seems like the DerivativeCompiler returns NaN.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> IMHO it should return 0.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> What should be 0? And Why?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Is this worthy of an issue?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> As is, no.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Gilles
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>> -Ajo
|
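For readers following the thread, here is a minimal plain-Java sketch of the two points discussed above: the special cases for the derivatives of a^x with respect to x when a is a constant, and the way the composition rule turns an infinity into NaN at a singularity. It does not use Commons Math and is not the library's implementation; the class name PowDerivativeSketch and the methods aToXDerivatives and secondDerivativeOfComposition are hypothetical names chosen for illustration, and the branch structure only mirrors the code fragments quoted in the thread.

```java
import java.util.Arrays;

// Illustrative sketch only: hypothetical names, not part of Commons Math.
class PowDerivativeSketch {

    /**
     * Value and derivatives (orders 0..order) of x -> a^x at the given x,
     * treating a as a constant. Follows the special cases quoted in the
     * thread: a = 0 is handled explicitly, otherwise the n-th derivative
     * is ln(a)^n * a^x.
     */
    static double[] aToXDerivatives(final double a, final double x, final int order) {
        final double[] function = new double[order + 1];
        if (a == 0) {
            if (x == 0) {
                // 0^0 is taken as 1; derivatives alternate -inf, +inf, ...
                function[0] = 1;
                double infinity = Double.POSITIVE_INFINITY;
                for (int i = 1; i < function.length; ++i) {
                    infinity = -infinity;
                    function[i] = infinity;
                }
            } else if (x < 0) {
                // 0 raised to a negative power: nothing meaningful to return
                Arrays.fill(function, Double.NaN);
            }
            // x > 0: value and all derivatives stay at 0 (array default)
        } else {
            function[0] = Math.pow(a, x);
            final double lnA = Math.log(a); // NaN for negative a
            for (int i = 1; i < function.length; ++i) {
                function[i] = lnA * function[i - 1];
            }
        }
        return function;
    }

    /** Second derivative of f(g(x)): f''(g) * g'^2 + f'(g) * g''. */
    static double secondDerivativeOfComposition(final double fPrime, final double fSecond,
                                                final double gPrime, final double gSecond) {
        return fSecond * gPrime * gPrime + fPrime * gSecond;
    }

    public static void main(final String[] args) {
        System.out.println(Arrays.toString(aToXDerivatives(0.0, 1.0, 2)));
        System.out.println(Arrays.toString(aToXDerivatives(0.0, 0.0, 3)));
        System.out.println(Arrays.toString(aToXDerivatives(2.0, 1.0, 2)));
        // rootN-like singularity: f' = +inf, f'' = -inf, g' = 1, g'' = 0
        System.out.println(secondDerivativeOfComposition(
                Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, 1.0, 0.0));
    }
}
```

The last call reproduces the -infinity * 1^2 + +infinity * 0 computation from the testRootNSingularity comment: the +infinity * 0 term is NaN under IEEE 754 arithmetic, so the sum is NaN even though the mathematically expected answer is -infinity.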