Le 14/05/2015 01:29, Gilles a écrit :

> On Wed, 13 May 2015 19:43:09 +0200, Luc Maisonobe wrote:

>> Le 13/05/2015 11:35, Gilles a écrit :

>>> Hello.

>>>

>>> On Wed, 13 May 2015 11:12:16 +0200, Luc Maisonobe wrote:

>>>> Hi all,

>>>>

>>>> As Jenkins failures appeared again, I restarted work on FastMath.pow.

>>>> I'll keep you informed as I make progress.

>>>

>>> Did someone see those failures anywhere else other than on the

>>> identified machine used for CI?

>>

>> I did not reproduce them myself, and I guess most of us didn't succeed

>> either (let's remember last year when Thomas and Sebb tried to fix the

>> exp function).

>>

>> However, the configuration is not a rare one (Ubuntu, Java 7, Sun JVM),

>> so I fear this bug, even if rare, can hit several people.

>

> How much damage can it do, apart from a false positive on a unit test?

> Isn't it the configuration that is faulty, rather than the CM code?

There are at least some (small) problems in the code.

I reimplemented the method, relying more heavily on the bit

representation to detect special cases. This new implementation also

delegates all integer powers to pow(double, long), which handles these

cases directly with high-accuracy multiplication and reciprocal.
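The integer-exponent detection can be sketched as follows. This is only an illustration of the bit-pattern technique, under my own naming; the helper is hypothetical and not the actual FastMath internals:

```java
// Sketch only: decide from the IEEE 754 bit pattern whether an exponent
// is an exact integer, so the call can be delegated to an integer-power
// routine. The class and method names are hypothetical.
public final class IntegerExponentSketch {

    static boolean isExactInteger(double y) {
        long bits = Double.doubleToRawLongBits(y);
        int unbiasedExp = (int) ((bits >>> 52) & 0x7ffL) - 1023;
        if (unbiasedExp < 0) {
            // magnitude strictly below 1: only +/-0 is an integer
            return y == 0.0;
        }
        if (unbiasedExp >= 52) {
            // every finite double this large is an integer; the all-ones
            // exponent field (NaN, infinities) maps to 1024 and is excluded
            return unbiasedExp < 1024;
        }
        // integer iff no fraction bit below the binary point is set
        long fractionMask = 0x000fffffffffffffL >>> unbiasedExp;
        return (bits & fractionMask) == 0L;
    }

    public static void main(String[] args) {
        System.out.println(isExactInteger(-142.0)); // true
        System.out.println(isExactInteger(0.5));    // false
    }
}
```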

This showed that our previous implementation was sometimes slightly

wrong. For example, during one test in the gamma distribution suite, we

compute 147.2421875^-142. Our former implementation returned

1.3785386632060381807...e-308, and the new implementation returns

1.3785386632060391688...e-308 whereas the exact result should be

1.3785386632060390905...e-308. These are denormalized numbers, i.e.

their mantissas have less than 53 significant bits (here, they have 52

bits). The mantissas are:

former implementation  0x9E9AA8184555B

new implementation     0x9E9AA8184555D

exact result           0x9E9AA8184555C.D7655353C...

So we had more than 1.8 ulp error in this case and now we have 0.16 ulp

error.
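The ulp distances above can be checked directly from the bit patterns: for two finite doubles of the same sign, the difference of their raw bits is exactly the number of representable values between them. A small sketch (the constants are the subnormal results quoted above, whose exponent field is 0, so the raw bits are just the mantissa):

```java
// Sketch: count ulps between same-sign finite doubles by comparing
// their raw bit patterns.
public final class UlpDistanceSketch {

    static long ulpDistance(double a, double b) {
        // valid for finite doubles of the same sign
        return Math.abs(Double.doubleToLongBits(a) - Double.doubleToLongBits(b));
    }

    public static void main(String[] args) {
        double former    = Double.longBitsToDouble(0x9E9AA8184555BL);
        double corrected = Double.longBitsToDouble(0x9E9AA8184555DL);
        System.out.println(ulpDistance(former, corrected)); // 2
    }
}
```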

Comparing all the calls made in our test suite for the full library,

the differences between the implementations always occur in the integer

power cases (which is expected if you look at the code); they are quite

rare and generally only 1 ulp apart. The case above with a 2 ulp

difference (from -1.8 to +0.2) is rare.

Performance is equivalent to the previous implementation; the new code

is neither faster nor slower. On my 7-year-old desktop computer running

Linux, FastMath.pow takes about 0.69 times as long as StrictMath.pow,

as per Gilles' performance benchmarks.

Two JUnit tests no longer passed, and it took me a long time to

understand why. The failures were really due to those 1 or 2 ulp differences.

The first test was for the BOBYQA optimizer, with the DiffPow

function, which computes:

f(x) = abs(x0)^2 + abs(x1)^4 + abs(x2)^6 +

       abs(x3)^8 + abs(x4)^10 + abs(x5)^12
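For reference, that objective can be sketched as follows (class and method names are illustrative, not the actual test-suite code):

```java
// Sketch of the DiffPow objective: sum of abs(x_i)^(2(i+1)) for
// i = 0 .. 5, i.e. powers 2, 4, 6, 8, 10, 12.
public final class DiffPowSketch {

    static double diffPow(double[] x) {
        double f = 0.0;
        for (int i = 0; i < x.length; ++i) {
            f += Math.pow(Math.abs(x[i]), 2 * (i + 1));
        }
        return f;
    }

    public static void main(String[] args) {
        System.out.println(diffPow(new double[] {0, 0, 0, 0, 0, 0})); // 0.0
        System.out.println(diffPow(new double[] {1, 1, 1, 1, 1, 1})); // 6.0
    }
}
```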

The last term (power 12) is sometimes 1 ulp away from the previous

implementation (and I think closer to the exact result). The optimizer

does converge to the expected result (which is obviously all xi = 0),

but the number of iterations is extremely large and highly dependent

on the accuracy. With the former implementation, it took about 8000

iterations on my computer, for a limit set to 12000 for this test. With

the new implementation, I needed to raise the limit to 21000 to avoid

failing with a max-number-of-iterations error. This test really seems

to be a stress test for this optimizer. The values computed are really

small, with denormalized numbers.
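Such values can be spotted with a simple magnitude test; as a sketch (names illustrative):

```java
// Sketch: a value is subnormal (denormalized) when it is non-zero but
// smaller in magnitude than Double.MIN_NORMAL (about 2.225e-308); such
// values carry fewer than 53 significant mantissa bits.
public final class SubnormalSketch {

    static boolean isSubnormal(double x) {
        return x != 0.0 && Math.abs(x) < Double.MIN_NORMAL;
    }

    public static void main(String[] args) {
        System.out.println(isSubnormal(1.3785386632060391688e-308)); // true
        System.out.println(isSubnormal(1.0));                        // false
    }
}
```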

The second test was for the gamma distribution, corresponding to a

shape parameter of 142. Only one evaluation of pow was different: the

one shown above. The two ulp difference implied that the error in the

non-overflow part of the distribution (with x from 0 to 150) was

between 1 and 6 ulps, with a mean of about 3.54 ulps, whereas with the

previous implementation it was always 0 or 1 ulp, with a mean slightly

below 0.5 ulp. So I had to raise the test tolerance to make it pass.

This seems strange to me. I tried to recompute the reference files

using the maxima script and increasing the accuracy from 64 digits to 96

digits, but got similar behaviour. The tests for other shape parameters

do all pass (values 1, 8, 10, 100 and 1000). Only value 142 fails. I

don't even know why the value 142 was used (if it were 42, I would have

taken it as a well-known humorous reference). Perhaps it was already a

borderline case.

So here we are. A new implementation is available in the h10-builds

branch. It is as fast as the previous one, is probably slightly more

accurate for integer powers, but makes two tests fail unless we adjust

some test parameters or tolerances. The H10 host seems to no longer

trigger the JIT optimization bug with this implementation.

What do we do? Do we put this back to the master branch?

best regards,

Luc

>

> Best,

> Gilles

>

>

> ---------------------------------------------------------------------

> To unsubscribe, e-mail:

[hidden email]
> For additional commands, e-mail:

[hidden email]
>

>

>
