[Math] "NaturalRankingTest"


Gilles Sadowski
Hello.

Test method "testNaNsFixedTiesRandom" (in the above unit test class)
can pass or fail depending on the seed value.

When seed is 1000, the test passes.

When seed is 1001, this failure occurs:
  Elements at index 1 differ.  expected = 3.0 observed = 4.0
  Elements at index 4 differ.  expected = 3.0 observed = 2.0

When seed is 1002, this failure occurs:
  Elements at index 1 differ.  expected = 3.0 observed = 2.0
  Elements at index 4 differ.  expected = 3.0 observed = 4.0
  Elements at index 8 differ.  expected = 2.0 observed = 3.0

When seed is 1003, this failure occurs:
  Elements at index 8 differ.  expected = 2.0 observed = 4.0

When seed is 1004, this failure occurs:
  Elements at index 1 differ.  expected = 3.0 observed = 2.0

When seed is 1005, this failure occurs:
  Elements at index 4 differ.  expected = 3.0 observed = 2.0
  Elements at index 8 differ.  expected = 2.0 observed = 3.0

When seed is 1006, this failure occurs:
  Elements at index 1 differ.  expected = 3.0 observed = 4.0
  Elements at index 4 differ.  expected = 3.0 observed = 4.0
  Elements at index 8 differ.  expected = 2.0 observed = 3.0

When seed is 1007, this failure occurs:
  Elements at index 1 differ.  expected = 3.0 observed = 2.0
  Elements at index 4 differ.  expected = 3.0 observed = 4.0

When seed is 1008, this failure occurs:
  Elements at index 1 differ.  expected = 3.0 observed = 2.0
  Elements at index 8 differ.  expected = 2.0 observed = 4.0

When seed is 1009, this failure occurs:
  Elements at index 1 differ.  expected = 3.0 observed = 2.0

When seed is 1010, this failure occurs:
  Elements at index 1 differ.  expected = 3.0 observed = 4.0
  Elements at index 4 differ.  expected = 3.0 observed = 2.0
  Elements at index 8 differ.  expected = 2.0 observed = 3.0

Also fails when seed is
  1011
  1012
  1013
  1014
  1015
  1016
  1017
  112351341
  -932524

Is that expected behaviour?
It does not look trivial to understand why one should trust a
test that fails most of the time...


Regards,
Gilles


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [Math] "NaturalRankingTest"

Gilles Sadowski
On Sat, 14 May 2016 02:47:18 +0200, Gilles wrote:

> Is that expected behaviour?
> It does not look trivial to understand why one should trust a
> test that fails most of the time...

The test fails for ~96.3% of the possible seed values.
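The ~96.3% figure is what one would get under a simple model, which is an assumption and not something confirmed in this thread: the elements at indices 1, 4 and 8 are tied, each is independently given one of the ranks 2, 3 or 4 with equal probability, and the assertion compares against one fixed outcome. The sketch below enumerates that model exactly; the names `TIED_RANKS`, `EXPECTED` and `failure_probability` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed model (not confirmed by the thread): three tied elements, each
# independently assigned one of the ranks 2, 3 or 4 with equal probability.
TIED_RANKS = (2, 3, 4)
# Expected ranks at indices 1, 4 and 8, taken from the failure messages.
EXPECTED = (3, 3, 2)


def failure_probability(expected, candidate_ranks):
    """Exact probability that a random assignment differs from `expected`."""
    outcomes = list(product(candidate_ranks, repeat=len(expected)))
    passing = sum(1 for outcome in outcomes if outcome == tuple(expected))
    return 1 - Fraction(passing, len(outcomes))


p_fail = failure_probability(EXPECTED, TIED_RANKS)
print(p_fail, f"{float(p_fail):.1%}")  # 26/27 96.3%
```

Under this reading only 1 outcome in 27 matches the fixed expectation, so a single such assertion would pass for roughly 3.7% of seeds.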

Possible causes for such behaviour:
  1. unit test is buggy
  2. code being tested is buggy

Are there other possible causes?


Regards,
Gilles



Re: [Math] "NaturalRankingTest"

Matt Sicker
3. Hardware bugs? Rare but possible.

On 14 May 2016 at 08:01, Gilles <[hidden email]> wrote:

> The test fails for ~96.3% of the possible seed values.
>
> Cause for such a behaviour can be:
>  1. unit test is buggy
>  2. code being tested is buggy
>
> Are there other possible causes?


--
Matt Sicker <[hidden email]>

Re: [Math] "NaturalRankingTest"

Gilles Sadowski
Hi.

On Sat, 14 May 2016 15:05:54 -0500, Matt Sicker wrote:
> 3. Hardware bugs? Rare but possible.

The behaviour is the same on two different machines.

As it is blocking MATH-1341, I can do one of the following:
1. annotate the test with "@Ignore"
2. find a seed that will make the test pass
3. remove the assertion that fails

At first sight, I'd think that the first option is a little safer
as it leaves a trace that something might not work as intended
(or, if the behaviour is expected, that the reason for the test
failing so often must be documented).

Please advise.

Regards,
Gilles


Re: [Math] "NaturalRankingTest"

Gilles Sadowski
On Sun, 15 May 2016 20:48:22 +0200, Gilles wrote:

>>> The test fails for ~96.3% of the possible seed values.

Method "testNaNsFixedTiesRandom" actually contains 6 assertions.
I tested 10000000 ("long") seed values.
Only the following

seed = 1000
seed = 1468109
seed = 1539722
seed = 1831917
seed = 2497119
seed = 4063034
seed = 4291147
seed = 4571858

would lead all assertions to succeed.
With 8 passing seeds out of 10000000, the failure rate for this unit
test is thus about 99.99992%.
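
A seed scan of this kind can be sketched as below. This is a toy stand-in, not the Commons Math test: `ranks_random_ties`, `DATA`, `EXPECTED`, `passes`, `scan` and `failure_rate` are hypothetical names, and the ranking routine only assumes that tied values each receive a random rank from the range the tie occupies. The last line reproduces the arithmetic for the counts reported here: 8 passing seeds out of 10000000 gives 1 - 8/10000000, i.e. 99.99992%.

```python
import random

# Hypothetical data: three tied values occupying ranks 2..4.
DATA = [1.0, 2.0, 2.0, 2.0, 3.0]
# A fixed expectation, as a test with hard-coded ranks would use.
EXPECTED = [1, 3, 3, 3, 5]


def ranks_random_ties(values, rng):
    """Rank values from 1; each tied value gets a random rank within its tie's range."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend `end` to cover the whole run of equal values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for i in order[pos:end + 1]:
            ranks[i] = rng.randint(pos + 1, end + 1)
        pos = end + 1
    return ranks


def passes(seed):
    """True if the seeded ranking happens to equal the fixed expectation."""
    return ranks_random_ties(DATA, random.Random(seed)) == EXPECTED


def scan(seeds):
    """Return the seeds for which the toy assertion would succeed."""
    return [seed for seed in seeds if passes(seed)]


def failure_rate(passing, total):
    return 1 - passing / total


seeds = range(10_000)
passing = scan(seeds)
print(len(passing), f"{failure_rate(len(passing), len(seeds)):.2%}")
print(f"{failure_rate(8, 10_000_000):.5%}")  # 99.99992%
```

Picking one of the passing seeds makes such a test green without making the assertion any more meaningful, since the expected ranks still encode a single random outcome.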


Gilles
