[RNG] Scope of "Commons RNG"

Gilles Sadowski
Hello.

This is a post to ask about what we want "Commons RNG"
to be (as a service to the users).

In the Wikipedia pages referred to in the following
reports
   https://issues.apache.org/jira/browse/RNG-16
   https://issues.apache.org/jira/browse/RNG-17
the take-away message (IIUC) is that LCG and LFG are
almost never to be used.[1]

If we want "Commons RNG" to be a repository of all
generators that exist out there, irrespective of their
known weaknesses, it's fine; but we should be careful to
not let casual users just pick one of the implementations
on the premise that the library focuses on high quality
generators.

On the other hand, we could be wary of adding code[2]
that we'd then recommend to not use...

If, from some POV, it is deemed useful to have those,
the scope and overview of the component should mention,
prominently, the "caveat".[3]

I have no issue with adding any new implementation,[4]
on the conditions that it comes with
  1. a unit test where the output (say, a few hundred
     numbers) of "Commons RNG" is compared against a
     "reference" implementation,[5]
  2. the outputs of the "RandomStressTester"[6] piping
     from the "Dieharder" and "TU01/BigCrush" actual
     stress test suites.[7]

We should add a note to that effect somewhere in the
documentation, perhaps in "CONTRIBUTING.md" (?).
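
A minimal sketch of the piping idea behind condition 2, assuming the
external test suite can read raw binary data from its standard input.
The class name "RawIntWriter" and the stand-in generator are hypothetical;
this is not the actual "RandomStressTester" from "src/userguide/java".

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.SplittableRandom;
import java.util.function.IntSupplier;

/** Hypothetical helper (an assumption, not part of "Commons RNG"). */
final class RawIntWriter {
    private RawIntWriter() {}

    /** Writes {@code count} 32-bit values from {@code source} as big-endian bytes. */
    static void write(IntSupplier source, int count, OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(new BufferedOutputStream(out));
        for (int i = 0; i < count; i++) {
            data.writeInt(source.getAsInt());
        }
        data.flush();
    }

    public static void main(String[] args) throws IOException {
        // Number of integers to emit; a real stress test needs far more than the default.
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 1024;
        // Stand-in generator; a real run would plug in the implementation under test.
        SplittableRandom standIn = new SplittableRandom(42L);
        write(standIn::nextInt, count, System.out);
    }
}
```

The output of such a program would be piped into the test suite's
standard input; the byte order chosen here (big-endian) is an arbitrary
choice of this sketch.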

Regards,
Gilles

[1] Emmanuel, if you don't mind, we'd thus set the JIRA
     issue "type" to "wish" rather than "improvement".
[2] https://xkcd.com/221/
[3] Up to now, I had assumed that no known-to-be-bad
     generators would be part of "Commons RNG" (except
     "JDK", for reference purposes).
[4] It is not a problem to wait another couple of weeks
     for the additional code, before releasing 1.0.
[5] I.e. _not_ a "pre-run" of the same implementation!
[6] Source is in "src/userguide/java".
[7] That software has to be installed separately.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [RNG] Scope of "Commons RNG"

Emmanuel Bourg-3
On 21/09/2016 at 14:46, Gilles wrote:

> If we want "Commons RNG" to be a repository of all
> generators that exist out there, irrespective of their
> known weaknesses, it's fine; but we should be careful to
> not let casual users just pick one of the implementations
> on the premise that the library focuses on high quality
> generators.

I think it's fine to have weaker implementations as long as they are
properly documented with the necessary warnings. There aren't that many
algorithms anyway, we'll quickly have the interesting ones.


> I have no issue with adding any new implementation,[4]
> on the conditions that it comes with
>  1. a unit test where the output (say, a few hundred
>     numbers) of "Commons RNG" is compared against a
>     "reference" implementation,[5]
>  2. the outputs of the "RandomStressTester"[6] piping
>     from the "Dieharder" and "TU01/BigCrush" actual
>     stress test suites.[7]

Sounds fair


> [1] Emmanuel, if you don't mind, we'd thus set the JIRA
>     issue "type" to "wish" rather than "improvement".

As you want, that doesn't make a big difference. It could even qualify
for the "New Feature" type.

> [2] https://xkcd.com/221/

Now I'm tempted to implement an XKCDRandomGenerator just for fun :)

> [3] Up to now, I had assumed that no known-to-be-bad
>     generators would be part of "Commons RNG" (except
>     "JDK", for reference purposes).

Note that as time goes on some generators will be supplanted by better
ones, so Commons RNG will inevitably contain implementations weaker than
the then-current state of the art.

> [4] It is not a problem to wait another couple of weeks
>     for the additional code, before releasing 1.0.

Ok, I can try implementing LCGs then.

Emmanuel Bourg



[RNG] New implementations before release of 1.0

Gilles Sadowski
Hi.

Release of version 1.0 will be delayed for another two weeks,
waiting for additional implementations.
See:
   https://issues.apache.org/jira/browse/RNG-16
   https://issues.apache.org/jira/browse/RNG-17

If you intend to try and implement them, please assign the
issue to yourself.
Patches for other algorithms are welcome too.

Please note that any new implementation must come with
(1) a unit test where the output (say, a few hundred
     numbers) of "Commons RNG" is compared against a
     "reference" implementation,
(2) the outputs of the "RandomStressTester" piping
     from the "Dieharder" and "TU01/BigCrush" actual
     stress test suites.

If you provide the Java code and (1), I can take care
of (2), since I have the software installed already.[1]


Regards,
Gilles

[1] Please take into account that it will take about 32
     hours to run "BigCrush".
     Hence, I'll start all the runs (3 "Dieharder" and 3
     "BigCrush", per new implementation) in _one_ go.



Re: [RNG] New implementations before release of 1.0

Jörg Schaible-5
Gilles wrote:

> Hi.
>
> Release of version 1.0 will be delayed for another two weeks,
> waiting for additional implementations.

Then, please, cancel the still running vote also ;-)

Cheers,
Jörg



Re: [RNG] New implementations before release of 1.0

Gilles Sadowski
On Thu, 22 Sep 2016 14:29:37 +0200, Jörg Schaible wrote:
> Gilles wrote:
>
>> Hi.
>>
>> Release of version 1.0 will be delayed for another two weeks,
>> waiting for additional implementations.
>
> Then, please, cancel the still running vote also ;-)

I'll do so (the deadline had passed anyway).

Gilles

Re: [RNG] New implementations before release of 1.0

Jörg Schaible-5
Hi Gilles,

Gilles wrote:

> On Thu, 22 Sep 2016 14:29:37 +0200, Jörg Schaible wrote:
>> Gilles wrote:
>>
>>> Hi.
>>>
>>> Release of version 1.0 will be delayed for another two weeks,
>>> waiting for additional implementations.
>>
>> Then, please, cancel the still running vote also ;-)
>
> I'll do (the deadline was over anyways).

The deadline just means that you wait to close the vote "at least" until.

Cheers,
Jörg




RE: [RNG] New implementations before release of 1.0

Dennis E. Hamilton


> -----Original Message-----
> From: Jörg Schaible [mailto:[hidden email]]
> Sent: Thursday, September 22, 2016 08:50
> To: [hidden email]
> Subject: Re: [RNG] New implementations before release of 1.0
>
> Hi Gilles,
>
> Gilles wrote:
>
> > On Thu, 22 Sep 2016 14:29:37 +0200, Jörg Schaible wrote:
> >> Gilles wrote:
> >>
> >>> Hi.
> >>>
> >>> Release of version 1.0 will be delayed for another two weeks,
> >>> waiting for additional implementations.
> >>
> >> Then, please, cancel the still running vote also ;-)
> >
> > I'll do (the deadline was over anyways).
>
> The deadline just means that you wait to close the vote "at least"
> until.
[orcmid]

I think it means that the vote will be *open* "at least until."  It is common to leave a [VOTE] open either because of discussion or to see if an additional ballot is forthcoming.  I've started using "ending no earlier than <some-UTC-datetime>" in [VOTE]s I run.

I suppose the way [VOTE] threads end clearly is either with a [CANCEL] report or a [RESULT] report.

Re: [RNG] New implementations before release of 1.0

Gilles Sadowski
In reply to this post by Jörg Schaible-5
On Thu, 22 Sep 2016 17:50:23 +0200, Jörg Schaible wrote:

> Hi Gilles,
>
> Gilles wrote:
>
>> On Thu, 22 Sep 2016 14:29:37 +0200, Jörg Schaible wrote:
>>> Gilles wrote:
>>>
>>>> Hi.
>>>>
>>>> Release of version 1.0 will be delayed for another two weeks,
>>>> waiting for additional implementations.
>>>
>>> Then, please, cancel the still running vote also ;-)
>>
>> I'll do (the deadline was over anyways).
>
> The deadline just means that you wait to close the vote "at least"
> until.

I noticed that there are different formulations, with different
meanings.

But for that vote, the text said:
---CUT---
This vote will close in 72 hours, [...]
---CUT---

Regards,
Gilles

Re: [RNG] Scope of "Commons RNG"

Gilles Sadowski
In reply to this post by Emmanuel Bourg-3
Hi.

Reviving this thread following a new feature request:
   https://issues.apache.org/jira/browse/RNG-19

IMHO, the request departs from the initial goal (and, hence
the design "requirements" on which the current code is based).

As I suggested previously on this list, I'm going to request
a new "git" repository for implementing utilities based on
random generators.

First candidates are:
* Non-uniform deviates (i.e. the samplers now defined in
   Commons Math's "o.a.c.math4.distribution" package),
* Shuffling algorithm (cf. Commons Math's "o.a.c.m.MathArrays"),
* Data generation (e.g. random strings, currently defined in
   Commons Math's "o.a.c.m.random.RandomUtils"),
* Syntactic sugar (e.g. strongly-typed factory methods, as
   suggested by Emmanuel during the RC1 vote),
* Bridge/wrappers (as suggested by Emmanuel in RNG-19, on JIRA).

Thus, "RNG Utils" would have a much looser scope, allowing
for experimenting with user-requested code.[1][2]

Independently, I'm also wondering about removing the "JDK" element
from the "o.a.c.rng.RandomSource" enum.
Rationale is that this RNG should not be used.[3]
Once the LCG family of generators is available[4], the algorithm
provided by the JDK can be emulated.[5]
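
A minimal sketch of what such an emulation, and the unit test mentioned
in [5], could look like. The constants are those given in the
"java.util.Random" documentation; the class name "JdkLikeLcg" is
hypothetical and is not the code proposed in RNG-16. Only the sequence
of 32-bit integers is compared.

```java
import java.util.Random;

/** Illustrative 48-bit LCG using the constants documented for java.util.Random. */
final class JdkLikeLcg {
    private static final long MULTIPLIER = 0x5DEECE66DL;
    private static final long INCREMENT = 0xBL;
    private static final long MASK = (1L << 48) - 1;

    private long state;

    JdkLikeLcg(long seed) {
        // Same initial scrambling as documented for Random.setSeed(long).
        state = (seed ^ MULTIPLIER) & MASK;
    }

    /** Advances the 48-bit state and returns its 32 high-order bits. */
    int nextInt() {
        state = (state * MULTIPLIER + INCREMENT) & MASK;
        return (int) (state >>> 16);
    }

    public static void main(String[] args) {
        long seed = 12345L;
        Random reference = new Random(seed);
        JdkLikeLcg lcg = new JdkLikeLcg(seed);
        // Compare a few hundred values against the reference implementation.
        for (int i = 0; i < 500; i++) {
            if (reference.nextInt() != lcg.nextInt()) {
                throw new AssertionError("Mismatch at index " + i);
            }
        }
        System.out.println("500 values match");
    }
}
```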

WDYT?


Regards,
Gilles

[1] Thus avoiding any impact on the stability of "Commons RNG" as a
     simple, no-dependency, repository of PRNG algorithms ported to
     Java (and usable as such).
[2] I'd also suggest to copy/move to that new component the related
     utilities currently defined in Commons Lang.
[3] Users that _want_ to use "java.util.Random" for some reason will
     probably be better off using it directly.
[4] https://issues.apache.org/jira/browse/RNG-16
[5] To be confirmed by a unit test...


Re: [RNG] Scope of "Commons RNG"

Brent Worden-2
I would keep the JDK source.  My reasoning being:

1. Users that want to use java.util.Random would not be able to use some or
all of the RNG Utils code as the latter will probably rely on RandomSource
instances.

2. With LCGs the current Random implementation provided by Oracle could
possibly be emulated by commons-rng.  However, there is no guarantee the
current implementation is used by all JDKs.  Also, there is no guarantee the
Oracle implementation won't change in future versions of their JDK to
something that cannot be emulated by commons-rng.

So, to give users more flexibility in choosing a RandomSource and to be
resilient to Random change, I feel the JDK source is beneficial.

Brent


Re: [RNG] Scope of "Commons RNG"

Emmanuel Bourg-3
In reply to this post by Gilles Sadowski
On 26/09/2016 at 18:33, Gilles wrote:

> As I suggested previously on this list, I'm going to request
> a new "git" repository for implementing utilities based on
> random generators.

I suggest waiting until RNG 1.0 is out and we have a clearer view of the
scope of the components. We can still experiment with "rng-tools" on a
separate branch or with a Maven module in the same repository.


> First candidates are:
> * Non-uniform deviates (i.e. the samplers now defined in
>   Commons Math's "o.a.c.math4.distribution" package),

I agree this doesn't belong to commons-rng, but I'm not convinced it
would fit a commons-rng-tools component. Maybe a component more targeted
toward statistic algorithms?


> * Shuffling algorithm (cf. Commons Math's "o.a.c.m.MathArrays"),

This should go in the ArrayUtils class of commons-lang, with a
java.util.Random parameter.


> * Data generation (e.g. random strings, currently defined in
>   Commons Math's "o.a.c.m.random.RandomUtils"),

I'm not familiar with this, it looks linked to the distribution stuff.


> * Syntactic sugar (e.g. strongly-type factory methods, as
>   suggested by Emmanuel during the RC1 vote),

Having two different factories in two different components for the same
objects is odd. I prefer only one factory.


> * Bridge/wrappers (as suggested by Emmanuel in RNG-19, on JIRA).

A mere random generator should be in commons-rng. And the
java.util.Random bridge also belongs to commons-rng in my opinion,
that's similar to the methods converting from/to java.util.Properties in
commons-configuration.

Emmanuel Bourg



Re: [RNG] Scope of "Commons RNG"

Gilles Sadowski
On Tue, 27 Sep 2016 00:37:26 +0200, Emmanuel Bourg wrote:

> On 26/09/2016 at 18:33, Gilles wrote:
>
>> As I suggested previously on this list, I'm going to request
>> a new "git" repository for implementing utilities based on
>> random generators.
>
> I suggest waiting until RNG 1.0 is out and we have a clearer view of
> the
> scope of the components. We can still experiment with "rng-tools" on
> a
> separate branch or with a Maven module in the same repository.

My point is that we don't have a clear view.

An additional module in Commons RNG will not help, as was noticed
some time ago: it won't be possible/allowed to release the modules
separately.

I don't want to have to release a major version of Commons RNG
just because accessory tools require it.
Users should not wonder with each new version whether there was
a bug or a change in the core functionality.

Moreover, as an example, you don't seem to like my choice of
factory.  Adding layers and tools in the same component will
multiply the discussions about taste and mileage to no end.
As there was in CM.  As it happens now.

Another git repository will avoid that temptation.

>> First candidates are:
>> * Non-uniform deviates (i.e. the samplers now defined in
>>   Commons Math's "o.a.c.math4.distribution" package),
>
> I agree this doesn't belong to commons-rng, but I'm not convinced it
> would fit a commons-rng-tools component. Maybe a component more
> targeted
> toward statistic algorithms?

Sampling and statistics do not necessarily belong together.
[This was a discussion in CM.]

But sampling is a direct use of RNG.

>> * Shuffling algorithm (cf. Commons Math's "o.a.c.m.MathArrays"),
>
> This should go in the ArrayUtils class of commons-lang, with a
> java.util.Random parameter.

I don't get that.
The idea is to parameterize the utilities with a
"UniformRandomProvider" instance.
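
A minimal sketch of a shuffling utility parameterized that way. To keep
it self-contained, "BoundedIntSource" is a hypothetical stand-in for the
part of "UniformRandomProvider" that the utility needs (an integer in
[0, n)); the class and method names are assumptions, not the Commons
Math or Commons RNG code.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

/** Hypothetical stand-in for the subset of "UniformRandomProvider" used here. */
interface BoundedIntSource {
    /** Returns a value in the range [0, n). */
    int nextInt(int n);
}

final class ShuffleSketch {
    private ShuffleSketch() {}

    /** In-place Fisher-Yates shuffle driven by the supplied source. */
    static void shuffle(int[] array, BoundedIntSource rng) {
        for (int i = array.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1); // 0 <= j <= i
            int tmp = array[i];
            array[i] = array[j];
            array[j] = tmp;
        }
    }

    public static void main(String[] args) {
        // Any generator exposing "nextInt(int)" can be plugged in.
        SplittableRandom standIn = new SplittableRandom(7L);
        int[] data = {1, 2, 3, 4, 5, 6};
        shuffle(data, standIn::nextInt);
        System.out.println(Arrays.toString(data));
    }
}
```

The utility depends only on the small interface, so the choice of
generator stays with the caller.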

>> * Data generation (e.g. random strings, currently defined in
>>   Commons Math's "o.a.c.m.random.RandomUtils"),
>
> I'm not familiar with this, it looks linked to the distribution
> stuff.

No, it contains code that was previously in the "RandomDataGenerator"
class; that latter class indeed used to duplicate the sampling utilities
in the CM "distribution" package.

It also contains functionality that was also developed in Commons Lang,
as well as generation of hexadecimal strings, picking randomly from a
Collection, etc.

>> * Syntactic sugar (e.g. strongly-type factory methods, as
>>   suggested by Emmanuel during the RC1 vote),
>
> Having two different factories in two different components for the
> same
> objects is odd. I prefer only one factory.

So be it; there is one.

I was trying to take care of your use-case.
I think that it is fine to wrap "my" factory into another
offered as syntactic sugar to allow an intelligent IDE
to do its job.

>> * Bridge/wrappers (as suggested by Emmanuel in RNG-19, on JIRA).
>
> A mere random generator should be in commons-rng.

Wrapping "/dev/random" is out of scope: it is not a mere random
generator.
Scope is: algorithms implemented in Java, in Commons RNG, all
providing the same basic functionality (like save/restore).

> And the
> java.util.Random bridge also belongs to commons-rng in my opinion,
> that's similar to the methods converting from/to java.util.Properties
> in
> commons-configuration.

As said, the view is not clear; this is not core functionality
(cf. above), it is convenience/utility.

We have to ask questions and provide convincing answers _before_
adding potentially useless code.

What are use-cases for "RandomSource.JDK"?


Gilles

Re: [RNG] Scope of "Commons RNG"

Gilles Sadowski
In reply to this post by Brent Worden-2
On Mon, 26 Sep 2016 16:10:12 -0500, Brent Worden wrote:
> I would keep the JDK source.  My reasoning being:
>
> 1. Users that want to use java.util.Random would not be able to use
> some or
> all of the RNG Utils code as the later will probably relay on
> RandomSource
> instances.

I don't understand the above.
Could you provide an example of what should be, but won't
be possible?

> 2. With LCGs the current Random implementation provided by Oracle
> could
> possibly be emulated by commons-rng.  However, there is no guarantee
> the
> current implementation is used by all JDKs.

There is; the generator definition is part of the API.

> Also, there a no guarantee the
> Oracle implement doesn't change in future versions of their JDK

I think there is for the same reason as above.

> and change
> to something that can not be emulated by commons-rng.

"RandomSource.JDK" does _not_ emulate "java.util.Random".
The only guarantee is that both provide the same sequences of
32-bit integers.
I.e. no such guarantee for any of the other "nextXxx" methods!
[See the docs.]

Regards,
Gilles

Re: [RNG] Scope of "Commons RNG"

Brent Worden-2
In reply to this post by Gilles Sadowski
>>> First candidates are:
>>> * Non-uniform deviates (i.e. the samplers now defined in
>>>   Commons Math's "o.a.c.math4.distribution" package),
>>
>> I agree this doesn't belong to commons-rng, but I'm not convinced it
>> would fit a commons-rng-tools component. Maybe a component more targeted
>> toward statistic algorithms?
>
> Sampling and statistics do not necessarily belong together.
> [This was a discussion in CM.]
>
+1

Having used commons-math to generate random deviates, the commons-math
approach of coupling random deviate generation to the distributions
themselves proved to be a bad development experience as well as not the
most performant in the form of object instantiations.  Specialized
random variate generators would be a lot better design IMO.
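
A minimal sketch of what a specialized variate generator, decoupled from
any distribution class, might look like. The class name
"ExponentialSamplerSketch" is hypothetical, and the inversion method is
chosen here only because it is short; it is not the Commons Math
implementation.

```java
import java.util.SplittableRandom;
import java.util.function.DoubleSupplier;

/** Hypothetical sampler: exponential deviates by inversion of the CDF. */
final class ExponentialSamplerSketch {
    private final double mean;
    /** Assumed to return uniform values in [0, 1). */
    private final DoubleSupplier uniform;

    ExponentialSamplerSketch(double mean, DoubleSupplier uniform) {
        if (!(mean > 0)) {
            throw new IllegalArgumentException("mean must be positive: " + mean);
        }
        this.mean = mean;
        this.uniform = uniform;
    }

    /** If U is uniform in [0, 1), then -mean * ln(1 - U) is exponentially distributed. */
    double sample() {
        return -mean * Math.log1p(-uniform.getAsDouble());
    }

    public static void main(String[] args) {
        SplittableRandom standIn = new SplittableRandom(3L);
        // One small object per sampler; no distribution instance is needed.
        ExponentialSamplerSketch sampler = new ExponentialSamplerSketch(2.0, standIn::nextDouble);
        for (int i = 0; i < 5; i++) {
            System.out.println(sampler.sample());
        }
    }
}
```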

Re: [RNG] Scope of "Commons RNG"

Brent Worden-2
In reply to this post by Gilles Sadowski
On Mon, Sep 26, 2016 at 6:21 PM, Gilles <[hidden email]>
wrote:

> On Mon, 26 Sep 2016 16:10:12 -0500, Brent Worden wrote:
>
>> I would keep the JDK source.  My reasoning being:
>>
>> 1. Users that want to use java.util.Random would not be able to use some or
>> all of the RNG Utils code as the latter will probably rely on RandomSource
>> instances.
>>
>
> I don't understand the above.
> Could you provide an example of what should be, but won't
> be possible?
>

I misspoke when I stated RandomSource.  The RNG Utils code will probably
rely on UniformRandomProvider as its engine for random numbers.  So, if I
wanted to use java.util.Random as the engine, I was thinking of using the JDK
random source to create the equivalent UniformRandomProvider.  I believe
what you are suggesting is that if I wanted OOTB java.util.Random behavior I
would create the equivalent LCG-based UniformRandomProvider (once said
functionality is available in commons-rng).  I guess that is acceptable if
such a UniformRandomProvider creation is in the form of a convenience
factory method, so I am not required to know the correct LCG parameters to
use.

>> 2. With LCGs the current Random implementation provided by Oracle could
>> possibly be emulated by commons-rng.  However, there is no guarantee the
>> current implementation is used by all JDKs.
>>
>
> There is; the generator definition is part of the API.
>
>
And APIs are subject to change, albeit not very likely with the OOTB
implementation of java.util.Random.  So, if commons-rng is going to provide
a UniformRandomProvider that behaves like java.util.Random, we are
accepting the maintenance of said UniformRandomProvider should the
java.util.Random implementation ever change.  Since the likelihood of
java.util.Random ever changing is remote, using an equivalent LCG-based
UniformRandomProvider is probably acceptable as well, since the probable
maintenance cost is minute.

With that said, I started thinking a bridge to go between the two engines,
UniformRandomProvider and java.util.Random, might be beneficial.  For third
parties that have implemented java.util.Random subclasses, it would be nice
to provide the means to easily adapt their Random implementation to a
UniformRandomProvider so it can be used in commons-rng related code.
Conversely, for third parties that use java.util.Random instances, it would
be nice to easily adapt a UniformRandomProvider to a Random so the
commons-rng generators could be used.

Re: [RNG] Scope of "Commons RNG"

Gilles Sadowski
In reply to this post by Brent Worden-2
On Mon, 26 Sep 2016 21:22:35 -0500, Brent Worden wrote:

>> First candidates are:
>>>> * Non-uniform deviates (i.e. the samplers now defined in
>>>>   Commons Math's "o.a.c.math4.distribution" package),
>>>>
>>>
>>> I agree this doesn't belong to commons-rng, but I'm not convinced
>>> it
>>> would fit a commons-rng-tools component. Maybe a component more
>>> targeted
>>> toward statistic algorithms?
>>>
>>
>> Sampling and statistics do not necessarily belong together.
>> [This was a discussion in CM.]
>>
>>
> +1
>
> Having used commons-math to generate random deviates, the
> commons-math
> approach of coupling random deviate generation to the distributions
> themselves proved to be a bad development experience as well as not
> the
> most performant in the form of object instantiations.

Are you referring to the code in the 3.x line (branch "MATH_3_X"), or in
the development branch (branch "master"[1])?

In unreleased CM4, the sampling is still coupled with the distribution
class but a RNG is passed only when a sampler is created.
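
A minimal sketch of the arrangement described above, where the distribution only holds its parameters and the RNG is handed over once, at sampler creation. All names here ("Sampler", "ExponentialSketch", "createSampler") are hypothetical and are not the CM4 API; "java.util.Random" merely stands in for whatever generator interface would be used.

```java
import java.util.Random;

// Hypothetical names throughout; this is not the Commons Math 4 API.
interface Sampler {
    double sample();
}

final class ExponentialSketch {
    private final double mean;

    ExponentialSketch(double mean) {
        if (!(mean > 0)) {
            throw new IllegalArgumentException("mean must be positive");
        }
        this.mean = mean;
    }

    // The RNG is supplied once, when the sampler is created;
    // the distribution object itself holds no generator.
    Sampler createSampler(final Random rng) {
        return new Sampler() {
            @Override
            public double sample() {
                // Inversion method: u is in [0, 1), so 1 - u is in (0, 1].
                return -mean * Math.log(1 - rng.nextDouble());
            }
        };
    }
}
```

With this shape, drawing many values reuses one small sampler object instead of going through the distribution instance on every call.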

> Specialized
> random variate generators would be a lot better design IMO.

If what's in CM4 is still not satisfactory, could you provide more
details on a better design?

Regards,
Gilles

[1] "Current" development branch was in branch "develop" until last
week.



Re: [RNG] Scope of "Commons RNG"

Gilles Sadowski
In reply to this post by Brent Worden-2
On Mon, 26 Sep 2016 21:23:24 -0500, Brent Worden wrote:

> On Mon, Sep 26, 2016 at 6:21 PM, Gilles
> <[hidden email]>
> wrote:
>
>> On Mon, 26 Sep 2016 16:10:12 -0500, Brent Worden wrote:
>>
>>> I would keep the JDK source.  My reasoning being:
>>>
>>> 1. Users that want to use java.util.Random would not be able to use
>>> some or all of the RNG Utils code, as the latter will probably rely
>>> on RandomSource instances.
>>>
>>
>> I don't understand the above.
>> Could you provide an example of what should be, but won't
>> be possible?
>>
> I misspoke when I stated RandomSource.  The RNG Utils code will
> probably
> rely on UniformRandomProvider as its engine for random numbers.  So,
> if I
> wanted to use java.util.Random as the engine I was thinking to use
> the JDK
> random source to create the equivalent UniformRandomProvider.  I
> believe
> what you are suggesting is if I wanted OOTB java.util.Random behavior
> I
> would create the equivalent LCG based UniformRandomProvider (once
> said
> functionality is available in commons-rng).  I guess that is
> acceptable if
> such a UniformRandomProvider creation is in the form of a
> convenience,
> factory method so I am not required to know the correct LCG
> parameters to
> use.

If the need is to generate the same sequence of integers as would be
generated by "Random", then, indeed, an LCG with the appropriate
settings (multiplier, increment, modulus) should fill the bill.
A custom LCG would be instantiated with a call like
   rng = RandomSource.create(RandomSource.LCG_SELECT, seed,
                             multiplier, increment, modulus);
and a convenience shortcut would be provided as
   rng = RandomSource.create(RandomSource.LCG_JDK, seed);
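
For illustration, here is a minimal sketch of such a parameterized LCG, together with a shortcut that reproduces the "int" sequence of "java.util.Random". The class and method names are hypothetical (nothing like this exists in Commons RNG at this point); the constants (multiplier 0x5DEECE66D, increment 0xB, modulus 2^48, seed scrambled by XOR with the multiplier) are those documented for "java.util.Random". The sketch assumes a power-of-two modulus, passed as a bit count.

```java
// Sketch only: names are illustrative, not Commons RNG API.
final class LcgSketch {
    private final long multiplier;
    private final long increment;
    private final int modulusBits; // modulus is 2^modulusBits
    private final long mask;
    private long state;

    LcgSketch(long initialState, long multiplier, long increment, int modulusBits) {
        if (modulusBits < 32 || modulusBits > 63) {
            throw new IllegalArgumentException("modulusBits must be in [32, 63]");
        }
        this.multiplier = multiplier;
        this.increment = increment;
        this.modulusBits = modulusBits;
        this.mask = (1L << modulusBits) - 1;
        this.state = initialState & mask;
    }

    // Convenience factory: parameters documented for java.util.Random,
    // including the initial scrambling of the seed.
    static LcgSketch likeJdkRandom(long seed) {
        return new LcgSketch(seed ^ 0x5DEECE66DL, 0x5DEECE66DL, 0xBL, 48);
    }

    int nextInt() {
        // state <- (a * state + c) mod 2^modulusBits
        state = (state * multiplier + increment) & mask;
        // Return the 32 most significant bits of the state.
        return (int) (state >>> (modulusBits - 32));
    }
}
```

The factory method is the point of the "convenience shortcut": a caller who only wants "Random"-compatible output never has to know the three constants.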

> 2. With LCGs the current Random implementation provided by Oracle
> could
>>> possibly be emulated by commons-rng.  However, there is no
>>> guarantee the
>>> current implementation is used by all JDKs.
>>>
>>
>> There is; the generator definition is part of the API.
>>
>>
> And APIs are subject to change, albeit not very likely with the OOTB
> implementation of java.util.Random.  So, if commons-rng is going to
> provide
> a UniformRandomProvider that behaves like java.util.Random, we are
> accepting the maintenance of said UniformRandomProvider should the
> java.util.Random implementation ever change.  Since the likelihood of
> java.util.Random ever changing is remote, using an equivalent
> LCG-based UniformRandomProvider is probably acceptable as well, since
> the probable maintenance cost is minute.

If the implementation of "Random" changes, and the need is to produce
the same sequence (of 32-bit integers) as "Random", then using a "fixed"
Commons RNG algo won't work, since Commons RNG and the JDK won't agree
either for older or for newer Java versions.

If it is possible that the JDK API changes in that way, then that would
be a reason to keep "RandomSource.JDK".

But if the JDK API changes in such a way, it would also break existing
applications that rely on the definition of the RNG used in "Random".
I'd have supposed that the development policy of the JDK would not
allow it.

> With that said, I started thinking a bridge to go between the two
> engines,
> UniformRandomProvider and java.util.Random, might be beneficial.  For
> third
> parties that have implemented java.util.Random subclasses, it would
> be nice
> to provide the means to easily adapt their Random implementation to a
> UniformRandomProvider so it can be used in commons-rng related code.

Like method "asUniformRandomProvider" in
"o.a.c.math4.random.RandomUtils"
("master" branch)?

> Conversely, for third parties that use java.util.Random instances, it
> would
> be nice to easily adapt a UniformRandomProvider to a Random so the
> commons-rng generators could be used.

That would be possible (but the bridge must be from "RandomSource",
not "UniformRandomProvider"): see class
"o.a.c.math4.random.RngAdaptor".

Gilles



Re: [lang] Shuffling arrays (was: [RNG] Scope of "Commons RNG")

Emmanuel Bourg-3
In reply to this post by Gilles Sadowski
Le 27/09/2016 à 01:14, Gilles a écrit :

>>> * Shuffling algorithm (cf. Commons Math's "o.a.c.m.MathArrays"),
>>
>> This should go in the ArrayUtils class of commons-lang, with a
>> java.util.Random parameter.
>
> I don't get that.
> The idea is to parameterize the utilities with a "UniformRandomProvider"
> instance.

My suggestion is to add two methods to ArrayUtils in commons-lang for
each primitive type and Object (and maybe a couple more if we want to
shuffle only a subset of the array):

   ArrayUtils.shuffle(Object[] array)
   ArrayUtils.shuffle(Object[] array, java.util.Random rnd)

And if we want to shuffle with a random generator from commons-rng, we
simply convert the UniformRandomProvider into a java.util.Random using
the adapter:

   UniformRandomProvider rng = RandomSource.create(...);
   ArrayUtils.shuffle(array, new JDKRandomAdapter(rng));

or

   UniformRandomProvider rng = RandomSource.create(...);
   ArrayUtils.shuffle(array, rng.asRandom());
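
The two proposed overloads could be sketched as follows, using a Fisher-Yates shuffle. This is only an illustration under assumed names ("ShuffleSketch" is hypothetical and is not commons-lang code); the algorithm choice is an assumption too, picked because it is the usual way to shuffle in place.

```java
import java.util.Random;

// Sketch only: a hypothetical helper, not the commons-lang ArrayUtils class.
final class ShuffleSketch {
    private ShuffleSketch() {
    }

    // Shuffles in place with a default java.util.Random instance.
    static void shuffle(Object[] array) {
        shuffle(array, new Random());
    }

    // Fisher-Yates: walk from the end, swapping each element with a
    // uniformly chosen element at or before its position.
    static void shuffle(Object[] array, Random rnd) {
        for (int i = array.length - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1); // uniform in [0, i]
            Object tmp = array[i];
            array[i] = array[j];
            array[j] = tmp;
        }
    }
}
```

Any "java.util.Random" subclass can be passed as the second argument, which is what makes the adapter approach above workable without adding a dependency.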

Emmanuel Bourg



Re: [lang] Shuffling arrays (was: [RNG] Scope of "Commons RNG")

Gilles Sadowski
Hi.

On Tue, 27 Sep 2016 12:53:33 +0200, Emmanuel Bourg wrote:

> Le 27/09/2016 à 01:14, Gilles a écrit :
>
>>>> * Shuffling algorithm (cf. Commons Math's "o.a.c.m.MathArrays"),
>>>
>>> This should go in the ArrayUtils class of commons-lang, with a
>>> java.util.Random parameter.
>>
>> I don't get that.
>> The idea is to parameterize the utilities with a
>> "UniformRandomProvider"
>> instance.
>
> My suggestion is to add two methods to ArrayUtils in commons-lang for
> each primitive type and Object (and maybe a couple more if we want to
> shuffle only a subset of the array):
>
>    ArrayUtils.shuffle(Object[] array)
>    ArrayUtils.shuffle(Object[] array, java.util.Random rnd)

I (strongly) suggest

   ArrayUtils.shuffle(Object[] array, o.a.c.rng.UniformRandomProvider rnd)

>
> And if we want to shuffle with a random generator from commons-rng,
> we
> simply convert the UniformRandomProvider into a java.util.Random
> using
> the adapter:
>
>    UniformRandomProvider rng = RandomSource.create(...);
>    ArrayUtils.shuffle(array, new JDKRandomAdapter(rng));
>
> or
>
>    UniformRandomProvider rng = RandomSource.create(...);
>    ArrayUtils.shuffle(array, rng.asRandom());

Similarly, we'd rather overload "shuffle" as follows

   ArrayUtils.shuffle(Object[] array, java.util.Random rnd) {
     shuffle(array, RandomUtils.asUniformRandomProvider(rnd));
   }

where "RandomUtils" is currently in CM (package "o.a.c.math4.random").

It is not a matter of taste (cf. caveat in
"o.a.c.math4.random.RngAdaptor").
The factory method "asUniformRandomProvider" creates an instance that
redirects all the interface methods to their counterpart in "Random"
(when they exist): the sequence of any type is the same, whether the
Random instance is wrapped or not.  "RngAdaptor" however creates a
"Random" instance where only sequences of 32-bit integers are preserved.
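
The asymmetry between the two directions can be sketched as follows. This is not the CM code: "IntLongSource" is a hypothetical stand-in for the provider interface, and "BridgeSketch" with its two methods is invented for the illustration. Wrapping a "Random" delegates each method one-to-one, whereas the reverse bridge can only override "next(int)", so every other "Random" method is rebuilt from 32-bit draws.

```java
import java.util.Random;

// Hypothetical stand-in for a uniform provider interface;
// this is not the Commons RNG UniformRandomProvider type.
interface IntLongSource {
    int nextInt();
    long nextLong();
}

final class BridgeSketch {
    private BridgeSketch() {
    }

    // Random -> source: each method is redirected to its counterpart,
    // so sequences of every type match the unwrapped instance.
    static IntLongSource wrap(final Random rnd) {
        return new IntLongSource() {
            @Override
            public int nextInt() {
                return rnd.nextInt();
            }

            @Override
            public long nextLong() {
                return rnd.nextLong();
            }
        };
    }

    // Source -> Random: only next(int) can be overridden. Random.nextInt()
    // calls next(32), so the int sequence is that of the source, but
    // nextLong(), nextDouble(), etc. use Random's own composition of
    // next(int) calls, not the source's methods.
    static Random asRandom(final IntLongSource source) {
        return new Random() {
            @Override
            protected int next(int bits) {
                return source.nextInt() >>> (32 - bits);
            }
        };
    }
}
```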

Moreover, the default RNG should be a good one, i.e. not
"java.util.Random".

But overall it would be much better to put all this in a new component
and deprecate all of CL's "Random"-parameterized methods.
It was noted (not only by me) that CL grew too big (and out of its
original scope).  "RandomUtils" is relatively small (in Lang 3.4): now
is a good opportunity to deprecate these few methods and those intended
for 3.5, and redirect users to a dedicated component.


Gilles

> Emmanuel Bourg
>
>



Re: [lang] Shuffling arrays

Emmanuel Bourg-3
Le 27/09/2016 à 13:22, Gilles a écrit :

> I (strongly) suggest
>
>   ArrayUtils.shuffle(Object[] array, o.a.c.rng.UniformRandomProvider rnd)

That's not possible, because we don't want to add external dependencies
to commons-lang.


> Moreover, the default RNG should be a good one, i.e. not
> "java.util.Random".

Using java.util.Random by default is good enough, and it's consistent
with Collections.shuffle():

http://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#shuffle-java.util.List-

Emmanuel Bourg

