[Math] How fast is fast enough?

Hi.

Here is a micro-benchmark report (performed with "PerfTestUtils"):
-----
nextInt() (calls per timed block: 2000000, timed blocks: 100, time unit: ms)
                        name  time/call  std dev  total time  ratio    cv   difference
o.a.c.m.r.JDKRandomGenerator  1.088e-05  2.8e-06  2.1761e+03  1.000  0.26   0.0000e+00
   o.a.c.m.r.MersenneTwister  1.024e-05  1.5e-06  2.0471e+03  0.941  0.15  -1.2900e+02
          o.a.c.m.r.Well512a  1.193e-05  4.4e-07  2.3864e+03  1.097  0.04   2.1032e+02
         o.a.c.m.r.Well1024a  1.348e-05  1.9e-06  2.6955e+03  1.239  0.14   5.1945e+02
        o.a.c.m.r.Well19937a  1.495e-05  2.1e-06  2.9906e+03  1.374  0.14   8.1451e+02
        o.a.c.m.r.Well19937c  1.577e-05  8.8e-07  3.1542e+03  1.450  0.06   9.7816e+02
        o.a.c.m.r.Well44497a  1.918e-05  1.4e-06  3.8363e+03  1.763  0.08   1.6602e+03
        o.a.c.m.r.Well44497b  1.953e-05  2.8e-06  3.9062e+03  1.795  0.14   1.7301e+03
       o.a.c.m.r.ISAACRandom  1.169e-05  1.9e-06  2.3375e+03  1.074  0.16   1.6139e+02
-----
where "cv" is the ratio of the 3rd to the 2nd column.

Questions are:
* How meaningful are micro-benchmarks when the timed operation has a very
  small duration (wrt e.g. the duration of other machine instructions that
  are required to perform them)?
* In a given environment (HW, OS, JVM), is there a lower limit (absolute
  duration) below which anything will be deemed good enough?
* Can a library like CM admit a trade-off between ultimate performance and
  good design?
  IOW, is there an acceptable overhead in exchange for other qualities
  (clarity, non-redundancy, extensibility, etc.)?
* Does ultimate performance for the base functionality (generation of a
  random number) trump any consideration of use-cases that would need an
  extension (of the base functionality, such as computation to match another
  distribution) that will unavoidably degrade the performance (hence the
  micro-benchmark will be completely misleading for those users)?
* What are usages of the CM RNGs?
  Do those use-cases strictly forbid "losing" a dozen milliseconds per
  million calls?
  IOW, would those users for which such a difference matters use CM at all?

Thanks,
Gilles
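For readers who want to see how the columns of such a report relate to each other, here is a minimal sketch of a timed-block benchmark in plain Java. It is not the PerfTestUtils implementation: the class TimedBlocks, its methods, and the choice of a sample standard deviation across blocks are assumptions made for illustration, and java.util.Random and java.util.SplittableRandom stand in for the CM generators so that the code has no dependency. The "ratio" and "difference" definitions follow what the table shows (time/call relative to the first row, and total time minus the first row's total time).

```java
import java.util.Random;
import java.util.SplittableRandom;
import java.util.function.IntSupplier;

/** Sketch of a timed-block micro-benchmark; NOT the PerfTestUtils code. */
final class TimedBlocks {

    /** One report row; all times are in milliseconds. */
    static final class Summary {
        final double timePerCall; // mean time of a single call
        final double stdDev;      // spread of the per-call time across blocks
        final double totalTime;   // sum of all block times
        final double cv;          // stdDev / timePerCall

        Summary(double timePerCall, double stdDev, double totalTime) {
            this.timePerCall = timePerCall;
            this.stdDev = stdDev;
            this.totalTime = totalTime;
            this.cv = stdDev / timePerCall;
        }
    }

    /** Written after each run so the JIT cannot discard the timed loop. */
    static volatile int sink;

    /** Reduces per-block wall times (ms) to the statistics of one row. */
    static Summary summarize(double[] blockTimesMs, int callsPerBlock) {
        int n = blockTimesMs.length;
        double total = 0;
        for (double t : blockTimesMs) {
            total += t;
        }
        double mean = total / n / callsPerBlock;
        double sumSq = 0;
        for (double t : blockTimesMs) {
            double d = t / callsPerBlock - mean;
            sumSq += d * d;
        }
        // Assumption: sample standard deviation (n - 1) across blocks.
        double stdDev = n > 1 ? Math.sqrt(sumSq / (n - 1)) : 0;
        return new Summary(mean, stdDev, total);
    }

    /** "ratio" column: per-call time relative to the reference row. */
    static double ratio(Summary s, Summary ref) {
        return s.timePerCall / ref.timePerCall;
    }

    /** "difference" column: total time minus the reference total time. */
    static double difference(Summary s, Summary ref) {
        return s.totalTime - ref.totalTime;
    }

    /** Extra milliseconds spent per million calls, relative to the reference. */
    static double extraMsPerMillionCalls(Summary s, Summary ref) {
        return (s.timePerCall - ref.timePerCall) * 1e6;
    }

    /** Times {@code blocks} blocks of {@code callsPerBlock} calls each, in ms. */
    static double[] timeBlocks(IntSupplier gen, int blocks, int callsPerBlock) {
        double[] times = new double[blocks];
        int acc = 0;
        for (int b = 0; b < blocks; b++) {
            long start = System.nanoTime();
            for (int i = 0; i < callsPerBlock; i++) {
                acc ^= gen.getAsInt();
            }
            times[b] = (System.nanoTime() - start) / 1e6;
        }
        sink = acc;
        return times;
    }

    public static void main(String[] args) {
        int blocks = 20;
        int calls = 2_000_000;
        Random jdk = new Random(42L);
        SplittableRandom split = new SplittableRandom(42L);
        // Untimed warm-up so that JIT compilation is mostly done beforehand.
        timeBlocks(jdk::nextInt, 5, calls);
        timeBlocks(split::nextInt, 5, calls);
        Summary ref = summarize(timeBlocks(jdk::nextInt, blocks, calls), calls);
        Summary alt = summarize(timeBlocks(split::nextInt, blocks, calls), calls);
        System.out.printf("%-16s %.3e %.1e %.4e %.3f %.2f %.4e%n", "Random",
                ref.timePerCall, ref.stdDev, ref.totalTime, 1.0, ref.cv, 0.0);
        System.out.printf("%-16s %.3e %.1e %.4e %.3f %.2f %.4e%n", "SplittableRandom",
                alt.timePerCall, alt.stdDev, alt.totalTime, ratio(alt, ref), alt.cv,
                difference(alt, ref));
    }
}
```

Applied to the figures in the report, the slowest generator (Well44497b) costs (1.953e-05 - 1.088e-05) ms x 10^6, i.e. about 8.65 ms per million calls more than JDKRandomGenerator, which is the order of magnitude the last question refers to. A hand-rolled loop like this one remains exposed to JIT and timer effects, so its numbers should be read with caution.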
Re: [Math] How fast is fast enough?

On 2/4/16 3:59 PM, Gilles wrote:
> Hi.
>
> Here is a micro-benchmark report (performed with "PerfTestUtils"):
> -----
> nextInt() (calls per timed block: 2000000, timed blocks: 100, time unit: ms)
>                         name  time/call  std dev  total time  ratio    cv   difference
> o.a.c.m.r.JDKRandomGenerator  1.088e-05  2.8e-06  2.1761e+03  1.000  0.26   0.0000e+00
>    o.a.c.m.r.MersenneTwister  1.024e-05  1.5e-06  2.0471e+03  0.941  0.15  -1.2900e+02
>           o.a.c.m.r.Well512a  1.193e-05  4.4e-07  2.3864e+03  1.097  0.04   2.1032e+02
>          o.a.c.m.r.Well1024a  1.348e-05  1.9e-06  2.6955e+03  1.239  0.14   5.1945e+02
>         o.a.c.m.r.Well19937a  1.495e-05  2.1e-06  2.9906e+03  1.374  0.14   8.1451e+02
>         o.a.c.m.r.Well19937c  1.577e-05  8.8e-07  3.1542e+03  1.450  0.06   9.7816e+02
>         o.a.c.m.r.Well44497a  1.918e-05  1.4e-06  3.8363e+03  1.763  0.08   1.6602e+03
>         o.a.c.m.r.Well44497b  1.953e-05  2.8e-06  3.9062e+03  1.795  0.14   1.7301e+03
>        o.a.c.m.r.ISAACRandom  1.169e-05  1.9e-06  2.3375e+03  1.074  0.16   1.6139e+02
> -----
> where "cv" is the ratio of the 3rd to the 2nd column.
>
> Questions are:
> * How meaningful are micro-benchmarks when the timed operation has a very
>   small duration (wrt e.g. the duration of other machine instructions that
>   are required to perform them)?

It is harder to get good benchmarks for shorter-duration activities, but not impossible.  One thing that it would be good to do is to compare these results with JMH [1].

> * In a given environment (HW, OS, JVM), is there a lower limit (absolute
>   duration) below which anything will be deemed good enough?

That depends completely on the application.

> * Can a library like CM admit a trade-off between ultimate performance and
>   good design?
>   IOW, is there an acceptable overhead in exchange for other qualities
>   (clarity, non-redundancy, extensibility, etc.)?

That is too general a question to be meaningful.  We need to look at specific cases.  What exactly are you proposing?

> * Does ultimate performance for the base functionality (generation of a
>   random number) trump any consideration of use-cases that would need an
>   extension (of the base functionality, such as computation to match another
>   distribution) that will unavoidably degrade the performance (hence the
>   micro-benchmark will be completely misleading for those users)?

Again, this is vague and the answer depends on what exactly you are talking about.  Significantly damaging performance of PRNG implementations is a bad idea, unless there are actual practical use cases you can point to that whatever changes you are proposing enable.

> * What are usages of the CM RNGs?
>   Do those use-cases strictly forbid "losing" a dozen milliseconds per
>   million calls?

There are many different use cases.  My own applications use them in simulations to generate random deviates, to generate random hex strings as identifiers and in stochastic algorithms like some of our internal uses.  The last case is definitely sensitive to PRNG performance.

Phil

[1] http://openjdk.java.net/projects/code-tools/jmh/

>   IOW, would those users for which such a difference matters use CM at all?
>
> Thanks,
> Gilles
Re: [Math] How fast is fast enough?

1. I don't understand the source of urgency here. If someone has a new algorithm they want to release to the general public, they can put it on github. It does not matter very much if a method is sitting in Apache (commons) Math on any particular schedule. It's not like adding a feature to some platform that has to be integrated to be useful. Of course, the 'Apache (commons) Math mark of quality' won't mean anything if there is a sudden shift to 'movement' as a guiding principle, so ironically the victory would be Pyrrhic.

2. JMH is the current gold standard of microbenchmark measurement. It gives meaningful results. If someone wants to claim that they have new code that has some particular performance on a micro scale, they should be eager to contribute a JMH benchmark; it's not much work.

3. Apache Math can only come into existence if the board is convinced that there is a community prepared to operate according to ASF principles. That means finding some way to collegially resolve disputes. There's always a trip to the incubator available.