# [statistics][descriptive] Classes or static methods for common descriptive statistics?

6 messages

## [statistics][descriptive] Classes or static methods for common descriptive statistics?

The previous commons-math interface for descriptive statistics used a paradigm of constructing classes for various statistical functions and calling evaluate(). Example:

    Mean mean = new Mean();
    double mn = mean.evaluate(double[])

I wrote this type of code all through grad school and always found it unnecessarily bulky. To me these summary statistics are classic use cases for static methods:

    double mean = Mean.evaluate(double[])

I don't have any particular problem with the evaluate() syntax.

I looked over the old Math 4 API to see if there were any benefits to the previous class-oriented approach that we might not want to lose. But I don't think there were; the functionality outside of evaluate() is minimal.

Finally, we should consider whether we really need a separate class for each statistic at all. Do we want to call:

    Mean.evaluate()

or

    SummaryStats.mean()

or maybe

    Stats.mean() ?

The last being nice and compact.

Let's make a decision so our esteemed mentee Virendra knows in what direction to take his work this summer. :)
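To make the two call styles concrete, here is a minimal sketch that puts them side by side. The `Mean` class below is a simplified stand-in written for this illustration, not the actual commons-math class, and `Stats` is a hypothetical facade that borrows its name from the `Stats.mean()` suggestion above. Returning NaN for an empty array is also an assumption.

```java
import java.util.Arrays;

public class DescriptiveApiSketch {

    /** Instance style: construct an object, then call evaluate(). Simplified stand-in. */
    static final class Mean {
        double evaluate(double[] values) {
            return Stats.mean(values);
        }
    }

    /** Static style: a hypothetical compact facade with one method per statistic. */
    static final class Stats {
        private Stats() {
            // No instances; the class only groups static methods.
        }

        static double mean(double[] values) {
            if (values.length == 0) {
                return Double.NaN; // assumed convention for empty input
            }
            return Arrays.stream(values).sum() / values.length;
        }
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0};

        // Two statements and an object allocation.
        Mean mean = new Mean();
        double viaInstance = mean.evaluate(data);

        // One expression, no state.
        double viaStatic = Stats.mean(data);

        System.out.println(viaInstance + " " + viaStatic);
    }
}
```

Both calls return the same value; the difference is only in how much ceremony the caller has to write.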

## Re: [statistics][descriptive] Classes or static methods for common descriptive statistics?

> On 28 May 2019, at 18:09, Eric Barnhill <[hidden email]> wrote:
>
> The previous commons-math interface for descriptive statistics used a
> paradigm of constructing classes for various statistical functions and
> calling evaluate(). Example
>
>     Mean mean = new Mean();
>     double mn = mean.evaluate(double[])
>
> I wrote this type of code all through grad school and always found it
> unnecessarily bulky. To me these summary statistics are classic use cases
> for static methods:
>
>     double mean = Mean.evaluate(double[])
>
> I don't have any particular problem with the evaluate() syntax.
>
> I looked over the old Math 4 API to see if there were any benefits to the
> previous class-oriented approach that we might not want to lose. But I
> don't think there were, the functionality outside of evaluate() is minimal.

A quick check shows that evaluate() comes from UnivariateStatistic. This has some more methods that add little to an instance view of the computation:

    double evaluate(double[] values) throws MathIllegalArgumentException;
    double evaluate(double[] values, int begin, int length) throws MathIllegalArgumentException;
    UnivariateStatistic copy();

However, it is extended by StorelessUnivariateStatistic, which adds methods to update the statistic:

    void increment(double d);
    void incrementAll(double[] values) throws MathIllegalArgumentException;
    void incrementAll(double[] values, int start, int length) throws MathIllegalArgumentException;
    double getResult();
    long getN();
    void clear();
    StorelessUnivariateStatistic copy();

This type of functionality would be lost with static methods. If you are moving to a functional-interface pattern for each statistic, then you will lose the other functionality that instance state makes possible, namely updating with more values or combining instances.

So this is a question of whether updating a statistic is required after the first computation.

Will there be an alternative in the library for a map-reduce type operation, using instances that can be combined using Stream.collect (here the DoubleStream variant)?

    <R> R collect(Supplier<R> supplier,
                  ObjDoubleConsumer<R> accumulator,
                  BiConsumer<R, R> combiner);

Here R would be Mean:

    double mean = Arrays.stream(new double[1000]).collect(Mean::new, Mean::add, Mean::add).getMean();

with:

    void add(double);
    void add(Mean);
    double getMean();

(Untested code)

> Finally we should consider whether we really need a separate class for each
> statistic at all. Do we want to call:
>
>     Mean.evaluate()
>
> or
>
>     SummaryStats.mean()
>
> or maybe
>
>     Stats.mean() ?
>
> The last being nice and compact.
>
> Let's make a decision so our esteemed mentee Virendra knows in what
> direction to take his work this summer. :)
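The combinable-instance idea in this reply can be sketched as a small runnable class. This is only an illustration under stated assumptions: `Mean` here is a hypothetical accumulator using the three method names proposed in the message (`add(double)`, `add(Mean)`, `getMean()`), it keeps a plain running sum rather than a numerically careful update, and the NaN result for zero values is an assumed convention. The only library call relied on is `DoubleStream.collect`.

```java
import java.util.Arrays;

public class CombinableMeanSketch {

    /** Hypothetical mutable accumulator; not an existing library class. */
    static final class Mean {
        private long n;
        private double sum;

        /** Accumulator step: fold one value into this instance. */
        void add(double value) {
            n++;
            sum += value;
        }

        /** Combiner step: merge another partial result into this instance. */
        void add(Mean other) {
            n += other.n;
            sum += other.sum;
        }

        double getMean() {
            return n == 0 ? Double.NaN : sum / n; // assumed convention for no data
        }

        long getN() {
            return n;
        }
    }

    public static void main(String[] args) {
        double[] data = {2.0, 4.0, 6.0, 8.0};

        // Map-reduce style: the stream creates, fills, and (if parallel) merges instances.
        Mean streamed = Arrays.stream(data).collect(Mean::new, Mean::add, Mean::add);

        // Incremental style: the state survives, so more values can arrive later.
        Mean left = new Mean();
        left.add(2.0);
        left.add(4.0);
        Mean right = new Mean();
        right.add(6.0);
        right.add(8.0);
        left.add(right);

        System.out.println(streamed.getMean() + " " + left.getMean());
    }
}
```

The second half of `main` shows what a purely static `evaluate(double[])` cannot offer: partial results computed separately can be merged without revisiting the original data.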

## Re: [statistics][descriptive] Classes or static methods for common descriptive statistics?
