[commons-statistics] STATISTICS-7 discussion

Eric Barnhill
Our ongoing discussion with potential mentees is being moved here as
suggested by Gilles.

Gilles commented on STATISTICS-7:
---------------------------------

> current "math-linear" will be ported to "Commons Linear" in the future?


Perhaps; we'd need expert advice on how to design a modern implementation
of matrix algebra (?).

In the meantime, it may be worth exploring the implications of having a
very focused {{commons-numbers-matrix}} module in "Commons Numbers".

I also recommend checking out EJML, which appears to be well maintained
and probably has more expertise behind it than we would be able to bring
here. Like JTransforms, its performance appears to be best in class, and
it is appealingly encapsulated with no mission creep.



> just use the current library temporarily for now


I'd rather not, as it will perpetuate the impression that "Commons Math" is
still supported.  A new major version of CM should be released (with
"legacy" codes) that will depend on "Commons Statistics".

I agree, we do not want these libraries depending on commons-math.


> "math-util"


Anything in there that is still useful is a candidate for "Commons
Numbers".  Did you have a look at what's there already?


It is worth continuing the discussion about these Utils and utils-type
classes. They are often antipatterns that fall between the stools of
object encapsulation and functional programming. The name MathUtils in
particular says nothing about the assorted functionality in that class, all
of which probably has a better home somewhere else.

Someone else in our discussion mentioned MathArrays. Most of its
functionality should now be handled by streams, for example, and the
current algorithmic approach of most of MathArrays should be discouraged.
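
To make the streams point concrete, here is a minimal sketch of a few
array operations of the kind MathArrays provides (scaling, element-wise
addition, Euclidean distance, an ordering check) written with the standard
java.util.stream API. The class name StreamArrayOps and its method names
are made up for illustration; they are not part of Commons Math, Commons
Numbers, or Commons Statistics.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

/** Hypothetical helper: stream-based versions of common array operations. */
final class StreamArrayOps {
    private StreamArrayOps() {}

    /** Returns a new array with every element multiplied by factor. */
    static double[] scale(double factor, double[] values) {
        return Arrays.stream(values).map(v -> factor * v).toArray();
    }

    /** Returns the element-wise sum of two arrays of equal length. */
    static double[] add(double[] a, double[] b) {
        checkSameLength(a, b);
        return IntStream.range(0, a.length)
                .mapToDouble(i -> a[i] + b[i])
                .toArray();
    }

    /** Returns the Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        checkSameLength(a, b);
        return Math.sqrt(IntStream.range(0, a.length)
                .mapToDouble(i -> (a[i] - b[i]) * (a[i] - b[i]))
                .sum());
    }

    /** Returns true if each element is strictly greater than the previous one. */
    static boolean isStrictlyIncreasing(double[] values) {
        return IntStream.range(1, values.length)
                .allMatch(i -> values[i - 1] < values[i]);
    }

    private static void checkSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0, 3.0};
        double[] b = {4.0, 5.0, 6.0};
        System.out.println(Arrays.toString(scale(2.0, a)));
        System.out.println(Arrays.toString(add(a, b)));
        System.out.println(distance(a, b));
        System.out.println(isStrictlyIncreasing(a));
    }
}
```

Each operation is a single pipeline with no hand-written index loop, which
is the shift in style suggested above. Whether a stream is fast enough for
a given hot path would still need to be measured.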

RE: [commons-statistics] STATISTICS-7 discussion

Ben Nguyen
Hello,
With the regression library restructuring, am I correct to assume that a priority is to structure it so that adding new tools after the port of the current linear regression (OLS, GLS, SimpleRegression) is as painless as possible?

I’ve seen this approach elsewhere and want to know what you think:
separate the key regression features behind parent abstract classes or interfaces, e.g. Estimators and Residuals (others as needed), which are extended by, for example, OLSEstimators and OLSResiduals. A central handler, e.g. OLSRegression, then ties them together, all within the package regression-linear-ols. What do you think of this preliminary idea?
I would think that adding, say, LogisticRegression (and other types) would be more straightforward as a result, with each regression type having defined behavior and living in a separate package with minimal dependencies.
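
A rough sketch of how that layout could look, to make the idea easier to discuss. The type names Estimators, Residuals, OLSEstimators, OLSResiduals and OLSRegression are the ones suggested above; every method name is an assumption, and the fit is reduced to a single predictor with the textbook closed-form solution so that the example needs no matrix library. This is not an existing Commons Statistics API.

```java
import java.util.Arrays;

/** Hypothetical parent type: the fitted parameters of a regression. */
interface Estimators {
    double[] coefficients();
}

/** Hypothetical parent type: observed minus fitted responses. */
interface Residuals {
    double[] values();

    /** Residual sum of squares, shared by every regression type. */
    default double sumOfSquares() {
        return Arrays.stream(values()).map(r -> r * r).sum();
    }
}

/** Least-squares fit of y = intercept + slope * x (one predictor only). */
final class OLSEstimators implements Estimators {
    private final double intercept;
    private final double slope;

    OLSEstimators(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two or more paired observations");
        }
        double meanX = Arrays.stream(x).average().getAsDouble();
        double meanY = Arrays.stream(y).average().getAsDouble();
        double sxy = 0;
        double sxx = 0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        slope = sxy / sxx;
        intercept = meanY - slope * meanX;
    }

    @Override
    public double[] coefficients() {
        return new double[] {intercept, slope};
    }

    double predict(double x) {
        return intercept + slope * x;
    }
}

/** Residuals of an OLS fit on the data it was fitted to. */
final class OLSResiduals implements Residuals {
    private final double[] values;

    OLSResiduals(OLSEstimators fit, double[] x, double[] y) {
        values = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            values[i] = y[i] - fit.predict(x[i]);
        }
    }

    @Override
    public double[] values() {
        return values.clone();
    }
}

/** Central handler that ties the OLS pieces together. */
final class OLSRegression {
    private final OLSEstimators estimators;
    private final OLSResiduals residuals;

    OLSRegression(double[] x, double[] y) {
        estimators = new OLSEstimators(x, y);
        residuals = new OLSResiduals(estimators, x, y);
    }

    Estimators estimators() {
        return estimators;
    }

    Residuals residuals() {
        return residuals;
    }

    public static void main(String[] args) {
        OLSRegression fit = new OLSRegression(
                new double[] {1, 2, 3, 4}, new double[] {3, 5, 7, 9});
        System.out.println(Arrays.toString(fit.estimators().coefficients()));
        System.out.println(fit.residuals().sumOfSquares());
    }
}
```

Callers only see the Estimators and Residuals types, so a later LogisticRegression handler could return its own implementations of the same two interfaces from its own package without touching the OLS code.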

Thank you
-Ben

From: Eric Barnhill
Sent: Monday, April 1, 2019 11:02 AM
To: Commons Developers List
Subject: [commons-statistics] STATISTICS-7 discussion


Re: [commons-statistics] STATISTICS-7 discussion

Eric Barnhill
Estimators and Residuals interfaces. I'd never thought of that. I like it!

I have read your draft proposal and I will make some comments over there,
shortly.



On Mon, Apr 1, 2019 at 5:33 PM Ben Nguyen <[hidden email]> wrote:


RE: [commons-statistics] STATISTICS-7 discussion

Ben Nguyen
Hello Mr. Eric Barnhill,
I have not submitted my draft proposal yet; you must have read someone else’s. I will submit mine later today or tomorrow, with more details about this approach.
Thanks,
-Ben

From: Eric Barnhill
Sent: Tuesday, April 2, 2019 3:18 PM
To: Commons Developers List
Subject: Re: [commons-statistics] STATISTICS-7 discussion


Re: [commons-statistics] STATISTICS-7 discussion

Eric Barnhill
Sorry, you are right; I am reading Salman's. Looking forward to reading
yours as well.

On Tue, Apr 2, 2019 at 1:27 PM Ben Nguyen <[hidden email]> wrote:
