[Math] Commons Math (r)evolution

Gilles Sadowski
Hello.

Commons Math as it was in the last official release (v3.6.1) and
consequently as it is in the current development branch has
become unmaintainable.

This conclusion is unavoidable when looking at the following:
  1. codebase statistics (as of today):
     * src/main/java       90834 lines of Java code (882 files)
     * src/test/java       95595 lines of Java code (552 files)
     * src/userguide/java   3493 lines of Java code (19 files)
  2. number of posts on the "dev" ML (related to [Math]) in the
     last 2 months:
     * Gilles            74
     * Artem Barger      20
     * sebb              15
     * Rob Tompkins       9
     * Eric Barnhill      7
     * 19 other people   46
  3. development/maintenance activity in the last 4 months:
     * commits by Gilles  133
     * commits by others   12

According to JIRA, among 180 issues currently targeted for the
next major release (v4.0), 139 have been resolved (75 of which
were not in v3.6.1).

So, on the one hand, a lot of work has been done already, but
on the other, a lot remains.
Some issues have been pending for several years, in particular
those that required a major refactoring (e.g. in the packages
"o.a.c.m.linear" and "o.a.c.m.optim").

The remaining work is unlikely to be completed soon since the
fork of Commons Math has drastically reduced the development
capacity.

Moreover, as whole areas of the codebase have become in effect
unsupported, it would be deceiving to release a version 4.0 of
Commons Math that would include all of it.

Of course, anyone who wishes to maintain some of these codes
(answer user questions, fix bugs, create enhancements, etc.)
is most welcome to step forward.

But I'm not optimistic that the necessary level of support can
be achieved in the near future for all the codebase.
Waiting for that to happen would entail that code that could
be in a releasable state soon will be on hold for an indefinite
time.

The purpose of this post is to initiate a discussion about
splitting Commons Math, along the following lines:
1. Identify independent functionality that can be maintained
    even by a small (even a 1-person) team within Commons and
    make it a new component.
2. Identify functionality that cannot be maintained anymore
    inside Commons and try to reach out to users of this
    functionality, asking whether they would be willing to
    give a helping hand.
    If there is sufficient interest, and a new development
    team can form, that code would then be transferred to the
    Apache "incubator".

There are numerous advantages to having separate components
rather than a monolithic library:
  * Limited and well-defined scope
  * Immediate visibility of purpose
  * Lower barrier to entry
  * Independent development policy
  * Homogeneous codebase
  * Independent release schedule
  * Foster a new community around the component

According to the most recent development activity, the likely
candidates for becoming a new component are:
  * Pseudo-random number generators (package "o.a.c.m.rng")
  * Complex numbers (package "o.a.c.m.complex")
  * Clustering algorithms (package "o.a.c.m.ml.clustering")

Other potential candidates:
  * "FastMath" (which should be renamed to something like
    "AccurateMath", cf. reports about slowness of some of
    the functions)
  * Special functions (package "o.a.c.m.special")
  * Selected "utilities" (package "o.a.c.m.util")


Best regards,
Gilles


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [Math] Commons Math (r)evolution

Artem Barger
On Mon, Jun 6, 2016 at 3:49 AM, Gilles <[hidden email]> wrote:

>
> According to JIRA, among 180 issues currently targeted for the
> next major release (v4.0), 139 have been resolved (75 of which
> were not in v3.6.1).
>

Huh, that's above 75% completion :)


> So, on the one hand, a lot of work has been done already, but
> on the other, a lot remains.
> Some issues have been pending for several years, in particular
> those that required a major refactoring (e.g. in the packages
> "o.a.c.m.linear" and "o.a.c.m.optim").
>
> The remaining work is unlikely to be completed soon since the
> fork of Commons Math has drastically reduced the development
> capacity.
>
> Moreover, as whole areas of the codebase have become in effect
> unsupported, it would be deceiving to release a version 4.0 of
> Commons Math that would include all of it.
>
> Of course, anyone who wishes to maintain some of these codes
> (answer user questions, fix bugs, create enhancements, etc.)
> is most welcome to step forward.
>

I can try to cover some of these and maintain relevant code parts.



>
> But I'm not optimistic that the necessary level of support can
> be achieved in the near future for all the codebase.
> Waiting for that to happen would entail that code that could
> be in a releasable state soon will be on hold for an indefinite
> time.
>

What exactly is missing to provide reasonable support, apart of
course from the people who left?



> The purpose of this post is to initiate a discussion about
> splitting Commons Math, along the following lines:
> 1. Identify independent functionality that can be maintained
>    even by a small (even a 1-person) team within Commons and
>    make it a new component.
> 2. Identify functionality that cannot be maintained anymore
>    inside Commons and try to reach out to users of this
>    functionality, asking whether they would be willing to
>    give a helping hand.
>    If there is sufficient interest, and a new development
>    team can form, that code would then be transferred to the
>    Apache "incubator".
>
> According to the most recent development activity, the likely
> candidates for becoming a new component are:
>  * Pseudo-random numbers generators (package "o.a.c.m.rng")
>  * Complex numbers (package "o.a.c.m.complex")
>  * Clustering algorithms (package "o.a.c.m.ml.clustering")
>
>
I think that the clustering part could be generalized to the ML package
as a whole.

Best regards,
            Artem Barger.

Re: [Math] Commons Math (r)evolution

Eric Barnhill
In reply to this post by Gilles Sadowski
I am not a mathematician so I would not be able to play a particularly
catholic role in commons-math. But, I am always delighted when my research
needs allow me to spin off contributions into the code base.

I work with complex-valued 3 to 6-dimensional image volumes. So I am happy
to maintain code involving complex numbers first of all, as well as
investigate their integration with Octave and ImgLib.

I am also interested in code for array-based math operations, which is
overwhelmingly how I compute. I would be happy to maintain that code and it
does seem that now and again, suggestions for how to refactor it come through
JIRA. I have my own home-brewed libraries for syntactically convenient
array-wise operations that may also be of interest once everyone is happy
with the current state of the code base.

Eric




Re: [Math] Commons Math (r)evolution

Benedikt Ritter-4
In reply to this post by Gilles Sadowski
Hello Gilles,

I think ApacheCon Europe would be a good opportunity to spread the word
about this.

Benedikt


Re: [Math] Commons Math (r)evolution

Ralph Goers
Although I am not involved in Math I find myself wondering if we shouldn’t just step back and take a breath before rushing into anything. It may be that the approach being recommended is the correct one, but it also may be that there are other people waiting in the wings that we are unaware of.

Ralph


Re: [Math] Commons Math (r)evolution

Gilles Sadowski
On Mon, 6 Jun 2016 11:39:49 -0700, Ralph Goers wrote:
> Although I am not involved in Math I find myself wondering if we
> shouldn’t just step back and take a breath before rushing into
> anything.

There isn't any rush: modularization (like many other things, e.g.
no longer sticking to Java 5) has been in the pipeline for
months (years maybe).

> It may be that the approach being recommended is the correct
> one, but it also may be that there are other people waiting in the
> wings that we are unaware of.

How do you suggest that we reach out to them?
[In the most recent debates about the development (e.g. what Java
version to support for the next major release), nobody showed up on
the "dev" ML outside the usual suspects (the regular committers and
one or two other people who had been participating in the discussions
then, but don't seem to be interested anymore).]

Moreover, I am willing to personally experiment with the RNG classes
on which I've been working a lot since last December.
That code is completely new (never released), so nobody can possibly
come up to complain that it should have stayed inside Commons Math.


Gilles


Re: [Math] Commons Math (r)evolution

Gilles Sadowski
In reply to this post by Benedikt Ritter-4
On Mon, 06 Jun 2016 17:57:53 +0000, Benedikt Ritter wrote:
> Hello Gilles,
>
> I think ApacheCon Europe would be a good opportunity to spread the
> word about this.

I hope that by this time, if you want to say a few words about Commons
Math, you'll have more positive things to mention...  And by this I
mean, of course, that we'll have already succeeded in rescuing some of
the code and turned it into something readily useful.

Gilles


Re: [Math] Commons Math (r)evolution

Gilles Sadowski
In reply to this post by Artem Barger
On Mon, 6 Jun 2016 10:10:17 +0300, Artem Barger wrote:

> On Mon, Jun 6, 2016 at 3:49 AM, Gilles <[hidden email]>
> wrote:
>
>>
>> According to JIRA, among 180 issues currently targeted for the
>> next major release (v4.0), 139 have been resolved (75 of which
>> were not in v3.6.1).
>>
>
> Huh, that's above 75% completion :)

Everybody is welcome to review the "open" issues and comment
about them.

Those about which there is no comment are likely to point to
unsupported code...

>> So, on the one hand, a lot of work has been done already, but
>> on the other, a lot remains.
>> Some issues have been pending for several years, in particular
>> those that required a major refactoring (e.g. in the packages
>> "o.a.c.m.linear" and "o.a.c.m.optim").
>>
>> The remaining work is unlikely to be completed soon since the
>> fork of Commons Math has drastically reduced the development
>> capacity.
>>
>> Moreover, as whole areas of the codebase have become in effect
>> unsupported, it would be deceiving to release a version 4.0 of
>> Commons Math that would include all of it.
>>
>> Of course, anyone who wishes to maintain some of these codes
>> (answer user questions, fix bugs, create enhancements, etc.)
>> is most welcome to step forward.
>>
>
> I can try to cover some of these and maintain relevant code parts.

Which ones?

>>
>> But I'm not optimistic that the necessary level of support can
>> be achieved in the near future for all the codebase.
>> Waiting for that to happen would entail that code that could
>> be in a releasable state soon will be on hold for an indefinite
>> time.
>>
> What exactly is missing to provide reasonable support, apart of
> course from the people who left?
>

IMO, a maintainer is someone who is able to respond to user
questions and to figure out whether a bug report is valid.

>> The purpose of this post is to initiate a discussion about
>> splitting Commons Math, along the following lines:
>> 1. Identify independent functionality that can be maintained
>>    even by a small (even a 1-person) team within Commons and
>>    make it a new component.
>> 2. Identify functionality that cannot be maintained anymore
>>    inside Commons and try to reach out to users of this
>>    functionality, asking whether they would be willing to
>>    give a helping hand.
>>    If there is sufficient interest, and a new development
>>    team can form, that code would then be transferred to the
>>    Apache "incubator".
>>
>> According to the most recent development activity, the likely
>> candidates for becoming a new component are:
>>  * Pseudo-random numbers generators (package "o.a.c.m.rng")
>>  * Complex numbers (package "o.a.c.m.complex")
>>  * Clustering algorithms (package "o.a.c.m.ml.clustering")
>>
>>
> I think that the clustering part could be generalized to the ML package
> as a whole.

Fine I guess, since currently the "neuralnet" sub-package's only
concrete functionality is also a clustering method.

Regards,
Gilles

>
> Best regards,
>             Artem Barger.



Re: [Math] Commons Math (r)evolution

Gilles Sadowski
In reply to this post by Eric Barnhill
On Mon, 6 Jun 2016 10:31:28 +0200, Eric Barnhill wrote:
> I am not a mathematician so I would not be able to play a
> particularly catholic role in commons-math.

I don't think that the majority of contributors would have
qualified themselves as "mathematician".

In the current situation, it would be useful to lay out
which options we have for the existing codebase.

> But, I am always delighted when my research
> needs allow me to spin off contributions into the code base.
>
>
> I work with complex-valued 3 to 6-dimensional image volumes. So I am
> happy to maintain code involving complex numbers first of all, as well
> as investigate their integration with Octave and ImgLib.

Which parts of Commons Math would be dependencies for this type
of application?
Which algorithms of your applications would be generic enough to
warrant becoming part of a toolbox based on the "Complex" class?

> I am also interested in code for array-based math operations, which is
> overwhelmingly how I compute. I would be happy to maintain that code,
> and it does seem that now and again, suggestions for how to refactor it
> come through JIRA.

Do you mean Java primitive "array", or the array concept as
implemented in CM's "RealVector"?

> I have my own home brewed libraries for syntactically convenient
> array wise operations that may also be of interest

How does this compare with the Java 8 stream API?
Does it leverage multi-threading?

> once everyone is happy
> with the current state of the code base.

My view is that the current code base cannot be released unless
it is split into supported components; and for those, I propose to
create dedicated Commons "components".

Do you agree?

Regards,
Gilles

> [...]



Re: [Math] Commons Math (r)evolution

Eric Barnhill
On Wed, Jun 8, 2016 at 12:08 AM, Gilles <[hidden email]>
wrote:


> Which parts of Commons Math would be dependencies for this type
> of applications?
> Which algorithms of your applications would be generic enough to
> warrant becoming part of a toolbox based on the "Complex" class?
>

It seems to me that the priority here is packaging up Complex into a
semiautonomous module. I'm happy to take that on. What I might add to it
later strikes me as something to address after the transition. However the
question of conforming some of the methods and data structures with C99
probably should be addressed as part of the transition (which is to say,
right now).


>> I am also interested in code for array-based math operations, which is
>> overwhelmingly how I compute. I would be happy to maintain that code,
>> and it does seem that now and again, suggestions for how to refactor it
>> come through JIRA.
>>
>
> Do you mean Java primitive "array", or the array concept like
> implemented in CM's "RealVector"?
>

Exactly. I am not sure I even have a full grasp on how primitive arrays,
MathArray objects, RealVectors, and objects like FieldMatrix all
inter-relate. Is it possible that together, these issues form a coherent
module of their own? Or could be re-factored in such a way that they do
form a sensible module, grounded in a very generic class like FieldElement?



>> once everyone is happy
>> with the current state of the code base.
>>
>
> My view is that the current code base cannot be released unless
> it is split into supported components; and for those, I propose to
> create dedicated Commons "components".
>
> Do you agree?
>

I am just a noob with no sense of the history but it's okay by me.

I also nominate myself to look after the interpolation libraries. I have
used them a lot and I really like the structure. It is a pleasure to use
them to compare interpolation methods for example. I'll be pursuing
nonsmooth and L1-related types of interpolation in the very near future. So
that seems like a straightforward expansion of the library I could
contribute.

Re: [Math] Commons Math (r)evolution

Artem Barger
In reply to this post by Gilles Sadowski
On Wed, Jun 8, 2016 at 12:25 AM, Gilles <[hidden email]>
wrote:

>>> According to JIRA, among 180 issues currently targeted for the
>>> next major release (v4.0), 139 have been resolved (75 of which
>>> were not in v3.6.1).
>>>
>> Huh, it's above 75% completion :)
>>
>
> Everybody is welcome to review the "open" issues and comment
> about them.
>
>

I guess someone needs to prioritize them according to their importance for
the release.


>>> Of course, anyone who wishes to maintain some of these codes
>>> (answer user questions, fix bugs, create enhancements, etc.)
>>> is most welcome to step forward.
>>>
>> I can try to cover some of these and maintain the relevant code parts.
>>
>
> Which ones?
>

I will look into JIRA and provide the issue numbers, and of course I
can cover and assist with the ML part, and clustering in particular.




>
> IMO, a maintainer is someone who is able to respond to user
> questions and to figure out whether a bug report is valid.
>

​I'm subscribed for mailing list for quite a while and haven't seen a lot of
questions coming from user​s.



>> I think that the clustering part could be generalized to the ML package
>> as a whole.
>>
>
> Fine I guess, since currently the "neuralnet" sub-package's only
> concrete functionality is also a clustering method.
>
>

I was also wondering whether the ML package is meant to be extended in the
future with additional functionality, since I think I can provide my code
for several classification algorithms.

[Math] About a new component based around "Complex" (Was: Commons Math (r)evolution)

Gilles Sadowski
In reply to this post by Eric Barnhill
Hi.

On Wed, 8 Jun 2016 09:30:24 +0200, Eric Barnhill wrote:

> On Wed, Jun 8, 2016 at 12:08 AM, Gilles
> <[hidden email]>
> wrote:
>
>
>> Which parts of Commons Math would be dependencies for this type
>> of applications?
>> Which algorithms of your applications would be generic enough to
>> warrant becoming part of a toolbox based on the "Complex" class?
>>
>
> It seems to me that the priority here is packaging up Complex into a
> semiautonomous module.

Why "semi"?

> I'm happy to take that on.

Thanks a lot.

> What I might add to it
> later strikes me as something to address after the transition.

Sure.

But it might help provide guidelines for desirable design changes
or choices.

> However the
> question of conforming some of the methods and data structures with
> C99
> probably should be addressed as part of the transition (which is to
> say,
> right now).

It's certainly desirable to do it before the initial release of this
new would-be component.

But the first thing is to decouple whatever would go into the new
module from everything that wouldn't.

Probably to be included: complex solvers (currently in
"o.a.c.m.analysis.solvers").

>>> I am also interested in code for array-based math operations, which
>>> is overwhelmingly how I compute. I would be happy to maintain that
>>> code, and it does seem that now and again, suggestions for how to
>>> refactor it come through JIRA.
>>>
>>
>> Do you mean Java primitive "array", or the array concept like
>> implemented in CM's "RealVector"?
>>
>
> Exactly. I am not sure I even have a full grasp on how primitive
> arrays,
> MathArray objects, RealVectors, and objects like FieldMatrix all
> inter-relate. Is it possible that together, these issues form a
> coherent
> module of their own?

I doubt it.
As they are currently, "MathArrays" and "RealVector" are at opposite
ends of the design spectrum: the former mostly contains C-like
functions operating on primitive arrays, the latter is OO.
The focus of "MathArrays" was speed (and also to implement methods
that were available in Java 6).
The API of "RealVector" is a mix of a "list of numbers", a "single
row or column matrix", and "Cartesian coordinates" (IMHO a design
mistake that should not be repeated).

Depending on the expected applications, some choices might be crucial.
E.g.: fixed or variable dimension (impacting immutability/thread-safety)?
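
To make the contrast between the two styles concrete, here is a minimal
sketch. All names in it ("ArrayOps", "ImmutableVector", "ArrayStylesDemo")
are hypothetical and are not CM's actual API; it only assumes a fixed
dimension and illustrates how that choice allows an immutable (hence
thread-safe) OO type, while the C-like helper leaves ownership of the
arrays to the caller.

```java
import java.util.Arrays;

/** Demo entry point showing both styles side by side. */
class ArrayStylesDemo {
    public static void main(String[] args) {
        // C-like style: plain arrays in, plain array out.
        final double[] raw = ArrayOps.add(new double[] {1, 2}, new double[] {3, 4});
        System.out.println(Arrays.toString(raw));

        // OO style: operations return new instances.
        final ImmutableVector v = new ImmutableVector(1, 2).add(new ImmutableVector(3, 4));
        System.out.println(Arrays.toString(v.toArray()));
    }
}

/** C-like style: stateless static functions on primitive arrays (hypothetical helper). */
final class ArrayOps {
    private ArrayOps() {}

    /** Element-by-element sum; the inputs are not modified. */
    static double[] add(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Dimension mismatch: " + a.length + " != " + b.length);
        }
        final double[] sum = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            sum[i] = a[i] + b[i];
        }
        return sum;
    }
}

/** OO style: fixed dimension and immutable (hypothetical class). */
final class ImmutableVector {
    private final double[] data;

    ImmutableVector(double... values) {
        // Defensive copy: later changes to "values" cannot leak into this instance.
        data = values.clone();
    }

    int dimension() {
        return data.length;
    }

    double get(int i) {
        return data[i];
    }

    /** Returns a new vector; "this" is never modified. */
    ImmutableVector add(ImmutableVector other) {
        return new ImmutableVector(ArrayOps.add(data, other.data));
    }

    /** Returns a copy, so that callers cannot alter the internal state. */
    double[] toArray() {
        return data.clone();
    }
}
```

The price of the OO variant in this sketch is the extra copying; whether
that matters depends, again, on the intended applications.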

> Or could be re-factored in such a way that they do
> form a sensible module, grounded in a very generic class like
> FieldElement?

I guess that it depends on the intended applications.
Even if the generalization sounds appealing, I'd be wary of creating a
mathematical framework which nobody would need.
All the more so since this would probably involve inheritance, which is
more and more decried as the weakest point of OO.

Do you have examples in mind?

>>> once everyone is happy
>>> with the current state of the code base.
>>>
>>
>> My view is that the current code base cannot be released unless
>> it is split into supported components; and for those, I propose to
>> create dedicated Commons "components".
>>
>> Do you agree?
>>
>
> I am just a noob with no sense of the history but it's okay by me.
>
> I also nominate myself to look after the interpolation libraries. I
> have
> used them a lot and I really like the structure. It is a pleasure to
> use
> them to compare interpolation methods for example. I'll be pursuing
> nonsmooth and L1-related types of interpolation in the very near
> future. So
> that seems like a straightforward expansion of the library I could
> contribute.

Great.
Another candidate component, I assume.
[If so, in another thread.  Again, the first thing would probably be to
decouple it from the legacy code.]


Regards,
Gilles



Re: [Math] Commons Math (r)evolution

Gilles Sadowski
In reply to this post by Artem Barger
Hi.

On Wed, 8 Jun 2016 23:50:00 +0300, Artem Barger wrote:

> On Wed, Jun 8, 2016 at 12:25 AM, Gilles
> <[hidden email]>
> wrote:
>
>>>> According to JIRA, among 180 issues currently targeted for the
>>>> next major release (v4.0), 139 have been resolved (75 of which
>>>> were not in v3.6.1).
>>>>
>>> Huh, it's above 75% completion :)
>>>
>>
>> Everybody is welcome to review the "open" issues and comment
>> about them.
>>
> I guess someone needs to prioritize them according to their
> importance for the release.

Importance is relative... :-}

IMO, it is important to not release unsupported code.
So the priority would be higher for issues that would be included
in the release of the new Commons components.
Hence the need to figure out what these components will be.

>>>> Of course, anyone who wishes to maintain some of these codes
>>>> (answer user questions, fix bugs, create enhancements, etc.)
>>>> is most welcome to step forward.
>>>>
>>> I can try to cover some of these and maintain the relevant code
>>> parts.
>>>
>>
>> Which ones?
>>
> I will look into JIRA and provide the issue numbers, and of course I
> can cover and assist with the ML part, and clustering in particular.

Thanks.

>>
>> IMO, a maintainer is someone who is able to respond to user
>> questions and to figure out whether a bug report is valid.
>>
>
> I've been subscribed to the mailing list for quite a while and haven't
> seen a lot of questions coming from users.

The "user" ML has always been fairly quiet.
Does it mean that the code is really easy to use?
Or feature-complete (I doubt that)?
Or that there are very few users for the most complex features?

The "dev" ML was usually (much) more active.

The point is that when someone asks a question or proposes a
contribution, there must be someone to answer.

>>> I think that the clustering part could be generalized to the ML
>>> package as a whole.
>>>
>>
>> Fine I guess, since currently the "neuralnet" sub-package's only
>> concrete functionality is also a clustering method.
>>
>>
> I was also wondering whether the ML package is meant to be extended in
> the future

Really there was no plan, or as many plans as there were developers...

Putting all these codes (with different designs, different coding
practices, different intended audiences, different levels of expertise,
etc.) in a single library was not sustainable.

That's why I strongly favour cutting this monolith into pieces
with a limited scope.

> with additional functionality, since I think I can provide my code
> for several classification algorithms.

That sounds nice.
Which algorithms?

Regards,
Gilles



Re: [Math] Commons Math (r)evolution

Jörg Schaible-5
Hi Gilles,

Gilles wrote:

> Hi.
>
> On Wed, 8 Jun 2016 23:50:00 +0300, Artem Barger wrote:
>> On Wed, Jun 8, 2016 at 12:25 AM, Gilles
>> <[hidden email]>
>> wrote:
>>
>>>
>>> According to JIRA, among 180 issues currently targeted for the
>>>>> next major release (v4.0), 139 have been resolved (75 of which
>>>>> were not in v3.6.1).
>>>>>
>>>>>
>>>> ​Huh, it's above of 75% completion :)​
>>>>
>>>
>>> Everybody is welcome to review the "open" issues and comment
>>> about them.
>>>
>>>
>> ​I guess someone need to prioritize them​ according to they
>> importance for
>> release.
>
> Importance is relative... :-}
>
> IMO, it is important to not release unsupported code.

Unit tests *are* a kind of support.

> So the priority would be higher for issues that would be included
> in the release of the new Commons components.
> Hence the need to figure out what these components will be.
>
>>>>> Of course, anyone who wishes to maintain some of these codes
>>>>>
>>>>> (answer user questions, fix bugs, create enhancements, etc.)
>>>>> is most welcome to step forward.
>>>>>
>>>>>
>>>> ​I can try to cover some of these and maintain relevant code
>>>> parts.​
>>>>
>>>
>>> Which ones?
>>>
>> ​I will look into JIRA and provide the issue numbers, and of course I
>> can cover and assist with ML part and particular clustering.​
>
> Thanks.
>
>>>
>>> IMO, a maintainer is someone who is able to respond to user
>>> questions and to figure out whether a bug report is valid.
>>>
>>
>> ​I'm subscribed for mailing list for quite a while and haven't
>> seen a lot of questions coming from user​s.
>
> The "user" ML has always been fairly quiet.
> Does it mean that the code is really easy to use?
> Or feature-complete (I doubt that)?
> Or that there are very few users for the most complex features?
>
> The "dev" ML was usually (much) more active.
>
> The point is that when someone asks a question or propose an
> contribution, there must be someone to answer.

And this is IMHO a wrong assumption. We have a lot of components where the
original authors have left long ago. So the situation is not new.

Math is a specialized library and nobody expects that it is accompanied by
tutorials explaining the theory or developers that act as trainers here on
the lists. Users of special algorithms are supposed to be experts themselves
and should understand what they are doing. Or do you expect that any
arbitrary user can use genetic algorithms or neural network stuff without
the mathematical background?

Everything is fine and can be released as long as the existing code is
verified by unit tests. Otherwise we would have to remove a lot of code
every time we release a component ... or do you expect e.g. that the release
manager of vfs completely understands every one of its providers?

>>>>> ​I think that clustering part could be generalized to ML package
>>>>> as a
>>>> whole.​
>>>>
>>>
>>> Fine I guess, since currently the "neuralnet" sub-package's only
>>> concrete functionality is also a clustering method.
>>>
>>>
>> ​I was also wondering whenever ML package meant to be extended in
>> the future
>
> Really there was no plan, or as many plans as there were developers...
>
> Putting all these codes (with different designs, different coding
> practices, different intended audiences, different levels of expertise,
> etc.) in a single library was not sustainable.
>
> That's why I strongly favour cutting this monolith into pieces
> with a limited scope.

Nobody objects, but if you look at vfs, it is still *one* Apache Commons
component, just with multiple artifacts. All these artifacts are released
*together*. Turning math into a multi-project has nothing to do with your
plans to drop mature code because neither you nor, currently, anyone else
can answer questions about its functionality.

Cheers,
Jörg



Re: [Math] Commons Math (r)evolution

Artem Barger
In reply to this post by Gilles Sadowski
On Thu, Jun 9, 2016 at 1:54 AM, Gilles <[hidden email]>
wrote:

>> I guess someone needs to prioritize them according to their importance
>> for the release.
>>
>
> Importance is relative... :-}
>

Indeed, it's a very objective function; however, someone has to decide
where to focus.



> IMO, it is important to not release unsupported code.
> So the priority would be higher for issues that would be included
> in the release of the new Commons components.
> Hence the need to figure out what these components will be.
>

It is not clear to me: by not releasing unsupported code, do you really
mean excluding already existing parts for which there is nobody capable
of maintaining the functionality and fixing possible bugs?



>>> Which ones?
>>>
>> I will look into JIRA and provide the issue numbers, and of course I
>> can cover and assist with the ML part, and clustering in particular.
>>
>
> Thanks.
>

You are welcome :)
So I looked through some of the open issues and I have a couple of
questions about them:

1. Which "affected versions" do I really need to consider as important
for the next release?
2. I've looked into affected versions 3.5, 3.6, 3.6.1, 3.7 and 4.0, and
overall found something like ~25 open bugs/issues. What about issues
opened against earlier releases?

For example, looking at MATH-1329
<https://issues.apache.org/jira/browse/MATH-1329>, it does not sound really
hard to fix, but I'm not sure whether it was accepted and approved as
something that needs to be done.

Next, for MATH-1315 <https://issues.apache.org/jira/browse/MATH-1315>, it
seems that the reporter has provided a patch and the only thing missing is
a unit test. Would adding unit tests help to get that one resolved?

For MATH-1284 <https://issues.apache.org/jira/browse/MATH-1284>, it is not
clear what the final decision is; according to the comments, it looks
like it can be resolved.

For MATH-1262 <https://issues.apache.org/jira/browse/MATH-1262>, according
to the comments, editing the current Javadoc to explain the limitation
should also settle the case. Am I missing something?

I think I can help to handle MATH-1281
<https://issues.apache.org/jira/browse/MATH-1281>, right after I
finish the refactoring and optimizations for k-means clustering.

These are the issues/tickets where I can help to provide support or a fix.
I have most likely missed something, but I can start with these.


>> with additional functionality, since I think I can provide my code for
>> several classification algorithms.
>>
>
> That sounds nice.
> Which algorithms?
>

Naive Bayes, kNN, Decision Trees, Random Forest. I guess adding these to
the project will require a serious redesign of the ML package.

Re: [Math] Commons Math (r)evolution

Gilles Sadowski
In reply to this post by Jörg Schaible-5
Hello Jörg.

On Thu, 09 Jun 2016 09:43:06 +0200, Jörg Schaible wrote:

> Hi Gilles,
>
> Gilles wrote:
>
>> Hi.
>>
>> On Wed, 8 Jun 2016 23:50:00 +0300, Artem Barger wrote:
>>> On Wed, Jun 8, 2016 at 12:25 AM, Gilles
>>> <[hidden email]>
>>> wrote:
>>>
>>>>
>>>> According to JIRA, among 180 issues currently targeted for the
>>>>>> next major release (v4.0), 139 have been resolved (75 of which
>>>>>> were not in v3.6.1).
>>>>>>
>>>>>>
>>>>> ​Huh, it's above of 75% completion :)​
>>>>>
>>>>
>>>> Everybody is welcome to review the "open" issues and comment
>>>> about them.
>>>>
>>>>
>>> ​I guess someone need to prioritize them​ according to they
>>> importance for
>>> release.
>>
>> Importance is relative... :-}
>>
>> IMO, it is important to not release unsupported code.
>
> Unit test *are* kind of support.

Unit tests are not what I mean by "support".  They only increase the
probability that the code behaves as expected. [And sometimes they do
not because they can be buggy too, as I discovered when refactoring
the "random" package.]

But anyways, my reservations have nothing to do with the functionality
of released code: users who are satisfied with the service provided by
v3.6.1 (or any of the previous versions of CM) have no reason to
upgrade to 4.0.  [By upgrading, all they get is the obligation to
change the "import" statements.]

And we have no reason to release a v4.0 of code that
  1. has not changed
  2. is not supported

>> So the priority would be higher for issues that would be included
>> in the release of the new Commons components.
>> Hence the need to figure out what these components will be.
>>
>>>>>> Of course, anyone who wishes to maintain some of these codes
>>>>>>
>>>>>> (answer user questions, fix bugs, create enhancements, etc.)
>>>>>> is most welcome to step forward.
>>>>>>
>>>>>>
>>>>> ​I can try to cover some of these and maintain relevant code
>>>>> parts.​
>>>>>
>>>>
>>>> Which ones?
>>>>
>>> ​I will look into JIRA and provide the issue numbers, and of course
>>> I
>>> can cover and assist with ML part and particular clustering.​
>>
>> Thanks.
>>
>>>>
>>>> IMO, a maintainer is someone who is able to respond to user
>>>> questions and to figure out whether a bug report is valid.
>>>>
>>>
>>> ​I'm subscribed for mailing list for quite a while and haven't
>>> seen a lot of questions coming from user​s.
>>
>> The "user" ML has always been fairly quiet.
>> Does it mean that the code is really easy to use?
>> Or feature-complete (I doubt that)?
>> Or that there are very few users for the most complex features?
>>
>> The "dev" ML was usually (much) more active.
>>
>> The point is that when someone asks a question or propose an
>> contribution, there must be someone to answer.
>
> And this is IMHO a wrong assumption. We have a lot of components
> where the
> original authors have left long ago. So the situation is not new.

Having no support is bad (IMO).
[It doesn't have to be from the original authors of course.]

> Math is a specialized library and nobody expects that it is
> accompanied by
> tutorials explaining the theory or developers that act as trainers
> here on
> the lists. Users of special algorithms are supposed to be experts
> themselves
> and should understand what they are doing. Or do you expect that any
> arbitrary user can use genetic algorithms or neuronal network stuff
> without
> the mathematical background?

No, I do not expect that.
[Although it is sometimes part of the resolution of a bug report, and
something that gives a sense of "you are welcome here".]

The main point is about real bugs that won't be handled (see below).

> Anything is well and can be released as long as the existing code is
> verified by unit tests. Otherwise we would have to remove a lot of
> code
> every time we release a component ... or do you expect e.g. that the
> release
> manager of vfs understands completely any of its providers?

No, certainly not, since I could RM CM. ;-)

But that's not the point!

_Some_ developer(s) should be able to support whatever is in development.
Otherwise how can it be deemed "in development"?

Just today, two issues were reported on JIRA:
   https://issues.apache.org/jira/browse/MATH-172
   https://issues.apache.org/jira/browse/MATH-1375

They, unfortunately, illustrate my point.

Moreover, what could be true for VFS is not for CM, where there are many,
many different areas that have nothing in common (except perhaps some
ubiquitous very low-level utilities which might be worth their own
component to serve as a, maybe "internal", dependency).

Also, compare the source basic statistics (lines of code):
               VFS      CM
Java code    24215   90834
Unit tests    8926   95595

All in all, CM is more than 5 times larger than VFS (not even counting
documentation).

>>>>>> ​I think that clustering part could be generalized to ML package
>>>>>> as a
>>>>> whole.​
>>>>>
>>>>
>>>> Fine I guess, since currently the "neuralnet" sub-package's only
>>>> concrete functionality is also a clustering method.
>>>>
>>>>
>>> ​I was also wondering whenever ML package meant to be extended in
>>> the future
>>
>> Really there was no plan, or as many plans as there were
>> developers...
>>
>> Putting all these codes (with different designs, different coding
>> practices, different intended audiences, different levels of
>> expertise,
>> etc.) in a single library was not sustainable.
>>
>> That's why I strongly favour cutting this monolith into pieces
>> with a limited scope.
>
> Nobody objects, but if you look at vfs, it is still *one* Apache
> Commons
> component, just with multiple artifacts. All these artifacts are
> released
> *together*.

Sorry I'm lost, I looked there:
   http://commons.apache.org/proper/commons-vfs/download_vfs.cgi

And, it seems that all the functionality is in a single JAR.
[Other files contain the sources, tests, examples.]

Anyways, it is obvious that, in VFS, there is a well defined scope
(a unifying rationale).

No such thing in CM.

What I want to achieve is indeed to create a set of components that are
more like VFS!

This is particularly obvious with the RNGs where there is one unifying
interface, a factory method and multiple implementations.
[Of course, in that case, the new component will be much simpler than
VFS (which is a "good thing", isn't it?).]
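
To sketch what I mean by that structure: all names below ("UniformSource",
"SourceType", "Lcg64", "XorShift32", "RngSketch") are hypothetical and are
not the API of the "o.a.c.m.rng" package, and the two generators are
simple textbook algorithms (a 64-bit linear congruential generator and
Marsaglia's 32-bit xorshift) chosen only to keep the example short.

```java
/** Demo entry point; prints one value per generator. */
class RngSketch {
    public static void main(String[] args) {
        for (SourceType type : SourceType.values()) {
            final UniformSource rng = type.create(42L);
            System.out.println(type + " -> " + rng.nextDouble());
        }
    }
}

/** Unifying interface: client code depends only on this type. */
interface UniformSource {
    /** Returns the next pseudo-random 32-bit integer. */
    int nextInt();

    /** Returns a value in [0, 1), built from the 24 high bits of nextInt(). */
    default double nextDouble() {
        return (nextInt() >>> 8) * 0x1.0p-24;
    }
}

/** 64-bit linear congruential generator (multiplier and increment from Knuth's MMIX). */
final class Lcg64 implements UniformSource {
    private long state;

    Lcg64(long seed) {
        state = seed;
    }

    @Override
    public int nextInt() {
        state = state * 6364136223846793005L + 1442695040888963407L;
        return (int) (state >>> 32); // The high bits are the better ones in an LCG.
    }
}

/** Marsaglia's 32-bit xorshift generator (shift triple 13, 17, 5). */
final class XorShift32 implements UniformSource {
    private int state;

    XorShift32(long seed) {
        final int s = (int) (seed ^ (seed >>> 32));
        state = (s == 0) ? 0x9E3779B9 : s; // The state must never be zero.
    }

    @Override
    public int nextInt() {
        state ^= state << 13;
        state ^= state >>> 17;
        state ^= state << 5;
        return state;
    }
}

/** Factory: the only place where the concrete classes are named. */
enum SourceType {
    LCG_64 {
        @Override
        UniformSource create(long seed) {
            return new Lcg64(seed);
        }
    },
    XOR_SHIFT_32 {
        @Override
        UniformSource create(long seed) {
            return new XorShift32(seed);
        }
    };

    /** Creates a new generator instance initialized with the given seed. */
    abstract UniformSource create(long seed);
}
```

With such a layout, adding or removing an implementation touches one class
and one "enum" constant, and nothing in the client code.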

> Turning math into a multi-project has nothing to do with your
> plans to drop mature code,

I am not dropping anything (others did that); I am stating facts and I
now want to spend my time on something (hopefully) worth it.  [Working
to modularize unsupported code is a (huge) waste of time.]

Also, in the case of CM, "mature code" is meaningless as an overall
qualifier: some codes are
  * new (and never released, e.g. 64-bits-based RNGs)
  * algorithms introduced relatively recently (and perhaps never used)
  * old (and sometimes outdated and impossible to fix without breaking
    compatibility)
  * mostly functional (but impossible to maintain, cf. MATH-1375)
  * resulting from a refactoring (hence even when the functionality has
    existed for a long time, the code is not "mature")

IMHO, maturity should be visible in the code.  It's an impression that
builds up by looking at the code as a whole, and coming to the
conclusion that indeed there is some overall consistency across files
and packages.

Within some CM packages: yes (even if "mature" would certainly not mean
free of sometimes serious problems).

Across the whole library: certainly *not*.
[For reasons I could expand on.  But I did several times (cf. archives)
without succeeding in changing course.]

> because you (and currently no-one else) cannot
> answer questions to its functionality.

See the first post in this thread, in the part about gradually
re-adding codes if and when they are supported by a new team.


Regards,
Gilles


> Cheers,
> Jörg



Re: [Math] Commons Math (r)evolution

Ralph Goers

> On Jun 9, 2016, at 2:12 PM, Gilles <[hidden email]> wrote:
>
> Hello Jörg.
>
> On Thu, 09 Jun 2016 09:43:06 +0200, Jörg Schaible wrote:
>> Hi Gilles,
>>
>> Gilles wrote:
>>
>>> Hi.
>>>
>>> On Wed, 8 Jun 2016 23:50:00 +0300, Artem Barger wrote:
>>>> On Wed, Jun 8, 2016 at 12:25 AM, Gilles
>>>> <[hidden email]>
>>>> wrote:
>>>>
>>>>>
>>>>> According to JIRA, among 180 issues currently targeted for the
>>>>>>> next major release (v4.0), 139 have been resolved (75 of which
>>>>>>> were not in v3.6.1).
>>>>>>>
>>>>>>>
>>>>>> ​Huh, it's above of 75% completion :)​
>>>>>>
>>>>>
>>>>> Everybody is welcome to review the "open" issues and comment
>>>>> about them.
>>>>>
>>>>>
>>>> ​I guess someone need to prioritize them​ according to they
>>>> importance for
>>>> release.
>>>
>>> Importance is relative... :-}
>>>
>>> IMO, it is important to not release unsupported code.
>>
>> Unit test *are* kind of support.
>
> Unit tests are not what I mean by "support".  They only increase the
> probability that the code behaves as expected. [And sometimes they do
> not because they can be buggy too, as I discovered when refactoring
> the "random" package.]

Now that is a funny argument.  If you can write a proper unit test for the code typically you understand what the code is doing and could fix it if needed.

>
> But anyways, my reservations have nothing to do with the functionality
> of released code: users who are satisfied with the service provided by
> v3.6.1 (or any of the previous versions of CM) have no reason to upgrade
> to 4.0.  [By upgrading, all they get is the obligation to change the
> "import" statements.]
>
> And we have no reason to release a v4.0 of a code that
> 1. has not changed
> 2. is not supported

What you seem to be proposing is tossing code that “isn’t supported” even if it works just fine. I don’t understand why you would want to do that.  

What I am seeing here is a bunch of people coming on board who seem to really want to help and get involved. Before doing radical things like dumping a large portion of the code base please take the time to see how things play out.

Ralph




Re: [Math] Commons Math (r)evolution

garydgregory
On Thu, Jun 9, 2016 at 2:53 PM, Ralph Goers <[hidden email]>
wrote:

>
> > On Jun 9, 2016, at 2:12 PM, Gilles <[hidden email]> wrote:
> >
> > Hello Jörg.
> >
> > On Thu, 09 Jun 2016 09:43:06 +0200, Jörg Schaible wrote:
> >> Hi Gilles,
> >>
> >> Gilles wrote:
> >>
> >>> Hi.
> >>>
> >>> On Wed, 8 Jun 2016 23:50:00 +0300, Artem Barger wrote:
> >>>> On Wed, Jun 8, 2016 at 12:25 AM, Gilles
> >>>> <[hidden email]>
> >>>> wrote:
> >>>>
> >>>>>
> >>>>> According to JIRA, among 180 issues currently targeted for the
> >>>>>>> next major release (v4.0), 139 have been resolved (75 of which
> >>>>>>> were not in v3.6.1).
> >>>>>>>
> >>>>>>>
> >>>>>> Huh, it's above 75% completion :)
> >>>>>>
> >>>>>
> >>>>> Everybody is welcome to review the "open" issues and comment
> >>>>> about them.
> >>>>>
> >>>>>
> >>>> I guess someone needs to prioritize them according to their
> >>>> importance for the release.
> >>>
> >>> Importance is relative... :-}
> >>>
> >>> IMO, it is important to not release unsupported code.
> >>
> >> Unit tests *are* a kind of support.
> >
> > Unit tests are not what I mean by "support".  They only increase the
> > probability that the code behaves as expected. [And sometimes they do
> > not because they can be buggy too, as I discovered when refactoring
> > the "random" package.]
>
> Now that is a funny argument.  If you can write a proper unit test for the
> code, you typically understand what the code is doing and could fix it if
> needed.
>
> >
> > But anyways, my reservations have nothing to do with the functionality
> > of released code: users who are satisfied with the service provided by
> > v3.6.1 (or any of the previous versions of CM) have no reason to upgrade
> > to 4.0.  [By upgrading, all they get is the obligation to change the
> > "import" statements.]
> >
> > And we have no reason to release a v4.0 of a code that
> > 1. has not changed
> > 2. is not supported
>
> What you seem to be proposing is tossing code that “isn’t supported” even
> if it works just fine. I don’t understand why you would want to do that.
>
> What I am seeing here is a bunch of people coming on board who seem to
> really want to help and get involved. Before doing radical things like
> dumping a large portion of the code base please take the time to see how
> things play out.
>

+1

Gary

>
> Ralph
>
>
>
>



Re: [Math] Commons Math (r)evolution

Gilles Sadowski
In reply to this post by Artem Barger
Hi.

On Thu, 9 Jun 2016 18:02:49 +0300, Artem Barger wrote:

> On Thu, Jun 9, 2016 at 1:54 AM, Gilles <[hidden email]>
> wrote:
>
>>> I guess someone needs to prioritize them according to their
>>> importance for the release.
>>
>> Importance is relative... :-}
>>
>
> Indeed, it's a very objective function; however, someone has to decide
> where to focus.

Please list the alternatives, and I will let you know what my
preferences are.
But anyone can decide what he will do or not...

>> IMO, it is important to not release unsupported code.
>> So the priority would be higher for issues that would be included
>> in the release of the new Commons components.
>> Hence the need to figure out what these components will be.
>>
>
> Not clear whenever you really mean by not releasing unsupported code is
> to exclude already existing parts which doesn't anyone who will be capable
> to maintain the functionality and solve possible bugs?

I did not understand this sentence, sorry. :-}

>>
>>>> Which ones?
>>>
>>> I will look into JIRA and provide the issue numbers, and of course I
>>> can cover and assist with the ML part, and clustering in particular.
>>
>> Thanks.
>>
>
> You are welcome :)
> So I looked through some of the open issues and I have a couple of
> questions about them:
>
> 1. Which affected version do I really need to consider as important
> for the next release?

It depends on what the contents of the release are.
Certainly (IMO), issues in classes that are not going to be released
any time soon are low priority.

> 2. I've looked into affected versions: 3.5, 3.6, 3.6.1, 3.7 and 4.0.
> Overall, I found about 25 open bugs/issues. What about issues opened
> for earlier releases?

These are probably the hardest to fix (often they relate to the
refactoring of whole packages, e.g. MATH-756).

As I've argued in this thread, my preference is to focus on extracting
independent functionalities and turn them into new components.

The point is not about making negative choices (a.k.a. "dropping code")
but about making positive choices as to what you'd include in a given
component.

> For example, looking into MATH-1329
> <https://issues.apache.org/jira/browse/MATH-1329>, it does not sound
> really hard to fix, but looking at it I am not sure whether it was
> accepted and approved as something that needs to be done.
>
> Next, for MATH-1315 <https://issues.apache.org/jira/browse/MATH-1315>,
> it seems the reporter has provided a patch and the only thing missing
> is a unit test. Will the addition of unit tests help to get that one
> resolved?
>
> For MATH-1284 <https://issues.apache.org/jira/browse/MATH-1284>, it is
> not clear what the final decision is; according to the comments, it
> looks like it can be resolved.
>
> For MATH-1262 <https://issues.apache.org/jira/browse/MATH-1262>,
> according to the comments, editing the current Javadoc to explain the
> limitation should also settle this case. Am I missing something?
>
> I think I can help to handle MATH-1281
> <https://issues.apache.org/jira/browse/MATH-1281>, right after I
> finish the refactoring and optimizations for k-means clustering.

[Please start a new thread for each of the specific issues above.
This list holds many conversations at the same time, and if multiple
subjects are handled within a single thread, it quickly becomes
impossible to follow.  Thanks!]

> I listed the issues/tickets where I can help to provide support or a
> fix; I most likely missed something, but I can start with these.

It's up to you really which concrete issues you want to fix.

>>> with additional functionality, since I think I can provide my code
>>> for several classification algorithms.
>>
>> That sounds nice.
>> Which algorithms?
>>
>>
> Naive Bayes, kNN, Decision Trees, Random Forest. I guess adding these
> to the project will require a serious redesign of the ML package.

Again, up to you to propose something new. :-)
IMHO, it is extremely important to have a clean design, and clear goals
(e.g. support for parallel computing).


Regards,
Gilles




Re: [Math] Commons Math (r)evolution

Gilles Sadowski
In reply to this post by Ralph Goers
On Thu, 9 Jun 2016 14:53:20 -0700, Ralph Goers wrote:

>> On Jun 9, 2016, at 2:12 PM, Gilles <[hidden email]>
>> wrote:
>>
>> Hello Jörg.
>>
>> On Thu, 09 Jun 2016 09:43:06 +0200, Jörg Schaible wrote:
>>> Hi Gilles,
>>>
>>> Gilles wrote:
>>>
>>>> Hi.
>>>>
>>>> On Wed, 8 Jun 2016 23:50:00 +0300, Artem Barger wrote:
>>>>> On Wed, Jun 8, 2016 at 12:25 AM, Gilles
>>>>> <[hidden email]>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> According to JIRA, among 180 issues currently targeted for the
>>>>>>>> next major release (v4.0), 139 have been resolved (75 of which
>>>>>>>> were not in v3.6.1).
>>>>>>>>
>>>>>>>>
>>>>>>> Huh, it's above 75% completion :)
>>>>>>>
>>>>>>
>>>>>> Everybody is welcome to review the "open" issues and comment
>>>>>> about them.
>>>>>>
>>>>>>
>>>>> I guess someone needs to prioritize them according to their
>>>>> importance for the release.
>>>>
>>>> Importance is relative... :-}
>>>>
>>>> IMO, it is important to not release unsupported code.
>>>
>>> Unit tests *are* a kind of support.
>>
>> Unit tests are not what I mean by "support".  They only increase the
>> probability that the code behaves as expected. [And sometimes they do
>> not because they can be buggy too, as I discovered when refactoring
>> the "random" package.]
>
> Now that is a funny argument.  If you can write a proper unit test
> for the code, you typically understand what the code is doing and
> could fix it if needed.

Yes.  Where did I say otherwise?

>>
>> But anyways, my reservations have nothing to do with the functionality
>> of released code: users who are satisfied with the service provided by
>> v3.6.1 (or any of the previous versions of CM) have no reason to upgrade
>> to 4.0.  [By upgrading, all they get is the obligation to change the
>> "import" statements.]
>>
>> And we have no reason to release a v4.0 of a code that
>> 1. has not changed
>> 2. is not supported
>
> What you seem to be proposing is tossing code that “isn’t supported”
> even if it works just fine. I don’t understand why you would want to
> do that.

No, you misunderstood: I want to work on, and release, code which we
can support.
Code which we can't support will stay in the "develop" branch until
someone feels confident to release it.

> What I am seeing here is a bunch of people coming on board who seem
> to really want to help and get involved. Before doing radical things
> like dumping a large portion of the code base please take the time to
> see how things play out.

I did take the time (and I'm not going to continue wasting my time the
way I did all these years).

I just gave two examples of unsolvable issues due to code being
unsupported.  Please let us know how you'd handle them.

Also, to conclude a discussion that goes in circles: I'm not preventing
you (how could I?) from releasing CM v4.0 with whatever contents you see
fit.
I gave my arguments for discouraging such a step and others in favour
of my preferred alternative, which I'd like to start working on,
concretely, in order to see how it'll play out indeed.

Gilles

> Ralph

