[GSoC][Commons][STATISTICS][Regression][Matrix] Separate module for StatisticsMatrix (simple extension of EJML's SimpleBase) in commons statistics?

[GSoC][Commons][STATISTICS][Regression][Matrix] Separate module for StatisticsMatrix (simple extension of EJML's SimpleBase) in commons statistics?

Ben Nguyen
Hello,

Mr. Gilles Sadowski suggested to me on Slack that StatisticsMatrix and future extensions of EJML’s code should go into their own component. So, based on my understanding, should there be a general matrix module to use inside of Commons Statistics which uses EJML? Does anyone think another statistics component (besides regression) will need matrices and their operations?

Thank you for your input,
Cheers,
-Ben Nguyen

Re: [GSoC][Commons][STATISTICS][Regression][Matrix] Separate module for StatisticsMatrix (simple extension of EJML's SimpleBase) in commons statistics?

Gilles Sadowski-2
Hi.

On Fri, 21 Jun 2019 at 14:38, Ben Nguyen <[hidden email]> wrote:
>
> Hello,
>
> Mr. Gilles Sadowski suggested to me on Slack that StatisticsMatrix and future extensions of EJML’s code should go into their own component.

Not exactly; I suggested that
1. there be an interface defined in [Statistics] for matrices that would
   shield its API from a future change of its implementation. [For now it
   can be a subclass of an EJML class, but what if we want to change later?
   Do we want to support an external API even when it's no longer used to
   perform the computations?]
2. utilities (like the matrix interface) that can be used by several modules
   of [Statistics] are best defined in a separate (Maven) module.
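
To make point 1 concrete, here is a minimal sketch; it is not code from [Statistics] or EJML. The names `MatrixFacadeSketch`, `StatMatrix`, `ArrayMatrix` and `of` are hypothetical, and a plain-array implementation stands in for an internal wrapper around EJML so that the example runs without any external dependency.

```java
public class MatrixFacadeSketch {

    /** Hypothetical API owned by [Statistics]; callers never see the backing library. */
    public interface StatMatrix {
        int rows();
        int cols();
        double get(int row, int col);
        StatMatrix multiply(StatMatrix other);
        StatMatrix transpose();
    }

    /** Internal implementation: a plain-array stand-in for a wrapper around EJML. */
    private static final class ArrayMatrix implements StatMatrix {
        private final double[][] data;

        ArrayMatrix(double[][] source) {
            // Defensive copy so that callers cannot mutate the internal state.
            data = new double[source.length][];
            for (int i = 0; i < source.length; i++) {
                data[i] = source[i].clone();
            }
        }

        @Override public int rows() { return data.length; }
        @Override public int cols() { return data[0].length; }
        @Override public double get(int row, int col) { return data[row][col]; }

        @Override
        public StatMatrix multiply(StatMatrix other) {
            if (cols() != other.rows()) {
                throw new IllegalArgumentException("Dimension mismatch");
            }
            double[][] out = new double[rows()][other.cols()];
            for (int i = 0; i < rows(); i++) {
                for (int j = 0; j < other.cols(); j++) {
                    for (int k = 0; k < cols(); k++) {
                        out[i][j] += data[i][k] * other.get(k, j);
                    }
                }
            }
            return new ArrayMatrix(out);
        }

        @Override
        public StatMatrix transpose() {
            double[][] out = new double[cols()][rows()];
            for (int i = 0; i < rows(); i++) {
                for (int j = 0; j < cols(); j++) {
                    out[j][i] = data[i][j];
                }
            }
            return new ArrayMatrix(out);
        }
    }

    /** Factory: the only place that names the implementation class. */
    public static StatMatrix of(double[][] values) {
        return new ArrayMatrix(values);
    }

    public static void main(String[] args) {
        StatMatrix a = of(new double[][] {{1, 2}, {3, 4}});
        StatMatrix p = a.multiply(a.transpose());
        System.out.println(p.get(0, 0) + " " + p.get(0, 1) + " " + p.get(1, 1));
    }
}
```

Because callers only ever see `StatMatrix`, swapping `ArrayMatrix` for an EJML-backed class (or any other implementation) would not change the public API.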

> So, based on my understanding, should there be a general matrix module to use inside of Commons Statistics which uses EJML?

Which matrix functionalities are needed for the "regression" module?

> Does anyone think another statistics component (besides regression) will need matrices and their operations?

You could get the answer by looking at the [Math] code.

Regards,
Gilles

>
> Thank you for your input,
> Cheers,
> -Ben Nguyen
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

RE: [GSoC][Commons][STATISTICS][Regression][Matrix] Separate module for StatisticsMatrix (simple extension of EJML's SimpleBase) in commons statistics?

Ben Nguyen
Hi,
The CM regression module uses LU and QR decomposition and basic matrix operations like multiply, add/subtract, transpose and inverse, as well as functionalities like getting a submatrix, getting dimensions, etc. EJML provides all of these, as far as I’ve looked. But I also expect there may be large differences in the port due to Streams.
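
As an illustration of why a regression module needs these operations: ordinary least squares can be written as beta = (X^T X)^-1 X^T y, which uses transpose, multiply and inverse. The sketch below is a toy under stated assumptions, not code from CM or EJML: the class and method names are hypothetical, it uses plain arrays, and it inverts the 2x2 normal-equations matrix in closed form. Production code would typically solve the problem through a QR decomposition instead, for numerical stability.

```java
public class NormalEquationsSketch {

    /** Returns the transpose of a rectangular matrix. */
    public static double[][] transpose(double[][] a) {
        double[][] t = new double[a[0].length][a.length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[0].length; j++) {
                t[j][i] = a[i][j];
            }
        }
        return t;
    }

    /** Returns the matrix product a * b. */
    public static double[][] multiply(double[][] a, double[][] b) {
        double[][] p = new double[a.length][b[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < b[0].length; j++) {
                for (int k = 0; k < b.length; k++) {
                    p[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        return p;
    }

    /** Closed-form inverse of a 2x2 matrix. */
    public static double[][] invert2x2(double[][] m) {
        double det = m[0][0] * m[1][1] - m[0][1] * m[1][0];
        if (det == 0) {
            throw new ArithmeticException("Singular matrix");
        }
        return new double[][] {
            { m[1][1] / det, -m[0][1] / det},
            {-m[1][0] / det,  m[0][0] / det}
        };
    }

    /** Fits y = b0 + b1 * x by the normal equations; returns {b0, b1}. */
    public static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double[][] design = new double[n][2];   // columns: intercept, x
        double[][] response = new double[n][1];
        for (int i = 0; i < n; i++) {
            design[i][0] = 1.0;
            design[i][1] = x[i];
            response[i][0] = y[i];
        }
        double[][] xt = transpose(design);
        double[][] beta = multiply(invert2x2(multiply(xt, design)),
                                   multiply(xt, response));
        return new double[] {beta[0][0], beta[1][0]};
    }

    public static void main(String[] args) {
        // Data generated from y = 1 + 2x, so the fit should recover those values.
        double[] b = fit(new double[] {0, 1, 2, 3}, new double[] {1, 3, 5, 7});
        System.out.println("intercept=" + b[0] + " slope=" + b[1]);
    }
}
```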

Cheers,
-Ben

From: Gilles Sadowski
Sent: Friday, June 21, 2019 8:18 PM
To: Commons Developers List
Subject: Re: [GSoC][Commons][STATISTICS][Regression][Matrix] Separate module for StatisticsMatrix (simple extension of EJML's SimpleBase) in commons statistics?

[All][STATISTICS] External dependency for linear algebra?

Gilles Sadowski-2
Hello.

[I've changed the subject line to reflect that we are discussing
something at the project's policy level (not just [Statistics]).]

On Sat, 22 Jun 2019 at 05:16, Ben Nguyen <[hidden email]> wrote:
>
> Hi,
> The CM regression module uses LU & QR decomposition and basic matrix operations like multiply, add/subtract, transpose, inverse, as well as functionalities like getting a submatrix, getting dimensions, etc. All of which EJML provides as far as I’ve looked.

It's fine that EJML is a suitable candidate; however, it would be nice
to record somewhere why it is currently the best choice.  [It could just be
a recommendation from people who've been using it rather than the
contenders, but it should be formally agreed on for *some* reason.]

However, the main issue is whether we add an explicit runtime dependency
on EJML's artefact.  IIUC, the consequences are:
 * Requirement to support it for as long as we don't change major version.
 * Risk of JAR hell.

The alternative is:
 * Create custom interface(s) for linear algebra (to be implemented, for now,
   by an *internal* wrapper around the EJML functionalities).
 * Use the shade plugin so that the dependency is compile-time only.
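
For reference, the shading alternative could look roughly like the following `pom.xml` fragment. This is a sketch under assumptions: the relocated package name `org.apache.commons.statistics.internal.ejml` is hypothetical, and the `org.ejml` group and package names should be checked against the EJML artefacts actually used.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <!-- Drop dependency classes that this module does not reference. -->
        <minimizeJar>true</minimizeJar>
        <!-- Publish a POM that no longer lists the shaded dependency. -->
        <createDependencyReducedPom>true</createDependencyReducedPom>
        <artifactSet>
          <includes>
            <include>org.ejml:*</include>
          </includes>
        </artifactSet>
        <relocations>
          <relocation>
            <pattern>org.ejml</pattern>
            <!-- Hypothetical internal package name. -->
            <shadedPattern>org.apache.commons.statistics.internal.ejml</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

The relocation renames the bundled packages so that they cannot clash with another EJML version on a user's classpath; the cost is a larger JAR and a copy of EJML that users cannot upgrade independently.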

Comments, preferences, other suggestions?

Thanks,
Gilles

Re: [All][STATISTICS] External dependency for linear algebra?

garydgregory
My two bits:
- What is the license of the third party artifact under consideration?
- Shading introduces more problems than it is worth and forces bloating.
  Use a normal Maven dependency.

Gary

Re: [All][STATISTICS] External dependency for linear algebra?

Gilles Sadowski-2
Hi Gary.

On Sat, 22 Jun 2019 at 16:04, Gary Gregory <[hidden email]> wrote:
>
> My two bits:
> - What is the license of the third party artifact under consideration?

https://github.com/lessthanoptimal/ejml

states (bottom of page) ASL v2.0

> - Shading introduces more problems than it is worth and forces bloating.

I've no experience with shading, but my assumption was that only
the code actually being used is copied (?).

> Use a normal Maven dependency

I'm quite surprised by this suggestion of yours since it would mean
that users of "Commons Statistics" could be thrown into JAR hell.

Do you agree that it could happen, but don't care (anymore!),
or am I missing something?

Regards,
Gilles

Re: [All][STATISTICS] External dependency for linear algebra?

Alex Herbert


> On 22 Jun 2019, at 15:28, Gilles Sadowski <[hidden email]> wrote:
>
> Hi Gary.
>
> On Sat, 22 Jun 2019 at 16:04, Gary Gregory <[hidden email]> wrote:
>>
>> My two bits:
>> - What is the license of the third party artifact under consideration?
>
> https://github.com/lessthanoptimal/ejml
>
> states (bottom of page) ASL v2.0
>
>> - Shading introduces more problems than it is worth and forces bloating.
>
> I've no experience with shading but my assumption was that only
> the code being actually used was copied (?).
>
>> Use a normal Maven dependency
>
> I'm quite surprised by this suggestion of yours since it would mean
> that users of "Commons Statistics" could be thrown into JAR hell.
>
> Do you agree that it could happen, but don't care (anymore!),
> or do I miss something?

I use EJML v0.24. The reason I am still on that version and not the current version of 0.38 is due to a change in the API to rename the matrix classes. They did provide an upgrade script to update to 0.31 which was when the rename occurred [1] (I’ve not tried an upgrade). This occurred without a major version change so the policy on version numbers is unclear. Including a Maven dependency to something with big API changes in its timeline would be a jar problem for someone.

[1] https://github.com/lessthanoptimal/ejml/blob/master/convert_to_ejml31.py


Re: [All][STATISTICS] External dependency for linear algebra?

Gilles Sadowski-2
In reply to this post by Gilles Sadowski-2
On Sat, 22 Jun 2019 at 16:28, Gilles Sadowski <[hidden email]> wrote:

>
> Hi Gary.
>
> On Sat, 22 Jun 2019 at 16:04, Gary Gregory <[hidden email]> wrote:
> >
> > My two bits:
> > - What is the license of the third party artifact under consideration?
>
> https://github.com/lessthanoptimal/ejml
>
> states (bottom of page) ASL v2.0
>
> > - Shading introduces more problems than it is worth and forces bloating.
>
> I've no experience with shading but my assumption was that only
> the code being actually used was copied (?).
>
> > Use a normal Maven dependency

Also, IIUC, it contradicts the up-to-now (informal) policy of Commons to
generally not even adopt interfaces defined in other Commons components.
A case that comes to mind is the not-so-old discussion about random Strings
in "Commons Text", where that component defined its own interface rather
than depend on the "client-api" module of "Commons RNG".

Gilles

Re: [All][STATISTICS] External dependency for linear algebra?

Gilles Sadowski-2
In reply to this post by Alex Herbert
On Sat, 22 Jun 2019 at 19:19, Alex Herbert <[hidden email]> wrote:

> I use EJML v0.24. The reason I am still on that version and not the current version of 0.38 is due to a change in the API to rename the matrix classes. They did provide an upgrade script to update to 0.31 which was when the rename occurred [1] (I’ve not tried an upgrade). This occurred without a major version change so the policy on version numbers is unclear. Including a Maven dependency to something with big API changes in its timeline would be a jar problem for someone.

Perhaps they consider that anything is allowed in versions 0.x (?).
[I've asked the same for new "Commons" components, but IIRC,
there is still no definite answer.]

In any case, as you point out, if [Statistics] explicitly depends on
EJML, all bets are off:  Even if they don't change API, they could
modify implementation details so that applications could have different
behaviours, depending on which version is ultimately loaded by the
JVM (as per "JAR hell").

I was pretty sure that it would have been a "no-no" for a "Commons"
release.  But now I'm confused. :-}

Gilles

> [1] https://github.com/lessthanoptimal/ejml/blob/master/convert_to_ejml31.py

Reply | Threaded
Open this post in threaded view
|

Re: [All][STATISTICS] External dependency for linear algebra?

Rob Tompkins
Have we tried asking if he wants to be a part of commons? Seems like that library could be a good fit and it might help him out in the long run.

-Rob

> On Jun 22, 2019, at 1:37 PM, Gilles Sadowski <[hidden email]> wrote:
>
> On Sat, 22 June 2019 at 19:19, Alex Herbert <[hidden email]> wrote:
>>
>>
>>
>>> On 22 Jun 2019, at 15:28, Gilles Sadowski <[hidden email]> wrote:
>>>
>>> Hi Gary.
>>>
>>> On Sat, 22 June 2019 at 16:04, Gary Gregory <[hidden email]> wrote:
>>>>
>>>> My two bits:
>>>> - What is the license of the third party artifact under consideration?
>>>
>>> https://github.com/lessthanoptimal/ejml
>>>
>>> states (bottom of page) ASL v2.0
>>>
>>>> - Shading introduces more problems than it is worth and forces bloating.
>>>
>>> I've no experience with shading but my assumption was that only
>>> the code being actually used was copied (?).
>>>
>>>> Use a normal Maven dependency
>>>
>>> I'm quite surprised by this suggestion of yours since it would mean
>>> that users of "Commons Statistics" could be thrown into JAR hell.
>>>
>>> Do you agree that it could happen, but don't care (anymore!),
>>> or do I miss something?
>>
>> I use EJML v0.24. The reason I am still on that version and not the current version of 0.38 is due to a change in the API to rename the matrix classes. They did provide an upgrade script to update to 0.31 which was when the rename occurred [1] (I’ve not tried an upgrade). This occurred without a major version change so the policy on version numbers is unclear. Including a Maven dependency to something with big API changes in its timeline would be a jar problem for someone.
>
> Perhaps they consider that anything is allowed in versions 0.x (?).
> [I've asked the same for new "Commons" components, but IIRC,
> there is still no definite answer.]
>
> In any case, as you point out, if [Statistics] explicitly depends on
> EJML, all bets are off:  Even if they don't change API, they could
> modify implementation details so that applications could have different
> behaviours, depending on which version is ultimately loaded by the
> JVM (as per "JAR hell").
>
> I was pretty sure that it would have been a "no-no" for a "Commons"
> release.  But now I'm confused. :-}
>
> Gilles
>
>> [1] https://github.com/lessthanoptimal/ejml/blob/master/convert_to_ejml31.py
>>>
>>> Regards,
>>> Gilles
>>>
>>>>
>>>> Gary
>>>>
>>>> On Sat, Jun 22, 2019 at 9:56 AM Gilles Sadowski <[hidden email]>
>>>> wrote:
>>>>
>>>>> Hello.
>>>>>
>>>>> [I've changed the subject line to reflect that we are discussing
>>>>> something at the project's policy level (not just [Statistics]).]
>>>>>
>>>>> On Sat, 22 June 2019 at 05:16, Ben Nguyen <[hidden email]> wrote:
>>>>>>
>>>>>> Hi,
>>>>>> The CM regression module uses LU & QR decomposition and basic matrix
>>>>> operations like multiply, add/subtract, transpose, inverse, as well as
>>>>> functionalities like getting a submatrix, getting dimensions etc…. All of
>>>>> which EJML provides as far as I’ve looked.
>>>>>
>>>>> It's fine that EJML is a suitable candidate; however, it would be nice
>>>>> to record somewhere why it is currently the best choice.  [It could just be
>>>>> a recommendation from people who've been using it rather than the
>>>>> contenders, but it should be formally agreed on for *some* reason.]
>>>>>
>>>>> However, the main issue is whether we add explicit runtime dependency
>>>>> to EJML's artefact.  IIUC, the consequences are:
>>>>> * Requirement to support it for as long as we don't change major version.
>>>>> * Risk of JAR hell.
>>>>>
>>>>> Alternative is:
>>>>> * Create custom interface(s) for linear algebra (to be currently
>>>>> implemented
>>>>> by an *internal* wrapper around the EJML functionalities).
>>>>> * Use the shade plugin so that the dependency is compile-time only.
>>>>>
>>>>> Comments, preferences, other suggestions?
>>>>>
>>>>> Thanks,
>>>>> Gilles
>>>>>
>>>>>> But I also expect there to be perhaps large differences in the port due
>>>>> to Streams….
>>>>>>
>>>>>> Cheers,
>>>>>> -Ben
>>>>>>
>>>>>> From: Gilles Sadowski
>>>>>> Sent: Friday, June 21, 2019 8:18 PM
>>>>>> To: Commons Developers List
>>>>>> Subject: Re: [GSoC][Commons][STATISTICS][Regression][Matrix] Separate
>>>>> module for StatisticsMatrix (simple extension of EJML's SimpleBase) in
>>>>> commons statistics?
>>>>>>
>>>>>> Hi.
>>>>>>
>>>>>> On Fri, 21 June 2019 at 14:38, Ben Nguyen <[hidden email]> wrote:
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Mr. Gilles Sadowski suggested to me on Slack that StatisticsMatrix and
>>>>> future extensions of EJML’s code should go into its own component.
>>>>>>
>>>>>> Not exactly; I suggested that
>>>>>> 1. there be an interface defined in [Statistics] for matrix that would
>>>>>> shield its API
>>>>>> from a future change of its implementation. [Now it can be a subclass of
>>>>> EJML,
>>>>>> but what if we want to change later?  Do we want to support an external
>>>>> API
>>>>>> even when it's not used to perform the computations?]
>>>>>> 2. utilities (like the matrix interface) that can be used by several
>>>>> modules
>>>>>> of [Statistics] are best defined in a separate (maven) module.
>>>>>>
>>>>>>> So, based on my understanding, should there be a general matrix module
>>>>> to use inside of commons statistics which uses EJML?
>>>>>>
>>>>>> Which matrix functionalities are needed for the "regression" module?
>>>>>>
>>>>>>> Does anyone think another statistics component (besides regression)
>>>>> will need matrices and their operations?
>>>>>>
>>>>>> You could get the answer by looking at the [Math] codes.
>>>>>>
>>>>>> Regards,
>>>>>> Gilles
>>>>>>
>>>>>>>
>>>>>>> Thank you for your input,
>>>>>>> Cheers,
>>>>>>> -Ben Nguyen
>>>>>>>
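
For reference, the alternative quoted above (custom linear algebra interface(s), currently implemented by an *internal* wrapper) could look roughly like the Java sketch below. All names in it (`Matrix`, `Matrices.of`, `ArrayMatrix`) are hypothetical and are not taken from [Statistics] or EJML. A plain-array implementation stands in for the EJML-backed wrapper so that the sketch runs without any external dependency; it covers only multiply and transpose, not the LU/QR decompositions mentioned for the regression module.

```java
/** Hypothetical public interface; callers never see the backing library. */
interface Matrix {
    int rows();
    int cols();
    double get(int row, int col);
    Matrix multiply(Matrix other);
    Matrix transpose();
}

/** Hypothetical factory: the only place that names a concrete implementation. */
final class Matrices {
    private Matrices() { }

    static Matrix of(double[][] data) {
        return new ArrayMatrix(data);
    }
}

/**
 * Stand-in implementation (assumption): in the proposal discussed in the
 * thread this class would delegate to EJML and remain internal. Plain
 * arrays are used here only to keep the sketch dependency-free.
 */
final class ArrayMatrix implements Matrix {
    private final double[][] a;

    ArrayMatrix(double[][] data) {
        // Defensive copy so that callers cannot mutate the internal state.
        a = new double[data.length][];
        for (int i = 0; i < data.length; i++) {
            a[i] = data[i].clone();
        }
    }

    @Override public int rows() { return a.length; }
    @Override public int cols() { return a.length == 0 ? 0 : a[0].length; }
    @Override public double get(int row, int col) { return a[row][col]; }

    @Override
    public Matrix multiply(Matrix other) {
        if (cols() != other.rows()) {
            throw new IllegalArgumentException("Dimension mismatch");
        }
        double[][] out = new double[rows()][other.cols()];
        for (int i = 0; i < rows(); i++) {
            for (int j = 0; j < other.cols(); j++) {
                double sum = 0;
                for (int k = 0; k < cols(); k++) {
                    sum += a[i][k] * other.get(k, j);
                }
                out[i][j] = sum;
            }
        }
        return new ArrayMatrix(out);
    }

    @Override
    public Matrix transpose() {
        double[][] out = new double[cols()][rows()];
        for (int i = 0; i < rows(); i++) {
            for (int j = 0; j < cols(); j++) {
                out[j][i] = a[i][j];
            }
        }
        return new ArrayMatrix(out);
    }
}
```

Code in a regression module would then be written against `Matrix` only, for instance `Matrices.of(x).transpose().multiply(Matrices.of(x))`, so replacing the backing library later would not change that module's API.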

Re: [All][STATISTICS] External dependency for linear algebra?

Gilles Sadowski-2
Hello.

On Sat, 22 June 2019 at 20:22, Rob Tompkins <[hidden email]> wrote:
>
> Have we tried asking if he wants to be a part of commons?

AFAIK, no.

> Seems like that library could be a good fit

+1

> and it might help him out in the long run.

That I'm not sure. ;-)

Regards,
Gilles


[All] Actively seek contributor? [Was: External dependency for linear algebra?]

Gilles Sadowski-2
Hi.

Thanks for the suggestion, Rob.

Should we contact him?  [Perhaps he reads this ML...]
   https://www.linkedin.com/in/peter-abeles-59b2603
   https://github.com/lessthanoptimal/

In addition to EJML, it seems that there could be a nice consolidation
with [Geometry]:
   https://github.com/lessthanoptimal/GeoRegression

Gilles

On Mon, 24 June 2019 at 02:24, Gilles Sadowski <[hidden email]> wrote:

>
> Hello.
>
> On Sat, 22 June 2019 at 20:22, Rob Tompkins <[hidden email]> wrote:
> >
> > Have we tried asking if he wants to be a part of commons?
>
> AFAIK, no.
>
> > Seems like that library could be a good fit
>
>>> [...]


Re: [All] Actively seek contributor? [Was: External dependency for linear algebra?]

Eric Barnhill
+1

On Tue, Jun 25, 2019 at 6:24 PM Gilles Sadowski <[hidden email]>
wrote:

> Hi.
>
> Thanks for the suggestion, Rob.
>
> Should we contact him?  [Perhaps he reads this ML...]
>    https://www.linkedin.com/in/peter-abeles-59b2603
>    https://github.com/lessthanoptimal/
>
> In addition to EJML, it seems that there could be a nice consolidation
> with [Geometry]:
>    https://github.com/lessthanoptimal/GeoRegression
>
> Gilles
>
> On Mon, 24 June 2019 at 02:24, Gilles Sadowski <[hidden email]> wrote:
> >
> > Hello.
> >
> > On Sat, 22 June 2019 at 20:22, Rob Tompkins <[hidden email]> wrote:
> > >
> > > Have we tried asking if he wants to be a part of commons?
> >
> > AFAIK, no.
> >
> > > Seems like that library could be a good fit
> >
> >>> [...]