

Dear all,
I am a maintainer of the matrix-toolkits-java
http://code.google.com/p/matrix-toolkits-java/
which is a comprehensive collection of matrix data structures, linear solvers, least squares methods, and eigenvalue and singular value decompositions.
This note is in regard to the commons-math library. It is clear that our projects dovetail, especially when I look at "linear" in version 2.0 of the API. It would be good if we could either complement or consolidate efforts, rather than reproduce.
It would be excellent if all the functionality of matrix-toolkits-java were available as part of commons-math. There is already too much diversity and unmaintained maths code out there for Java!
As a start, I'd like to discourage the use of a solid implementation for SparseReal{Vector, Matrix}... please prefer an interface approach, allowing implementations based on the Templates project:
http://www.netlib.org/templates
The reason is that the storage implementation should be related to the type of data being stored. For example, there are many well-known kinds of sparse matrix that are well suited to particular kinds of calculations... consider multiplying sparse matrices that you know to be diagonal!
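To make the diagonal example concrete, here is a hypothetical sketch (the names are invented for illustration; this is neither the MTJ nor the commons-math API). An interface-first design lets each sparse structure carry its own algorithm: diagonal storage needs n entries instead of n*n, and diagonal-times-diagonal multiply is O(n) rather than O(n^3).

```java
// Hypothetical interface-first design (illustrative names only).
interface SparseMatrix {
    double get(int row, int col);
    int dimension();
    SparseMatrix multiply(SparseMatrix other);
}

// Diagonal storage: n entries instead of n*n.
class DiagonalMatrix implements SparseMatrix {
    private final double[] diag;

    DiagonalMatrix(double[] diag) {
        this.diag = diag.clone();
    }

    public int dimension() {
        return diag.length;
    }

    public double get(int row, int col) {
        return row == col ? diag[row] : 0.0;
    }

    public SparseMatrix multiply(SparseMatrix other) {
        // Structure detected: the product of two diagonal matrices is diagonal,
        // so an O(n) algorithm replaces the general O(n^3) loops.
        if (other instanceof DiagonalMatrix) {
            double[] d = ((DiagonalMatrix) other).diag;
            double[] out = new double[diag.length];
            for (int i = 0; i < diag.length; i++) {
                out[i] = diag[i] * d[i];
            }
            return new DiagonalMatrix(out);
        }
        throw new UnsupportedOperationException("general case omitted in this sketch");
    }
}
```

With a single concrete sparse class, this kind of structure-specific dispatch has nowhere to live.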
In general, the netlib.org folk (BLAS/LAPACK) have spent a *lot* of time thinking about linear algebra and have set up unrivalled standard APIs which have been implemented right down to the architecture level. It would be a major mistake if commons-math didn't build on their good work.
I believe commons-math should move to a netlib-java backend (allowing the use of machine-optimised BLAS/LAPACK):
http://code.google.com/p/netlib-java/
The largest problems facing MTJ are support for Sparse BLAS/LAPACK and scalability to parallel architectures which use Parallel BLAS/LAPACK. The former should be possible with some work within the current API, but I fear major API changes would be needed for the latter. I do not want the commons-math API to walk into this trap without having first considered future architectures! MTJ has a distributed package, but I am not sure if this is something that is completely future proof either.
What say ye'?

Sam


On Thu, May 14, 2009 at 3:18 AM, Sam Halliday <[hidden email]> wrote:
>
> I am a maintainer of the matrix-toolkits-java
Which is an impressive piece of work, especially the transparent but
non-binding interface to the Atlas and Blas native packages. My compliments
to Bjørn-Ove and all who have followed up on his original work.
This note is in regard to the commons-math library. It is clear that our
> projects dovetail, especially when I look at "linear" in version 2.0 of the
> API. It would be good if we could either complement or consolidate efforts,
> rather than reproduce.
That sounds good to me.
As a start, I'd like to discourage the use of a solid implementation for
> SparseReal{Vector, Matrix}... please prefer an interface approach, allowing
> implementations based on the Templates project:
Can you say more about what aspects of the Templates project you feel are
important? You mention one case of storage layout.
> I believe commons-math should move to a netlib-java backend (allowing the
> use of machine-optimised BLAS/LAPACK).
This is an interesting suggestion. Obviously adopting MTJ wholesale would
accomplish that.
Can you say something about the licensing issues if we were to explore, for
discussion sake, MTJ being folded into commons-math? MTJ is LGPL while
commons has to stay Apache licensed. This licensing issue has been the
biggest sticking point in the past.


Ted Dunning wrote:
> On Thu, May 14, 2009 at 3:18 AM, Sam Halliday <[hidden email]> wrote:
>
>> I am a maintainer of the matrix-toolkits-java
>
>
> Which is an impressive piece of work, especially the transparent but
> non-binding interface to the Atlas and Blas native packages. My compliments
> to Bjørn-Ove and all who have followed up on his original work.
>
> This note is in regard to the commons-math library. It is clear that our
>> projects dovetail, especially when I look at "linear" in version 2.0 of the
>> API. It would be good if we could either complement or consolidate efforts,
>> rather than reproduce.
>
>
> That sounds good to me.
You are right.
>
> As a start, I'd like to discourage the use of a solid implementation for
>> SparseReal{Vector, Matrix}... please prefer an interface approach, allowing
>> implementations based on the Templates project:
This is exactly the purpose of the RealMatrix/RealVector and FieldMatrix
/FieldVector interfaces on one side and of the CholeskyDecomposition,
EigenDecomposition, LUDecomposition, QRDecomposition,
SingularValueDecomposition and DecompositionSolver interfaces on the
other side.
RealMatrix (resp. FieldMatrix) is the top-level interface that does not
mandate any specific storage. It has several implementations
(RealMatrixImpl with a simple double[][] array, DenseRealMatrix with a
block implementation, SparseRealMatrix). We also have in mind (not
implemented yet) things like DiagonalRealMatrix or BandRealMatrix or
Lower/UpperTriangularRealMatrix. All implementations can be mixed
together, and specific cases are automatically detected and handled to
avoid using the general embedded loops and use smart algorithms when
possible. There was also an attempt using recursive layouts (with a
Gray-Morton space filling curve).
Maybe SparseRealMatrix was a bad name and should have been
SimpleSparseRealMatrix to avoid confusion with other sparse storage and
dedicated algorithms.
>
>
> Can you say more about what aspects of the Templates project you feel are
> important? You mention one case of storage layout.
>
>
>> I believe commons-math should move to a netlib-java backend (allowing the
>> use of machine-optimised BLAS/LAPACK).
>
>
> This is an interesting suggestion. Obviously adopting MTJ wholesale would
> accomplish that.
>
> Can you say something about the licensing issues if we were to explore, for
> discussion sake, MTJ being folded into commons-math? MTJ is LGPL while
> commons has to stay Apache licensed. This licensing issue has been the
> biggest sticking point in the past.
This is really an issue. Apache projects cannot use LGPL (or GPL) code.
See http://www.apache.org/legal/resolved.html for the policy.
[math] also currently has zero dependencies. We had two dependencies on
other commons components up to version 1.2 and removed them when we
started work on version 2.0. Adding new dependencies, and especially
dependencies that involve native libraries, is a difficult decision that
needs lots of discussion. We are currently trying to have 2.0 published
very soon now; such a decision would delay the publication by several months.
Some benchmarks I did a few weeks ago showed the new [math] linear
package implementation was quite fast and compared very well with native
fortran libraries for QR decomposition with similar non-blocked
algorithms. In fact, it was 7 times faster than unoptimized Numerical
Recipes, about 2 or 3 times faster than optimized Numerical Recipes, and
very slightly (a few percent) faster than optimized LAPACK with Atlas
as the BLAS implementation. Faster QR decomposition required changing
the algorithm, so the blocked LAPACK implementation was the only native
implementation faster than [math]. Of course, I now do want to also
implement a blocked QR decomposition in [math] ...
I am aware that we still lack lots of very efficient linear algebra
algorithms. Joining efforts with you would be a real gain if we can
solve the licensing issues and avoid new dependencies if possible.
[math] has already adopted a lot of external code, even complete
libraries. I came in by donating the whole Mantissa library, merging it
in, and now contributing to the maintenance of the component with the
other developers.
Luc
>

To unsubscribe, email: [hidden email]
For additional commands, email: [hidden email]


Dang.
That is fast.
What size matrices was this for?
On Thu, May 14, 2009 at 12:09 PM, Luc Maisonobe <[hidden email]> wrote:
> Some benchmarks I did a few weeks ago showed the new [math] linear
> package implementation was quite fast and compared very well with native
> fortran libraries for QR decomposition with similar non-blocked
> algorithms. In fact, it was 7 times faster than unoptimized Numerical
> Recipes, about 2 or 3 times faster than optimized Numerical Recipes, and
> very slightly (a few percent) faster than optimized LAPACK with Atlas
> as the BLAS implementation. Faster QR decomposition required changing
> the algorithm, so the blocked LAPACK implementation was the only native
> implementation faster than [math].
>


Ted Dunning wrote:
> Dang.
>
> That is fast.
>
> What size matrices was this for?
>
> On Thu, May 14, 2009 at 12:09 PM, Luc Maisonobe <[hidden email]> wrote:
>
>> Some benchmarks I did a few weeks ago showed the new [math] linear
>> package implementation was quite fast and compared very well with native
>> fortran libraries for QR decomposition with similar non-blocked
>> algorithms. In fact, it was 7 times faster than unoptimized Numerical
>> Recipes, about 2 or 3 times faster than optimized Numerical Recipes, and
>> very slightly (a few percent) faster than optimized LAPACK with Atlas
>> as the BLAS implementation. Faster QR decomposition required changing
>> the algorithm, so the blocked LAPACK implementation was the only native
>> implementation faster than [math].
>>
>
Medium size, up to 600 if I remember correctly (I don't have the curve
available here). The pattern with O(n^3) was clearly visible, so the
result can be extrapolated for medium sizes, I think. I did not check
larger matrices like 3000 or more; results may be different.
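The extrapolation argument can be stated as a one-line model (the measured times below are hypothetical; only the cubic ratio matters):

```java
// Extrapolating under the O(n^3) cost model mentioned above: doubling the
// matrix size from 600 to 1200 should multiply the run time by (1200/600)^3 = 8.
class CubicScaling {
    // Predict the time at size nTarget from a measurement (nMeasured, tMeasured).
    static double predict(double tMeasured, int nMeasured, int nTarget) {
        return tMeasured * Math.pow((double) nTarget / nMeasured, 3);
    }
}
```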
Beware, that was one algorithm only (QR decomposition) and on one host
only (AMD64 Phenom quad core, Linux, Sun Java 6 on one side, GNU Fortran
on the other side). GNU Fortran is not a very fast compiler, so the
result would obviously be very different with a better compiler. The
purpose of this test was not to say that Java is faster; it was to show
that the performance difference between Java and Fortran was smaller
than the difference you get by changing algorithms (in this case blocked
vs. non-blocked). I was surprised by the results.
For other kinds of algorithms, mixing linear algebra, trigonometric
computation, ODE integration, root searching ... I often see performance
differences of about a factor of 2, which in my opinion is a factor similar
to other changes one can make (algorithms, CPU, BLAS, compiler,
parallelism, memory ...). Of course, when comparing a highly optimized
algorithm with the perfect cache size and the best compiler to a
standard setting, we can get very different factors. So do not take these
numbers for granted.
Trying to get the fastest computation is not the purpose of [math]. It
should be reasonably efficient with respect to other libraries,
including native ones, but it is not dedicated to speed. It should remain a
general-purpose library.
Luc



Replies inline:
Ted Dunning wrote
> As a start, I'd like to discourage the use of a solid implementation for
> SparseReal{Vector, Matrix}... please prefer an interface approach, allowing
> implementations based on the Templates project:
Can you say more about what aspects of the Templates project you feel are important? You mention one case of storage layout.
It's difficult to say which algorithms from Templates are the most important, but in most cases reference implementations already exist (usually in fortran) and should be preferred (e.g. by using f2j with a wrapper layer): the theory can be quite involved. MTJ only touches the surface! However, an important step is recognising that there are not just "dense" and "sparse" matrices... but whole classes of structured sparse matrices.
Ted Dunning wrote
> I believe commons-math should move to a netlib-java backend (allowing the
> use of machine-optimised BLAS/LAPACK).
This is an interesting suggestion. Obviously adopting MTJ wholesale would
accomplish that.
Can you say something about the licensing issues if we were to explore, for
discussion sake, MTJ being folded into commons-math? MTJ is LGPL while
commons has to stay Apache licensed. This licensing issue has been the
biggest sticking point in the past.
I personally have no problems with my MTJ contributions being released Apache. Bjorn-Ove is the person to talk to about the bulk of MTJ. I'll ask him!
MTJ depends on netlib-java, which is technically a translation of the original netlib libraries. They are BSD licensed. I seriously doubt you'll get them to give you the right to redistribute as, so you'll have to decide if that's a blocker.
What would "adopting wholesale" mean? It would be a good opportunity to review/revise parts of the API and find duplication with the rest of the commons-math project.


On Thu, May 14, 2009 at 1:54 PM, Sam Halliday <[hidden email]> wrote:
> I personally have no problems with my MTJ contributions being released
> Apache. Bjorn-Ove is the person to talk to about the bulk of MTJ. I'll ask
> him!
>
Great.
MTJ depends on netlib-java, which is technically a translation of the
> original netlib libraries. They are BSD licensed. I seriously doubt you'll
> get them to give you the right to redistribute as, so you'll have to decide
> if that's a blocker.
>
If they really are BSD, then there should be no problem. BSD allows
redistribution with attribution, preservation of the copyright notice, and no
implication of endorsement.
> What would "adopting wholesale" mean? It would be a good opportunity to
> review/revise parts of the API and find duplication with the rest of the
> commons-math project.
>
There is a big issue with dependencies, but a much smaller issue with major
source code contributions. Essentially what I mean by "adopting wholesale"
would be for commons math to ingest MTJ. In the best world, the contributor
communities would merge as well. There would still be plenty of issues, such
as the conditional dependency on native libraries. I am not sure how that
should play out.

Ted Dunning, CTO
DeepDyve
111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)


Luc Maisonobe wrote:
> Ted Dunning wrote:
>
>> On Thu, May 14, 2009 at 3:18 AM, Sam Halliday <[hidden email]> wrote:
>>
>>
>>> I am a maintainer of the matrix-toolkits-java
>>>
>> Which is an impressive piece of work, especially the transparent but
>> non-binding interface to the Atlas and Blas native packages. My compliments
>> to Bjørn-Ove and all who have followed up on his original work.
>>
>> This note is in regard to the commons-math library. It is clear that our
>>
>>> projects dovetail, especially when I look at "linear" in version 2.0 of the
>>> API. It would be good if we could either complement or consolidate efforts,
>>> rather than reproduce.
>>>
>> That sounds good to me.
>>
>
> You are right.
>
>
>> As a start, I'd like to discourage the use of a solid implementation for
>>
>>> SparseReal{Vector, Matrix}... please prefer an interface approach, allowing
>>> implementations based on the Templates project:
>>>
>
> This is exactly the purpose of the RealMatrix/RealVector and FieldMatrix
> /FieldVector interfaces on one side and of the CholeskyDecomposition,
> EigenDecomposition, LUDecomposition, QRDecomposition,
> SingularValueDecomposition and DecompositionSolver interfaces on the
> other side.
>
> RealMatrix (resp. FieldMatrix) is the top-level interface that does not
> mandate any specific storage. It has several implementations
> (RealMatrixImpl with a simple double[][] array, DenseRealMatrix with a
> block implementation, SparseRealMatrix). We also have in mind (not
> implemented yet) things like DiagonalRealMatrix or BandRealMatrix or
> Lower/UpperTriangularRealMatrix. All implementations can be mixed
> together, and specific cases are automatically detected and handled to
> avoid using the general embedded loops and use smart algorithms when
> possible. There was also an attempt using recursive layouts (with a
> Gray-Morton space filling curve).
>
> Maybe SparseRealMatrix was a bad name and should have been
> SimpleSparseRealMatrix to avoid confusion with other sparse storage and
> dedicated algorithms.
>
>
>> Can you say more about what aspects of the Templates project you feel are
>> important? You mention one case of storage layout.
>>
>>
>>
>>> I believe commons-math should move to a netlib-java backend (allowing the
>>> use of machine-optimised BLAS/LAPACK).
>>>
>> This is an interesting suggestion. Obviously adopting MTJ wholesale would
>> accomplish that.
>>
>> Can you say something about the licensing issues if we were to explore, for
>> discussion sake, MTJ being folded into commons-math? MTJ is LGPL while
>> commons has to stay Apache licensed. This licensing issue has been the
>> biggest sticking point in the past.
>>
>
> This is really an issue. Apache projects cannot use LGPL (or GPL) code.
> See http://www.apache.org/legal/resolved.html for the policy.
>
> [math] also currently has zero dependencies. We had two dependencies on
> other commons components up to version 1.2 and removed them when we
> started work on version 2.0. Adding new dependencies, and especially
> dependencies that involve native libraries, is a difficult decision that
> needs lots of discussion. We are currently trying to have 2.0 published
> very soon now; such a decision would delay the publication by several months.
>
-1 for adding dependencies, especially on native code. Commons math
needs to remain
1) ASL licensed
2) self-contained
3) fully documented, fully open source
Phil
> Some benchmarks I did a few weeks ago showed the new [math] linear
> package implementation was quite fast and compared very well with native
> fortran libraries for QR decomposition with similar non-blocked
> algorithms. In fact, it was 7 times faster than unoptimized Numerical
> Recipes, about 2 or 3 times faster than optimized Numerical Recipes, and
> very slightly (a few percent) faster than optimized LAPACK with Atlas
> as the BLAS implementation. Faster QR decomposition required changing
> the algorithm, so the blocked LAPACK implementation was the only native
> implementation faster than [math]. Of course, I now do want to also
> implement a blocked QR decomposition in [math] ...
>
> I am aware that we still lack lots of very efficient linear algebra
> algorithms. Joining efforts with you would be a real gain if we can
> solve the licensing issues and avoid new dependencies if possible.
> [math] has already adopted a lot of external code, even complete
> libraries. I came in by donating the whole Mantissa library, merging it
> in, and now contributing to the maintenance of the component with the
> other developers.
>
> Luc
>
>
>
>
> 
>
>



Phil, I think we have much of the same desires and motivations, but we seem
to come to somewhat, but not entirely, different conclusions.
Assuming that (1) can be dealt with and assuming that (3) is already dealt
with, do you still mind the inclusion of *optional*, automatically generated
native code?
This page has some useful speed comparisons. For matrix-matrix multiply up
to size 50, java is competitive. If you get up to roughly n = 500 or 1000,
then heavily optimized native code can be up to 3x faster. Note the line in
the third graph for colt (a reasonably well written pure java
implementation) and MTJ (which is running in pure java mode here). In my
case, I will generally opt for portability, but I like to have a portable
option for speed. It is also important to remember that numerical codes
more often need blinding speed than most other applications.
http://blog.mikiobraun.de/2009/04/some-benchmark-numbers-for-jblas.html
Is that optional dependency really all that bad?
On Fri, May 15, 2009 at 6:23 PM, Phil Steitz <[hidden email]> wrote:
> [math] also currently has zero dependencies. We had two dependencies on
>> other commons components up to version 1.2 and removed them when we
>> started work on version 2.0. Adding new dependencies, and especially
>> dependencies that involve native libraries, is a difficult decision that
>> needs lots of discussion. We are currently trying to have 2.0 published
>> very soon now; such a decision would delay the publication by several months.
>>
>>
> -1 for adding dependencies, especially on native code. Commons math needs
> to remain
>
> 1) ASL licensed
> 2) self-contained
> 3) fully documented, fully open source

Ted Dunning, CTO
DeepDyve


Ted Dunning wrote:
> Phil, I think we have much of the same desires and motivations, but we seem
> to come to somewhat, but not entirely different conclusions.
>
> Assuming that (1) can be dealt with and assuming that (3) is already dealt
> with, do you still mind the inclusion of *optional*, automatically generated
> native code?
>
Part of 3) is having full code available with the package for
inspection. That is part of the reason that we have avoided external
dependencies. I would be open to making our fully self-contained, fully
documented, fully open source library extensible to use other libraries,
including native libraries, but I would not want to distribute anything
associated with external libraries. The reason for this is the
commitment made early on that all numerics and algorithms would be
immediately visible to the user - no chasing down external, possibly
incomplete or ambiguous docs to figure out what our code is doing.
> This page has some useful speed comparisons. For matrix-matrix multiply up
> to size 50, java is competitive. If you get up to roughly n = 500 or 1000,
> then heavily optimized native code can be up to 3x faster. Note the line in
> the third graph for colt (a reasonably well written pure java
> implementation) and MTJ (which is running in pure java mode here). In my
> case, I will generally opt for portability, but I like to have a portable
> option for speed. It is also important to remember that numerical codes
> more often need blinding speed than most other applications.
>
As Luc said, commons-math aims to be a general-purpose applied math
package implementing good, well-documented, unencumbered numerical
algorithms. I think this can be done in Java and we are doing it. We
are never going to compete with optimized native code in speed, but
strong numerics, JRE improvements and Moore's law are rapidly shrinking
the class of real-world applications where the 3x difference above is
material.
Phil
> http://blog.mikiobraun.de/2009/04/some-benchmark-numbers-for-jblas.html
> Is that optional dependency really all that bad?
>
> On Fri, May 15, 2009 at 6:23 PM, Phil Steitz <[hidden email]> wrote:
>
>
>> [math] also currently has zero dependencies. We had two dependencies on
>>
>>> other commons components up to version 1.2 and removed them when we
>>> started work on version 2.0. Adding new dependencies, and especially
>>> dependencies that involve native libraries, is a difficult decision that
>>> needs lots of discussion. We are currently trying to have 2.0 published
>>> very soon now; such a decision would delay the publication by several months.
>>>
>>>
>>>
>> -1 for adding dependencies, especially on native code. Commons math needs
>> to remain
>>
>> 1) ASL licensed
>> 2) self-contained
>> 3) fully documented, fully open source
>>
>
>
>
>
>



Phil Steitz wrote:
> Ted Dunning wrote:
>> Phil, I think we have much of the same desires and motivations, but we
>> seem
>> to come to somewhat, but not entirely different conclusions.
>>
>> Assuming that (1) can be dealt with and assuming that (3) is already
>> dealt
>> with, do you still mind the inclusion of *optional*, automatically
>> generated
>> native code?
>>
> Part of 3) is having full code available with the package for
> inspection. That is part of the reason that we have avoided external
> dependencies. I would be open to making our fully self-contained, fully
> documented, fully open source library extensible to use other libraries,
> including native libraries, but I would not want to distribute anything
> associated with external libraries. The reason for this is the
> commitment made early on that all numerics and algorithms would be
> immediately visible to the user - no chasing down external, possibly
> incomplete or ambiguous docs to figure out what our code is doing.
I have an additional reason for avoiding native libraries. Pure Java can
be processed by external tools for either inspection (think FindBugs,
Cobertura, traceability, auditing) or modification (think Nabla!). The
Nabla case is especially important to me, but I am aware this is a
corner case.
>> This page has some useful speed comparisons. For matrix-matrix
>> multiply up
>> to size 50, java is competitive. If you get up to roughly n = 500 or
>> 1000,
>> then heavily optimized native code can be up to 3x faster. Note the
>> line in
>> the third graph for colt (a reasonably well written pure java
>> implementation) and MTJ (which is running in pure java mode here). In my
>> case, I will generally opt for portability, but I like to have a portable
>> option for speed. It is also important to remember that numerical codes
>> more often need blinding speed than most other applications.
>>
> As Luc said, commons-math aims to be a general-purpose applied math
> package implementing good, well-documented, unencumbered numerical
> algorithms. I think this can be done in Java and we are doing it. We
> are never going to compete with optimized native code in speed, but
> strong numerics, JRE improvements and Moore's law are rapidly shrinking
> the class of real-world applications where the 3x difference above is
> material.
Perhaps we should have some benchmarks including our new linear package,
something more serious than my little experiment with QR decomposition.
Unfortunately, I clearly have no time for it now. My current priority is
to publish 2.0 as soon as possible and I am already late on my own schedule.
Luc
>
> Phil
>
>
>> http://blog.mikiobraun.de/2009/04/some-benchmark-numbers-for-jblas.html
>> Is that optional dependency really all that bad?
>>
>> On Fri, May 15, 2009 at 6:23 PM, Phil Steitz <[hidden email]>
>> wrote:
>>
>>
>>> [math] also currently has zero dependencies. We had two dependencies on
>>>
>>>> other commons components up to version 1.2 and removed them when we
>>>> started work on version 2.0. Adding new dependencies, and especially
>>>> dependencies that involve native libraries, is a difficult decision that
>>>> needs lots of discussion. We are currently trying to have 2.0 published
>>>> very soon now; such a decision would delay the publication by several
>>>> months.
>>>>
>>>>
>>>>
>>> -1 for adding dependencies, especially on native code. Commons math
>>> needs
>>> to remain
>>>
>>> 1) ASL licensed
>>> 2) self-contained
>>> 3) fully documented, fully open source
>>>
>>
>>
>>
>>
>>
>
>
> 
>
>



I've asked Bjorn about an Apache license for MTJ and his reply was
"Yes, I don't see why not. The more users/developers, the better."
Ted Dunning wrote
On Thu, May 14, 2009 at 1:54 PM, Sam Halliday wrote:
> I personally have no problems with my MTJ contributions being released
> Apache. Bjorn-Ove is the person to talk to about the bulk of MTJ. I'll ask
> him!
>
Great.


Luc Maisonobe wrote
Ted Dunning wrote:
> As a start, I'd like to discourage the use of a solid implementation for
>> SparseReal{Vector, Matrix}... please prefer an interface approach, allowing
>> implementations based on the Templates project:
Maybe SparseRealMatrix was a bad name and should have been
SimpleSparseRealMatrix to avoid confusion with other sparse storage and
dedicated algorithms.
I give a +1 for renaming SparseReal{Matrix, Vector}! These names should be reserved for interfaces (which might be methodless) indicating that the implementation's storage needs to be sparse.
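A hypothetical sketch of that scheme (the *Sketch suffix marks these as illustrations, not real commons-math classes): the interface only promises sparse storage, and the concrete layout lives in implementations.

```java
// Minimal matrix abstraction for the sketch.
interface RealMatrixSketch {
    double getEntry(int row, int col);
}

// Methodless marker: implementations promise sparse storage.
interface SparseRealMatrixSketch extends RealMatrixSketch { }

// One possible layout: a hash map from (row, col) to value.
class SimpleSparseRealMatrixSketch implements SparseRealMatrixSketch {
    private final java.util.Map<Long, Double> entries = new java.util.HashMap<>();
    private final int dim;

    SimpleSparseRealMatrixSketch(int dim) {
        this.dim = dim;
    }

    void setEntry(int row, int col, double value) {
        entries.put((long) row * dim + col, value);
    }

    public double getEntry(int row, int col) {
        return entries.getOrDefault((long) row * dim + col, 0.0);
    }
}
```

Algorithms can then test for the marker (or for more specific structure interfaces) and pick a sparse-aware code path.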
Luc Maisonobe wrote
> Can you say something about the licensing issues if we were to explore, for
> discussion sake, MTJ being folded into commons-math? MTJ is LGPL while
> commons has to stay Apache licensed. This licensing issue has been the
> biggest sticking point in the past.
This is really an issue. Apache projects cannot use LGPL (or GPL) code.
See http://www.apache.org/legal/resolved.html for the policy.
Solved! See my other message. Both myself and (more importantly, because he wrote MTJ) Bjorn are willing to use the Apache license.
Luc Maisonobe wrote
Adding new dependencies, and especially dependencies that involve native libraries, is a difficult decision that needs lots of discussion.
MTJ depends only on netlib-java, *which does not depend on any native libs*. The option is there to add native optimised libs if the end user wants to.
Luc Maisonobe wrote
Some benchmarks I did a few weeks ago showed the new [math] linear
package implementation was quite fast and compared very well with native
fortran libraries
I'm going to call "foul" here :)
The Java implementation of netlib-java is just as fast as machine-optimised BLAS/LAPACK... but only for matrices smaller than roughly 1000 x 1000 elements AND ONLY FOR NORMAL DESKTOP MACHINES! The important distinction here is that hardware exists with crazy optimisations for the BLAS/LAPACK API, and having the option to use that architecture from within Java is a great bonus. Consider, for example, a dedicated GPU (or FPGA) card which comes with a BLAS/LAPACK binary.
Additionally, the BLAS/LAPACK API is universally accepted. It would be a mistake to attempt to reproduce all the brain power and agreement that has worked toward it.
Luc Maisonobe wrote
I am aware that we still lack lots of very efficient linear algebra
algorithms. Joining efforts with you would be a real gain if we can
solve the licensing issues and avoid new dependencies if possible.
I am very keen to consolidate efforts! I think the next step is perhaps for you to have a look through the MTJ API and create a wishlist of everything you think would make sense to appear in commons-math. Even if adopted "wholesale", I would still strongly recommend a review of the API. E.g. some interfaces extend Serializable (a mistake); I'm not entirely sure how relevant the distributed package is nowadays; the Matrix Market IO is difficult to understand/use; there should perhaps be a "factory pattern" for instantiating matrices/vectors.
In the meantime, I recommend holding off a 2.0 API release with any new linear classes. That way we can stabilise the "new" merged API... releasing that as part of 2.1.


Ted, thanks for pointing this out... I'd never seen it before. Glad MTJ did so well and I note that this isn't even with the optional native BLAS/LAPACK :)


I've somehow missed much of this discussion, which has got a little confused. I'll repeat some key facts here:
- MTJ depends on netlib-java
- I'm the maintainer of netlib-java
- netlib-java depends on PURE JAVA code, generated by F2J from netlib.org BLAS/LAPACK (and ARPACK). Keith Seymour (author of f2j) deserves all the praise for that magnificent task! The necessary jar is distributed with netlib-java.
- BLAS/LAPACK are industry standard APIs.
- netlib-java is technically a "translation" of netlib.org's BLAS/LAPACK/ARPACK API, and is therefore BSD licensed.
- netlib-java can be *optionally* configured at runtime to use a native library instead of the Java implementation.
- the Java implementation is pretty damn fast and will be more than adequate for most users. However, it will *never* be as fast as native code running on specialist hardware (no matter how much the JVM improves).
Being the maintainer of netlib-java, I'd be more than happy to relicense all the bits that aren't technically "translations" of netlib.org, for inclusion in commons-math (in fact, it makes sense to do so). But you'd still need to depend on the f2j-translated implementation. They are BSD licensed.
Hell, it makes a *lot* of sense for commons-math to provide the BLAS/LAPACK API... they are industry standards after all, and all reference implementations for linear algebra algorithms make use of them.
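To make the shape of such an API concrete, here is a minimal pure-Java sketch of the level-3 BLAS routine dgemm, computing C := alpha*A*B + beta*C on flat column-major arrays. It is a simplified illustration only: the real BLAS signature additionally takes transpose flags and leading dimensions (lda, ldb, ldc), and the class name is invented.

```java
// Minimal sketch of the level-3 BLAS routine dgemm:
//   C := alpha * A * B + beta * C
// where A is m x k, B is k x n, C is m x n, all stored column-major in
// flat arrays as BLAS expects. Simplified for illustration: the real
// routine also carries transpose flags and leading dimensions.
final class SimpleBlas {
    static void dgemm(int m, int n, int k, double alpha, double[] a,
                      double[] b, double beta, double[] c) {
        for (int j = 0; j < n; j++) {
            for (int i = 0; i < m; i++) {
                double s = 0.0;
                for (int l = 0; l < k; l++) {
                    s += a[i + l * m] * b[l + j * k]; // A(i,l) * B(l,j)
                }
                c[i + j * m] = alpha * s + beta * c[i + j * m];
            }
        }
    }
}
```

A standardised signature like this is what lets implementations be swapped underneath, from pure Java right down to machine-optimised native code.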
Luc Maisonobe wrote
I have an additional reason for avoiding native libraries. Pure Java can
be processed by external tools for either inspection (think findbugs,
cobertura, traceability, auditing) or modification (think Nabla!). The
Nabla case is especially important to me, but I am aware this is a
corner-case.


Just to let you know, I've contacted the author of this blog post... who has recently written a library called jblas. I've asked him if he wants to be involved with the initiative here, to consolidate efforts for Java Linear Algebra packages.
Incidentally... this blog post references a very pervasive, yet abandoned, project named Colt. Colt was a brilliant library in its day (now numerically challenged), although riddled with license issues (depending on non-commercial and ill-defined not-for-military-use middleware). Colt is a reminder of what can happen when a great library is written but not maintained. There might be lessons to learn from their API... I know some projects that use it.
It might be worthwhile contacting other Java Linear Algebra package authors, such as JAMA. JAMA is a very small library in comparison (no additional functionality over MTJ or commons-math)... but they might have a different take on APIs than we would have.


Sam Halliday wrote:
> I've somehow missed much of this discussion, which has got a little confused.
> I'll repeat some key facts here:
>
> - MTJ depends on netlib-java
> - I'm the maintainer of netlib-java
> - netlib-java depends on PURE JAVA code, generated by F2J from netlib.org
> BLAS/LAPACK (and ARPACK). Keith Seymour (author of f2j) deserves all the
> praise for that magnificent task! The necessary jar is distributed with
> netlib-java.
> - BLAS/LAPACK are industry standard APIs.
> - netlib-java is technically a "translation" of netlib.org's
> BLAS/LAPACK/ARPACK API, and is therefore BSD licensed.
> - netlib-java can be *optionally* configured at runtime to use a native
> library instead of the Java implementation.
> - the Java implementation is pretty damn fast and will be more than adequate
> for most users. However, it will *never* be as fast as native code running
> on specialist hardware (no matter how much the JVM improves).
>
> Being the maintainer of netlib-java, I'd be more than happy to relicense
> all the bits that aren't technically "translations" of netlib.org, for
> inclusion in commons-math (in fact, it makes sense to do so). But you'd
> still need to depend on the f2j-translated implementation. They are BSD
> licensed.
This is becoming more and more interesting. However, do you think it
would be possible to "include" the source (either manually written or
automatically translated) into [math]? This would allow a
self-contained package.
We already provide some code which technically comes from translated
netlib routines, for example part of the Levenberg-Marquardt or almost
everything in the singular value decomposition. The Netlib license
allows that and we have set up the appropriate notices (see the javadoc
and the NOTICE.txt file).
>
> Hell, it makes a *lot* of sense for commons-math to provide the BLAS/LAPACK
> API... they are industry standards after all, and all reference
> implementations for linear algebra algorithms make use of them.
I strongly approve of that for BLAS. I dream of the BLAS API being
mandatory in JVM implementations, but this will probably never happen.
Considering LAPACK, I am less convinced, because the API is strongly
Fortran-oriented, not using some of the object-oriented features that
are well suited for mathematical concepts. The algorithms and their
implementations are very good, and we already use them inside, but with
a different API.
Luc
>
>
> Luc Maisonobe wrote:
>> I have an additional reason for avoiding native libraries. Pure Java can
>> be processed by external tools for either inspection (think findbugs,
>> cobertura, traceability, auditing) or modification (think Nabla!). The
>> Nabla case is especially important to me, but I am aware this is a
>> corner-case.
>>
>

To unsubscribe, email: [hidden email]
For additional commands, email: [hidden email]


Luc Maisonobe wrote:
> Phil Steitz wrote:
>
>> Ted Dunning wrote:
>>
>>> Phil, I think we have much of the same desires and motivations, but we
>>> seem
>>> to come to somewhat, but not entirely different conclusions.
>>>
>>> Assuming that (1) can be dealt with and assuming that (3) is already
>>> dealt
>>> with, do you still mind the inclusion of *optional*, automatically
>>> generated
>>> native code?
>>>
>>>
>> Part of 3) is having full code available with the package for
>> inspection. That is part of the reason that we have avoided external
>> dependencies. I would be open to making our fully self-contained, fully
>> documented, fully open source library extensible to use other libraries,
>> including native libraries, but I would not want to distribute anything
>> associated with external libraries. The reason for this is the
>> commitment made early on that all numerics and algorithms would be
>> immediately visible to the user, with no chasing down external, possibly
>> incomplete or ambiguous docs to figure out what our code is doing.
>>
>
> I have an additional reason for avoiding native libraries. Pure Java can
> be processed by external tools for either inspection (think findbugs,
> cobertura, traceability, auditing) or modification (think Nabla!). The
> Nabla case is especially important to me, but I am aware this is a
> corner-case.
>
>
>>> This page has some useful speed comparisons. For matrix-matrix
>>> multiply up
>>> to size 50, java is competitive. If you get up to roughly n = 500 or
>>> 1000,
>>> then heavily optimized native code can be up to 3x faster. Note the
>>> line in
>>> the third graph for colt (a reasonably well written pure java
>>> implementation) and MTJ (which is running in pure java mode here). In my
>>> case, I will generally opt for portability, but I like to have a portable
>>> option for speed. It is also important to remember that numerical codes
>>> more often need blinding speed than most other applications.
>>>
>>>
>> As Luc said, commons-math aims to be a general-purpose applied math
>> package implementing good, well-documented, unencumbered numerical
>> algorithms. I think this can be done in Java and we are doing it. We
>> are never going to compete with optimized native code in speed, but
>> strong numerics, JRE improvements and Moore's law are rapidly shrinking
>> the class of real-world applications where the 3x difference above is
>> material.
>>
>
> Perhaps we should have some benchmarks including our new linear package.
> Something more serious than my little experiment with QR decomposition.
> Unfortunately, I clearly have no time for it now. My current priority is
> to publish 2.0 as soon as possible and I am already late on my own schedule.
>
+1 for getting 2.0 released ASAP. This is long overdue and we need to
stay focussed on getting it out.
Phil
> Luc
>
>
>> Phil
>>
>>
>>
>>> http://blog.mikiobraun.de/2009/04/some-benchmark-numbers-for-jblas.html
>>>
>>> Is that optional dependency really all that bad?
>>>
>>> On Fri, May 15, 2009 at 6:23 PM, Phil Steitz < [hidden email]>
>>> wrote:
>>>
>>>
>>>
>>>> [math] also has currently zero dependencies. We had two dependencies on
>>>>
>>>>
>>>>> other commons components up to version 1.2 and removed them when we
>>>>> started work on version 2.0. Adding new dependencies, and especially
>>>>> dependencies that involve native libraries is a difficult decision that
>>>>> needs lots of discussion. We are currently trying to have 2.0 published
>>>>> very soon now, such a decision would delay the publication several
>>>>> months.
>>>>>
>>>>>
>>>>>
>>>>>
>>>> -1 for adding dependencies, especially on native code. Commons math
>>>> needs
>>>> to remain
>>>>
>>>> 1) ASL licensed
>>>> 2) selfcontained
>>>> 3) fully documented, fully open source
>>>>
>>>>
>>>
>>>
>>>
>>>
>> 
>>
>>
>>
>
>
> 
>
>



Luc, if the Apache team are happy to include source generated by f2j (which is therefore BSD licensed) then there is no reason at all to have a dependency!
The generator code from netlib-java need not be distributed as part of the final commons-math binary; it is only needed to generate the .c files which allow for a native library at runtime. I would foresee the .c files being distributed as part of the commons-math binary download, with instructions on how to build the optional native library. The entire mechanism for doing this is entirely up for debate and review. The important thing is that there be a standardised BLAS/LAPACK API available.



I think netlib-java might actually be using the CLAPACK version of LAPACK... the biggest problem with calling C/Fortran from Java is that the array indexing differs from double[][]. CLAPACK addresses this.
LAPACK is still heavily used in reference implementations of standard algorithms, although admittedly not as *core* as BLAS. The ARPACK API is also worth considering for inclusion (it's part of netlib-java and f2j's translations).
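The indexing mismatch can be made concrete: Java's double[][] is a row-oriented array of arrays, whereas Fortran BLAS/LAPACK routines expect one flat column-major array, so a Java front-end has to convert before and after calling the translated routines. A sketch, with invented helper names:

```java
// Sketch of the Java/Fortran indexing mismatch: Java's double[][] is an
// array of row arrays, while Fortran routines expect a single flat
// column-major array where element (i, j) of an m x n matrix lands at
// index i + j * m. Helper names are illustrative, not netlib-java's API.
final class ColumnMajor {
    static double[] flatten(double[][] a) {
        int m = a.length, n = a[0].length;
        double[] out = new double[m * n];
        for (int j = 0; j < n; j++)
            for (int i = 0; i < m; i++)
                out[i + j * m] = a[i][j];
        return out;
    }

    static double[][] unflatten(double[] flat, int m, int n) {
        double[][] out = new double[m][n];
        for (int j = 0; j < n; j++)
            for (int i = 0; i < m; i++)
                out[i][j] = flat[i + j * m];
        return out;
    }
}
```

Copying back and forth like this is exactly the overhead a flat-storage matrix class avoids, which is one more argument for BLAS-style storage in the core API.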

