Dear all,
I am a maintainer of matrix-toolkits-java (http://code.google.com/p/matrix-toolkits-java/), which is a comprehensive collection of matrix data structures, linear solvers, least squares methods, and eigenvalue and singular value decompositions.

This note is in regard to the commons-math library. It is clear that our projects dovetail, especially when I look at "linear" in version 2.0 of the API. It would be good if we could either complement or consolidate efforts, rather than reproduce them. It would be excellent if all the functionality of matrix-toolkits-java were available as part of commons-math. There is already too much diversity and unmaintained maths code out there for Java!

As a start, I'd like to discourage the use of a solid implementation for SparseReal{Vector, Matrix}... please prefer an interface approach, allowing implementations based on the Templates project: http://www.netlib.org/templates

The reason is that the storage implementation should be related to the type of data being stored. For example, there are many well-known kinds of sparse matrix that are well suited to particular kinds of calculations... consider multiplying sparse matrices that you know to be diagonal!

In general, the netlib.org folk (BLAS/LAPACK) have spent a *lot* of time thinking about linear algebra and have set up unrivalled standard APIs which have been implemented right down to the architecture level. It would be a major mistake if commons-math didn't build on their good work. I believe commons-math should move to a netlib-java backend (allowing the use of machine-optimised BLAS/LAPACK): http://code.google.com/p/netlib-java/

The largest problems facing MTJ are support for sparse BLAS/LAPACK and scalability to parallel architectures which use parallel BLAS/LAPACK. The former should be possible with some work within the current API, but I fear major API changes would be needed for the latter. I do not want the commons-math API to walk into this trap without having first considered future architectures! MTJ has a distributed package, but I am not sure if this is something that is completely future-proof either. What say ye'?

Sam
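Sam's argument for an interface approach, where the storage structure drives the algorithm, can be sketched in plain Java. All names below are purely illustrative (they are not MTJ or commons-math types): the point is that a diagonal implementation can replace the generic O(n^3) multiply with an O(n^2) row scaling, invisibly to callers.

```java
// Illustrative sketch only: hypothetical names, not the MTJ or commons-math API.
public class SparseDesignSketch {

    public interface RealMatrix {
        int rows();
        int cols();
        double get(int i, int j);
        RealMatrix multiply(RealMatrix b);
    }

    /** Generic dense implementation with the usual triple loop. */
    public static final class DenseMatrix implements RealMatrix {
        private final double[][] data;
        public DenseMatrix(double[][] data) { this.data = data; }
        public int rows() { return data.length; }
        public int cols() { return data[0].length; }
        public double get(int i, int j) { return data[i][j]; }
        public RealMatrix multiply(RealMatrix b) {
            double[][] out = new double[rows()][b.cols()];
            for (int i = 0; i < rows(); i++)
                for (int k = 0; k < cols(); k++)
                    for (int j = 0; j < b.cols(); j++)
                        out[i][j] += data[i][k] * b.get(k, j);
            return new DenseMatrix(out);
        }
    }

    /** Structure-aware implementation: diagonal storage is O(n) and
     *  multiply degenerates to scaling each row of b by diag[i]. */
    public static final class DiagonalMatrix implements RealMatrix {
        private final double[] diag;
        public DiagonalMatrix(double[] diag) { this.diag = diag.clone(); }
        public int rows() { return diag.length; }
        public int cols() { return diag.length; }
        public double get(int i, int j) { return i == j ? diag[i] : 0.0; }
        public RealMatrix multiply(RealMatrix b) {
            double[][] out = new double[rows()][b.cols()];
            for (int i = 0; i < rows(); i++)
                for (int j = 0; j < b.cols(); j++)
                    out[i][j] = diag[i] * b.get(i, j);
            return new DenseMatrix(out);
        }
    }
}
```

Callers only ever see the RealMatrix interface, so swapping in a structured implementation changes the cost of the operation without changing any call sites.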
On Thu, May 14, 2009 at 3:18 AM, Sam Halliday <[hidden email]> wrote:
> I am a maintainer of the matrix-toolkits-java

Which is an impressive piece of work, especially the transparent but non-binding interface to the Atlas and Blas native packages. My compliments to Bjørn-Ove and all who have followed up on his original work.

> This note is in regard to the commons-math library. It is clear that our
> projects dovetail, especially when I look at "linear" in version 2.0 of the
> API. It would be good if we could either complement or consolidate efforts,
> rather than reproduce.

That sounds good to me.

> As a start, I'd like to discourage the use of a solid implementation for
> SparseReal{Vector, Matrix}... please prefer an interface approach, allowing
> implementations based on the Templates project:

Can you say more about what aspects of the Templates project you feel are important? You mention one case of storage layout.

> I believe commons-math should move to a netlib-java backend (allowing the
> use of machine-optimised BLAS/LAPACK).

This is an interesting suggestion. Obviously adopting MTJ wholesale would accomplish that.

Can you say something about the licensing issues if we were to explore, for discussion's sake, MTJ being folded into commons-math? MTJ is LGPL while commons has to stay Apache licensed. This licensing issue has been the biggest sticking point in the past.
Ted Dunning wrote:
> On Thu, May 14, 2009 at 3:18 AM, Sam Halliday <[hidden email]> wrote:
>> I am a maintainer of the matrix-toolkits-java
>
> Which is an impressive piece of work, especially the transparent but
> non-binding interface to the Atlas and Blas native packages. My compliments
> to Bjørn-Ove and all who have followed up on his original work.
>
>> This note is in regard to the commons-math library. It is clear that our
>> projects dovetail, especially when I look at "linear" in version 2.0 of the
>> API. It would be good if we could either complement or consolidate efforts,
>> rather than reproduce.
>
> That sounds good to me.

You are right.

>> As a start, I'd like to discourage the use of a solid implementation for
>> SparseReal{Vector, Matrix}... please prefer an interface approach, allowing
>> implementations based on the Templates project:

This is exactly the purpose of the RealMatrix/RealVector and FieldMatrix/FieldVector interfaces on one side, and of the CholeskyDecomposition, EigenDecomposition, LUDecomposition, QRDecomposition, SingularValueDecomposition and DecompositionSolver interfaces on the other side.

RealMatrix (resp. FieldMatrix) is the top-level interface that does not mandate any specific storage. It has several implementations (RealMatrixImpl with a simple double[][] array, DenseRealMatrix with a block implementation, SparseRealMatrix). We also have in mind (not implemented yet) things like DiagonalRealMatrix, BandRealMatrix, or Lower/UpperTriangularRealMatrix. All implementations can be mixed together, and specific cases are automatically detected and handled to avoid the general embedded loops and to use smart algorithms when possible. There was also an attempt using recursive layouts (with a Gray-Morton space-filling curve).

Maybe SparseRealMatrix was a bad name and should have been SimpleSparseRealMatrix, to avoid confusion with other sparse storage and dedicated algorithms.
> Can you say more about what aspects of the Templates project you feel are
> important? You mention one case of storage layout.
>
>> I believe commons-math should move to a netlib-java backend (allowing the
>> use of machine-optimised BLAS/LAPACK).
>
> This is an interesting suggestion. Obviously adopting MTJ wholesale would
> accomplish that.
>
> Can you say something about the licensing issues if we were to explore, for
> discussion's sake, MTJ being folded into commons-math? MTJ is LGPL while
> commons has to stay Apache licensed. This licensing issue has been the
> biggest sticking point in the past.

This really is an issue. Apache projects cannot use LGPL (or GPL) code. See http://www.apache.org/legal/resolved.html for the policy.

[math] also currently has zero dependencies. We had two dependencies on other commons components up to version 1.2 and removed them when we started work on version 2.0. Adding new dependencies, especially dependencies that involve native libraries, is a difficult decision that needs lots of discussion. We are currently trying to have 2.0 published very soon now; such a decision would delay the publication by several months.

Some benchmarks I did a few weeks ago showed the new [math] linear package implementation was quite fast and compared very well with native Fortran libraries for QR decomposition with similar non-blocked algorithms. In fact, it was 7 times faster than unoptimized Numerical Recipes, about 2 or 3 times faster than optimized Numerical Recipes, and very slightly (a few percent) faster than optimized LAPACK with Atlas as the BLAS implementation. Faster QR decomposition required changing the algorithm, so the blocked LAPACK implementation was the only native implementation faster than [math]. Of course, I now want to implement a blocked QR decomposition in [math] too...

I am aware that we still lack lots of very efficient linear algebra algorithms.
Joining efforts with you would be a real gain if we can solve the licensing issues and, if possible, avoid new dependencies. [math] has already adopted a lot of external code, even complete libraries. I came in by donating the whole Mantissa library, merging it in, and now contributing to the maintenance of the component with the other developers.

Luc

To unsubscribe, email: [hidden email]
For additional commands, email: [hidden email]
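For readers unfamiliar with the algorithm being benchmarked above: a minimal, non-blocked Householder QR in plain Java might look like the sketch below. This is a textbook illustration only, not the actual [math] or LAPACK code.

```java
// Textbook, non-blocked Householder QR: A = Q R, with Q stored as Q^T.
// Illustration only; not the commons-math or LAPACK implementation.
public class HouseholderQR {

    public final double[][] qt; // Q^T (m x m), accumulated product of reflectors
    public final double[][] r;  // R (m x n), upper triangular on exit

    public HouseholderQR(double[][] a) {
        int m = a.length, n = a[0].length;
        r = new double[m][];
        for (int i = 0; i < m; i++) r[i] = a[i].clone();
        qt = new double[m][m];
        for (int i = 0; i < m; i++) qt[i][i] = 1.0;

        for (int k = 0; k < Math.min(m - 1, n); k++) {
            // Build the Householder vector v that zeroes column k below the diagonal.
            double norm = 0.0;
            for (int i = k; i < m; i++) norm += r[i][k] * r[i][k];
            norm = Math.sqrt(norm);
            if (norm == 0.0) continue;
            double alpha = r[k][k] > 0.0 ? -norm : norm; // sign chosen to avoid cancellation
            double[] v = new double[m];
            for (int i = k; i < m; i++) v[i] = r[i][k];
            v[k] -= alpha;
            double vv = 0.0;
            for (int i = k; i < m; i++) vv += v[i] * v[i];
            if (vv == 0.0) continue;
            applyReflector(r, v, vv, k, n);   // R   <- H_k R
            applyReflector(qt, v, vv, k, m);  // Q^T <- H_k Q^T
        }
    }

    /** Applies H = I - 2 v v^T / (v^T v) from the left to the given matrix. */
    private static void applyReflector(double[][] mat, double[] v, double vv, int k, int cols) {
        int m = mat.length;
        for (int j = 0; j < cols; j++) {
            double dot = 0.0;
            for (int i = k; i < m; i++) dot += v[i] * mat[i][j];
            double f = 2.0 * dot / vv;
            for (int i = k; i < m; i++) mat[i][j] -= f * v[i];
        }
    }
}
```

A blocked variant, which is what made the native LAPACK run faster in Luc's benchmark, aggregates several reflectors and applies them with matrix-matrix products to get better cache behaviour.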
Dang.
That is fast.

What size matrices was this for?

On Thu, May 14, 2009 at 12:09 PM, Luc Maisonobe <[hidden email]> wrote:

> Some benchmarks I did a few weeks ago showed the new [math] linear
> package implementation was quite fast and compared very well with native
> Fortran libraries for QR decomposition with similar non-blocked
> algorithms. In fact, it was 7 times faster than unoptimized Numerical
> Recipes, about 2 or 3 times faster than optimized Numerical Recipes, and
> very slightly (a few percent) faster than optimized LAPACK with Atlas
> as the BLAS implementation. Faster QR decomposition required changing
> the algorithm so the blocked LAPACK implementation was the only native
> implementation faster than [math].
Ted Dunning wrote:
> Dang.
>
> That is fast.
>
> What size matrices was this for?
>
> On Thu, May 14, 2009 at 12:09 PM, Luc Maisonobe <[hidden email]> wrote:
>
>> Some benchmarks I did a few weeks ago showed the new [math] linear
>> package implementation was quite fast and compared very well with native
>> Fortran libraries for QR decomposition with similar non-blocked
>> algorithms. [...]

Medium size, up to 600 if I remember correctly (I don't have the curve available here). The O(n^3) pattern was clearly visible, so the result can be extrapolated for medium sizes, I think. I did not check larger matrices (3000 or more); results may be different. Beware: that was one algorithm only (QR decomposition) and one host only (AMD64 Phenom quad core, Linux, Sun Java 6 on one side, GNU Fortran on the other side). GNU Fortran is not a very fast compiler, so the results would obviously be very different with a better compiler.

The purpose of this test was not to say that Java is faster; it was to show that the performance difference between Java and Fortran is smaller than the difference you get by changing algorithms (in this case, blocked vs. non-blocked). I was surprised by the results. For other kinds of algorithms, mixing linear algebra, trigonometric computation, ODE integration, root searching... I often see performance differences of about a factor of 2, which in my opinion is a factor similar to other changes one can make (algorithms, CPU, BLAS, compiler, parallelism, memory...).
Of course, when comparing a highly optimized algorithm, with the perfect cache size and the best compiler, to a standard setting, we can get very different factors. So do not take these numbers for granted. Getting the fastest possible computation is not the purpose of [math]. It should be reasonably efficient with respect to other libraries, including native ones, but it is not dedicated to speed. It should remain a general-purpose library.

Luc
In reply to this post by Ted Dunning
Replies inline:
It's difficult to say which algorithms from Templates are the most important ones, but in most cases reference implementations already exist (usually in Fortran) and should be preferred (e.g. by using f2j with a wrapper layer): the theory can be quite involved. MTJ only touches the surface! However, an important step is recognising that there are not just "dense" and "sparse" matrices... but whole classes of structured sparse matrices.

I personally have no problems with my MTJ contributions being released as Apache. Bjørn-Ove is the person to talk to about the bulk of MTJ. I'll ask him!

MTJ depends on netlib-java, which is technically a translation of the original netlib libraries. They are BSD licensed. I seriously doubt you'll get them to give you the right to redistribute as Apache, so you'll have to decide if that's a blocker.

What would "adopting wholesale" mean? It would be a good opportunity to review/revise parts of the API and find duplication with the rest of the commons-math project.
On Thu, May 14, 2009 at 1:54 PM, Sam Halliday <[hidden email]> wrote:
> I personally have no problems with my MTJ contributions being released as
> Apache. Bjørn-Ove is the person to talk to about the bulk of MTJ. I'll ask
> him!

Great.

> MTJ depends on netlib-java, which is technically a translation of the
> original netlib libraries. They are BSD licensed. I seriously doubt you'll
> get them to give you the right to redistribute as Apache, so you'll have to
> decide if that's a blocker.

If they really are BSD, then there should be no problem. BSD allows redistribution with attribution, preservation of the copyright notice, and no implication of endorsement.

> What would "adopting wholesale" mean? It would be a good opportunity to
> review/revise parts of the API and find duplication with the rest of the
> commons-math project.

There is a big issue with dependencies, but a much smaller issue with major source code contributions. Essentially, what I mean by "adopting wholesale" would be for commons-math to ingest MTJ. In the best world, the contributor communities would merge as well. There would still be plenty of issues, such as the conditional dependency on native libraries. I am not sure how that should play out.

--
Ted Dunning, CTO
DeepDyve
111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)
In reply to this post by Luc Maisonobe
Luc Maisonobe wrote:
> Ted Dunning wrote:
>> [...]
>> Can you say something about the licensing issues if we were to explore, for
>> discussion's sake, MTJ being folded into commons-math? MTJ is LGPL while
>> commons has to stay Apache licensed. This licensing issue has been the
>> biggest sticking point in the past.
>
> This really is an issue. Apache projects cannot use LGPL (or GPL) code.
> See http://www.apache.org/legal/resolved.html for the policy.
>
> [math] also currently has zero dependencies. We had two dependencies on
> other commons components up to version 1.2 and removed them when we
> started work on version 2.0. Adding new dependencies, especially
> dependencies that involve native libraries, is a difficult decision that
> needs lots of discussion. We are currently trying to have 2.0 published
> very soon now; such a decision would delay the publication by several months.

-1 for adding dependencies, especially on native code. Commons math needs to remain

1) ASL licensed
2) self-contained
3) fully documented, fully open source

Phil
Phil, I think we have much of the same desires and motivations, but we seem to come to somewhat, but not entirely, different conclusions.

Assuming that (1) can be dealt with and assuming that (3) is already dealt with, do you still mind the inclusion of *optional*, automatically generated native code?

This page has some useful speed comparisons: http://blog.mikiobraun.de/2009/04/some-benchmark-numbers-for-jblas.html

For matrix-matrix multiply up to size 50, Java is competitive. If you get up to roughly n = 500 or 1000, then heavily optimized native code can be up to 3x faster. Note the line in the third graph for Colt (a reasonably well written pure Java implementation) and MTJ (which is running in pure Java mode here). In my case, I will generally opt for portability, but I like to have a portable option for speed. It is also important to remember that numerical codes more often need blinding speed than most other applications.

Is that optional dependency really all that bad?

On Fri, May 15, 2009 at 6:23 PM, Phil Steitz <[hidden email]> wrote:

>> [math] also currently has zero dependencies. We had two dependencies on
>> other commons components up to version 1.2 and removed them when we
>> started work on version 2.0. Adding new dependencies, especially
>> dependencies that involve native libraries, is a difficult decision that
>> needs lots of discussion. We are currently trying to have 2.0 published
>> very soon now; such a decision would delay the publication by several months.
>
> -1 for adding dependencies, especially on native code. Commons math needs
> to remain
>
> 1) ASL licensed
> 2) self-contained
> 3) fully documented, fully open source

--
Ted Dunning, CTO
DeepDyve
Ted Dunning wrote:
> Phil, I think we have much of the same desires and motivations, but we seem
> to come to somewhat, but not entirely, different conclusions.
>
> Assuming that (1) can be dealt with and assuming that (3) is already dealt
> with, do you still mind the inclusion of *optional*, automatically generated
> native code?

Part of 3) is having full code available with the package for inspection. That is part of the reason that we have avoided external dependencies. I would be open to making our fully self-contained, fully documented, fully open source library extensible to use other libraries, including native libraries, but I would not want to distribute anything associated with external libraries. The reason for this is the commitment made early on that all numerics and algorithms would be immediately visible to the user: no chasing down external, possibly incomplete or ambiguous docs to figure out what our code is doing.

> This page has some useful speed comparisons. For matrix-matrix multiply up
> to size 50, Java is competitive. If you get up to roughly n = 500 or 1000,
> then heavily optimized native code can be up to 3x faster. Note the line in
> the third graph for Colt (a reasonably well written pure Java
> implementation) and MTJ (which is running in pure Java mode here). In my
> case, I will generally opt for portability, but I like to have a portable
> option for speed. It is also important to remember that numerical codes
> more often need blinding speed than most other applications.

As Luc said, commons-math aims to be a general-purpose applied math package implementing good, well-documented, unencumbered numerical algorithms. I think this can be done in Java and we are doing it. We are never going to compete with optimized native code in speed, but strong numerics, JRE improvements and Moore's law are rapidly shrinking the class of real-world applications where the 3x difference above is material.
Phil

> http://blog.mikiobraun.de/2009/04/some-benchmark-numbers-for-jblas.html
>
> Is that optional dependency really all that bad?
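One way to square Phil's requirements with Ted's wish for optional native code is a pluggable backend behind a pure-Java default, so nothing external need ever be distributed with the core library. The sketch below is purely illustrative: the interface, class names, and system property are hypothetical, not commons-math API.

```java
// Hypothetical sketch of an optional-backend pattern: the core library ships
// only the pure-Java implementation; users may point a system property at an
// alternative (e.g. native-backed) implementation on their own classpath.
public class BackendSketch {

    public interface LinearOps {
        double dot(double[] x, double[] y);
    }

    /** Self-contained default: no external dependencies at all. */
    public static final class PureJavaOps implements LinearOps {
        public double dot(double[] x, double[] y) {
            double s = 0.0;
            for (int i = 0; i < x.length; i++) s += x[i] * y[i];
            return s;
        }
    }

    /** Returns the configured backend, falling back to pure Java. */
    public static LinearOps getInstance() {
        String impl = System.getProperty("linear.ops.impl"); // hypothetical property name
        if (impl != null) {
            try {
                return (LinearOps) Class.forName(impl).getDeclaredConstructor().newInstance();
            } catch (Exception e) {
                // Backend unavailable or incompatible: fall back to the default.
            }
        }
        return new PureJavaOps();
    }
}
```

With this shape, the distributed jar stays fully inspectable, and a native-backed implementation is strictly opt-in and strictly external.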
Phil Steitz wrote:
> Ted Dunning wrote:
>> Phil, I think we have much of the same desires and motivations, but we
>> seem to come to somewhat, but not entirely, different conclusions.
>>
>> Assuming that (1) can be dealt with and assuming that (3) is already
>> dealt with, do you still mind the inclusion of *optional*, automatically
>> generated native code?
>
> Part of 3) is having full code available with the package for inspection.
> That is part of the reason that we have avoided external dependencies. I
> would be open to making our fully self-contained, fully documented, fully
> open source library extensible to use other libraries, including native
> libraries, but I would not want to distribute anything associated with
> external libraries. The reason for this is the commitment made early on
> that all numerics and algorithms would be immediately visible to the user:
> no chasing down external, possibly incomplete or ambiguous docs to figure
> out what our code is doing.

I have an additional reason for avoiding native libraries. Pure Java can be processed by external tools for either inspection (think FindBugs, Cobertura, traceability, auditing) or modification (think Nabla!). The Nabla case is especially important to me, but I am aware this is a corner-case.

>> This page has some useful speed comparisons. For matrix-matrix multiply
>> up to size 50, Java is competitive. If you get up to roughly n = 500 or
>> 1000, then heavily optimized native code can be up to 3x faster. [...]
> As Luc said, commons-math aims to be a general-purpose applied math
> package implementing good, well-documented, unencumbered numerical
> algorithms. I think this can be done in Java and we are doing it. We
> are never going to compete with optimized native code in speed, but
> strong numerics, JRE improvements and Moore's law are rapidly shrinking
> the class of real-world applications where the 3x difference above is
> material.

Perhaps we should have some benchmarks including our new linear package, something more serious than my little experiment with QR decomposition. Unfortunately, I clearly have no time for it now. My current priority is to publish 2.0 as soon as possible, and I am already late on my own schedule.

Luc
In reply to this post by Ted Dunning
I've asked Bjorn about an Apache license for MTJ and his reply was
"Yes, I don't see why not. The more users/developers, the better."

In reply to this post by Luc Maisonobe
I give a +1 for renaming SparseReal{Matrix, Vector}! These names should be reserved for interfaces (which might be method-less) indicating that the implementation storage needs to be sparse.

Solved! See other message. Both myself and (more importantly, because he wrote MTJ) Bjørn are willing to use the Apache license.

MTJ depends only on netlib-java, *which does not depend on any native libs*. The option is there to add native optimised libs if the end user wants to.

I'm going to call "foul" here :) The Java implementation of netlib-java is just as fast as machine-optimised BLAS/LAPACK... but only for matrices smaller than roughly 1000 x 1000 elements AND ONLY FOR NORMAL DESKTOP MACHINES! The important distinction here is that hardware exists with crazy optimisations for the BLAS/LAPACK API, and having the option to use that architecture from within Java is a great bonus. Consider, for example, a dedicated GPU (or FPGA) card which comes with a BLAS/LAPACK binary. Additionally, the BLAS/LAPACK API is universally accepted. It would be a mistake to attempt to reproduce all the brain power and agreement that has worked toward it.

I am very keen to consolidate efforts! I think the next step is perhaps for you to have a look through the MTJ API and create a wishlist of everything you think would make sense to appear in commons-math. Even if adopted "wholesale", I would still strongly recommend a review of the API. E.g. some interfaces extend Serializable (a mistake); I'm not entirely sure how relevant the distributed package is nowadays; the Matrix Market IO is difficult to understand/use; there should perhaps be a "factory pattern" for instantiating matrices/vectors.

In the meantime, I recommend holding off a 2.0 API release with any new linear classes. That way we can stabilise the "new" merged API... releasing that as part of 2.1.
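The idea of reserving the SparseReal names for possibly method-less interfaces can be illustrated like this. All names are hypothetical, not the actual commons-math or MTJ types: the marker promises a storage property, and any number of concrete layouts can sit behind it.

```java
// Sketch: the interface name promises sparse storage; implementations vary freely.
public class MarkerSketch {

    public interface RealMatrix {
        int rows();
        int cols();
        double get(int i, int j);
    }

    /** Method-less marker: declares that storage is sparse, mandates nothing else. */
    public interface SparseRealMatrix extends RealMatrix {}

    /** One concrete choice of sparse storage behind the marker. */
    public static final class MapBackedSparseMatrix implements SparseRealMatrix {
        private final java.util.Map<Long, Double> entries = new java.util.HashMap<>();
        private final int rows, cols;
        public MapBackedSparseMatrix(int rows, int cols) { this.rows = rows; this.cols = cols; }
        public void set(int i, int j, double v) { entries.put((long) i * cols + j, v); }
        public double get(int i, int j) { return entries.getOrDefault((long) i * cols + j, 0.0); }
        public int rows() { return rows; }
        public int cols() { return cols; }
    }

    /** Generic algorithms can pick a sparse-aware code path from the marker alone. */
    public static boolean hasSparseStorage(RealMatrix m) {
        return m instanceof SparseRealMatrix;
    }
}
```

The point of keeping the marker method-less is that it constrains nothing about the layout (compressed rows, coordinate lists, hash maps...), so renaming the current concrete class frees the good name for the contract.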
In reply to this post by Ted Dunning
Ted, thanks for pointing this out... I'd never seen it before. Glad MTJ did so well and I note that this isn't even with the optional native BLAS/LAPACK :)

In reply to this post by Luc Maisonobe
I've somehow missed much of this discussion, which has got a little confused. I'll repeat some key facts here:
- MTJ depends on netlib-java
- I'm the maintainer of netlib-java
- netlib-java depends on PURE JAVA code, generated by F2J from netlib.org BLAS/LAPACK (and ARPACK). Keith Seymour (author of f2j) deserves all the praise for that magnificent task! The necessary jar is distributed with netlib-java.
- BLAS/LAPACK are industry standard APIs.
- netlib-java is technically a "translation" of netlib.org's BLAS/LAPACK/ARPACK API, so is therefore BSD licensed
- netlib-java can be *optionally* configured at runtime to use a native library instead of the Java implementation.
- the Java implementation is pretty damn fast and will be more than adequate for most users. However, it will *never* be as fast as native code running on specialist hardware (no matter how much the JVM improves).

Being the maintainer of netlib-java, I'd be more than happy to relicense all the bits that aren't technically "translations" of netlib.org for inclusion in commons-math (in fact, it makes sense to do so). But you'd still need to depend on the f2j-translated implementation. They are BSD licensed.

Hell, it makes a *lot* of sense for commons-math to provide the BLAS/LAPACK API... they are industry standards after all, and all reference implementations of linear algebra algorithms make use of them.
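To make the "industry standard API" point concrete, here is a pure-Java rendering of the no-transpose case of the BLAS dgemm contract: C := alpha*A*B + beta*C, with all matrices stored column-major as flat arrays with leading dimensions. This is an illustration of the calling convention only, not the netlib-java code.

```java
// Pure-Java sketch of the BLAS dgemm contract (no-transpose case only):
// C := alpha*A*B + beta*C, matrices column-major with leading dimensions.
public class DgemmSketch {

    public static void dgemm(int m, int n, int k, double alpha,
                             double[] a, int lda, double[] b, int ldb,
                             double beta, double[] c, int ldc) {
        for (int j = 0; j < n; j++) {
            for (int i = 0; i < m; i++) {
                double s = 0.0;
                for (int p = 0; p < k; p++)
                    s += a[i + p * lda] * b[p + j * ldb]; // A(i,p) * B(p,j)
                c[i + j * ldc] = alpha * s + beta * c[i + j * ldc];
            }
        }
    }
}
```

The flat arrays and leading dimensions are exactly what lets a runtime swap a machine-optimised implementation in behind the same signature, which is the configurability point above.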

In reply to this post by Samuel Halliday
Just to let you know, I've contacted the author of this blog post... who has recently written a library called jblas. I've asked him if he wants to be involved with the initiative here, to consolidate efforts for Java Linear Algebra packages.
Incidentally... this blog post references a very pervasive, yet abandoned, project named Colt. Colt was a brilliant library in its day (now numerically challenged), although riddled with license issues (depending on non-commercial and ill-defined not-for-military-use middleware). Colt is a reminder of what can happen when a great library is written but not maintained. There might be lessons to learn from their API... I know some projects that use it.

It might be worthwhile contacting other Java linear algebra package authors, such as JAMA. JAMA is a very small library in comparison (no additional functionality over MTJ or commons-math)... but they might have a different take on APIs than we would have.

In reply to this post by Samuel Halliday
Sam Halliday wrote:
> I've somehow missed much of this discussion, which has got a little
> confused. I'll repeat some key facts here:
>
> - MTJ depends on netlib-java
> - I'm the maintainer of netlib-java
> - netlib-java depends on PURE JAVA code, generated by F2J from netlib.org
>   BLAS/LAPACK (and ARPACK). Keith Seymour (author of f2j) deserves all the
>   praise for that magnificent task! The necessary jar is distributed with
>   netlib-java.
> - BLAS/LAPACK are industry standard APIs.
> - netlib-java is technically a "translation" of netlib.org's
>   BLAS/LAPACK/ARPACK API, so is therefore BSD licensed
> - netlib-java can be *optionally* configured at runtime to use a native
>   library instead of the Java implementation.
> - the Java implementation is pretty damn fast and will be more than
>   adequate for most users. However, it will *never* be as fast as native
>   code running on specialist hardware (no matter how much the JVM improves).
>
> Being the maintainer of netlib-java, I'd be more than happy to relicense
> all the bits that aren't technically "translations" of netlib.org for
> inclusion in commons-math (in fact, it makes sense to do so). But you'd
> still need to depend on the f2j-translated implementation. They are BSD
> licensed.

This is becoming more and more interesting. However, do you think it would be possible to "include" the source (either manually written or automatically translated) into [math]? This would allow a self-contained package. We already provide some code which technically comes from translated netlib routines, for example part of the Levenberg-Marquardt implementation or almost everything in the singular value decomposition. The netlib license allows that, and we have set up the appropriate notices (see the javadoc and the NOTICE.txt file).

> Hell, it makes a *lot* of sense for commons-math to provide the BLAS/LAPACK
> API... they are industry standards after all, and all reference
> implementations of linear algebra algorithms make use of them.

I strongly approve of that for BLAS.
I dream of the BLAS API being mandatory in JVM implementations, but this
will probably never happen.

Considering LAPACK, I am less convinced because the API is strongly
Fortran-oriented, not using some of the object-oriented features that are
well suited to mathematical concepts. The algorithms and their
implementations are very good, and we already use them internally, but
with a different API.

Luc

> Luc Maisonobe wrote:
>> I have an additional reason for avoiding native libraries. Pure Java can
>> be processed by external tools for either inspection (think findbugs,
>> cobertura, traceability, auditing) or modification (think Nabla!). The
>> Nabla case is especially important to me, but I am aware this is a
>> corner-case.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]
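Luc's contrast between the Fortran-oriented LAPACK style and an object-oriented API can be sketched in a few lines. Everything below is illustrative: the routine name `dtrsvLower` and the `LowerTriangular` class are hypothetical stand-ins, not the actual netlib-java or [math] signatures. Both solve L*x = b for a lower-triangular L.

```java
public class ApiStyles {

    // Fortran style: flat column-major storage, explicit dimensions and
    // leading dimension, result overwritten in place into x.
    static void dtrsvLower(int n, double[] a, int lda, double[] x) {
        for (int i = 0; i < n; i++) {
            double sum = x[i];
            for (int j = 0; j < i; j++) {
                sum -= a[i + j * lda] * x[j];  // a(i,j) in column-major order
            }
            x[i] = sum / a[i + i * lda];       // a(i,i)
        }
    }

    // Object-oriented style: the matrix owns its storage and exposes a
    // solve method; callers never see leading dimensions or work arrays.
    static final class LowerTriangular {
        private final double[][] l;
        LowerTriangular(double[][] l) { this.l = l; }
        double[] solve(double[] b) {
            double[] x = b.clone();
            for (int i = 0; i < x.length; i++) {
                for (int j = 0; j < i; j++) x[i] -= l[i][j] * x[j];
                x[i] /= l[i][i];
            }
            return x;
        }
    }

    public static void main(String[] args) {
        // L = [[2,0],[1,4]], b = [2,9] -> x = [1,2], computed both ways
        double[] aColMajor = {2, 1, 0, 4};   // columns (2,1) and (0,4)
        double[] x = {2, 9};
        dtrsvLower(2, aColMajor, 2, x);

        double[] y = new LowerTriangular(new double[][] {{2, 0}, {1, 4}})
                .solve(new double[] {2, 9});

        System.out.println(x[0] + " " + x[1] + " " + y[0] + " " + y[1]);
    }
}
```

The numerics are identical; the difference Luc points at is purely in the shape of the API the caller sees.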
In reply to this post by Luc Maisonobe
Luc Maisonobe wrote:
> Phil Steitz wrote:
>> Ted Dunning wrote:
>>> Phil, I think we have much of the same desires and motivations, but we
>>> seem to come to somewhat, but not entirely, different conclusions.
>>>
>>> Assuming that (1) can be dealt with and assuming that (3) is already
>>> dealt with, do you still mind the inclusion of *optional*, automatically
>>> generated native code?
>>
>> Part of 3) is having full code available with the package for
>> inspection. That is part of the reason that we have avoided external
>> dependencies. I would be open to making our fully self-contained, fully
>> documented, fully open-source library extensible to use other libraries,
>> including native libraries, but I would not want to distribute anything
>> associated with external libraries. The reason for this is the
>> commitment made early on that all numerics and algorithms would be
>> immediately visible to the user -- no chasing down external, possibly
>> incomplete or ambiguous docs to figure out what our code is doing.
>
> I have an additional reason for avoiding native libraries. Pure Java can
> be processed by external tools for either inspection (think findbugs,
> cobertura, traceability, auditing) or modification (think Nabla!). The
> Nabla case is especially important to me, but I am aware this is a
> corner-case.
>
>>> This page has some useful speed comparisons. For matrix-matrix multiply
>>> up to size 50, Java is competitive. If you get up to roughly n = 500 or
>>> 1000, then heavily optimized native code can be up to 3x faster. Note
>>> the line in the third graph for Colt (a reasonably well written pure
>>> Java implementation) and MTJ (which is running in pure Java mode here).
>>> In my case, I will generally opt for portability, but I like to have a
>>> portable option for speed. It is also important to remember that
>>> numerical codes more often need blinding speed than most other
>>> applications.
>> As Luc said, commons-math aims to be a general-purpose applied math
>> package implementing good, well-documented, unencumbered numerical
>> algorithms. I think this can be done in Java and we are doing it. We
>> are never going to compete with optimized native code in speed, but
>> strong numerics, JRE improvements and Moore's law are rapidly shrinking
>> the class of real-world applications where the 3x difference above is
>> material.
>
> Perhaps we should have some benchmarks including our new linear package.
> Something more serious than my little experiment with QR decomposition.
> Unfortunately, I clearly have no time for it now. My current priority is
> to publish 2.0 as soon as possible and I am already late on my own
> schedule.

Stay focussed on getting it out.

Phil

> Luc
>
>>> http://blog.mikiobraun.de/2009/04/somebenchmarknumbersforjblas.html
>>>
>>> Is that optional dependency really all that bad?
>>>
>>> On Fri, May 15, 2009 at 6:23 PM, Phil Steitz <[hidden email]> wrote:
>>>
>>>>> [math] also has currently zero dependencies. We had two dependencies
>>>>> on other commons components up to version 1.2 and removed them when
>>>>> we started work on version 2.0. Adding new dependencies, and
>>>>> especially dependencies that involve native libraries, is a difficult
>>>>> decision that needs lots of discussion. We are currently trying to
>>>>> have 2.0 published very soon now; such a decision would delay the
>>>>> publication by several months.
>>>>
>>>> -1 for adding dependencies, especially on native code. Commons math
>>>> needs to remain:
>>>>
>>>> 1) ASL licensed
>>>> 2) self-contained
>>>> 3) fully documented, fully open source
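A benchmark harness along the lines Luc mentions might look like the minimal sketch below: a naive pure-Java matrix-matrix multiply with crude wall-clock timing. The sizes and methodology are illustrative only; a serious comparison would need warm-up runs, repetition and statistics.

```java
public class MultiplyBench {

    // Naive triple-loop multiply, with the a[i][p] factor hoisted so the
    // inner loop streams along rows of b and c (cache-friendlier in Java).
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length, m = b[0].length, k = b.length;
        double[][] c = new double[n][m];
        for (int i = 0; i < n; i++) {
            for (int p = 0; p < k; p++) {
                double aip = a[i][p];
                for (int j = 0; j < m; j++) {
                    c[i][j] += aip * b[p][j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        // correctness sanity check: [[1,2],[3,4]] * [[5,6],[7,8]]
        double[][] c = multiply(new double[][] {{1, 2}, {3, 4}},
                                new double[][] {{5, 6}, {7, 8}});
        System.out.println(c[0][0] + " " + c[0][1] + " "
                         + c[1][0] + " " + c[1][1]);

        // crude timing at a couple of sizes
        for (int n : new int[] {50, 200}) {
            double[][] a = new double[n][n], b = new double[n][n];
            long t0 = System.nanoTime();
            multiply(a, b);
            System.out.println("n=" + n + ": "
                    + (System.nanoTime() - t0) / 1_000_000 + " ms");
        }
    }
}
```

Comparing this against the [math] linear package and against netlib-java (pure Java and native modes) at several sizes would put numbers behind the "3x at n = 500-1000" figure quoted above.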
In reply to this post by Luc Maisonobe
Luc, if the Apache team are happy to include source generated by f2j (which is therefore BSD licensed), then there is no reason at all to have a dependency!
The generator code from netlib-java need not be distributed as part of the final commons-math binary; it is only needed to generate the .c files which allow for a native library at runtime. I would foresee the .c files being distributed as part of the commons-math binary download, with instructions on how to build the optional native library. The entire mechanism for doing this is entirely up for debate and review. The important thing is that there be a standardised BLAS/LAPACK API available.
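The "standardised BLAS API with an optional native backend" idea can be sketched as follows. The `Blas` interface, the `blas.impl` property and `JavaBlas` are all hypothetical names, not the actual netlib-java mechanism; they only show how a pure-Java default and a runtime-selected replacement could coexist behind one API.

```java
public class BlasBackend {

    // A minimal BLAS-style contract; a real one would cover the full
    // Level 1/2/3 routine set.
    interface Blas {
        /** y := alpha*x + y (the BLAS daxpy operation). */
        void daxpy(int n, double alpha, double[] x, double[] y);
    }

    // Pure-Java default, always available and fully inspectable.
    static final class JavaBlas implements Blas {
        public void daxpy(int n, double alpha, double[] x, double[] y) {
            for (int i = 0; i < n; i++) y[i] += alpha * x[i];
        }
    }

    /** Pick a backend at runtime: a native-backed implementation class
     *  could be named via a system property, with a silent fallback to
     *  the pure-Java implementation when it is absent or fails to load. */
    static Blas getInstance() {
        String impl = System.getProperty("blas.impl");
        if (impl != null) {
            try {
                return (Blas) Class.forName(impl)
                        .getDeclaredConstructor().newInstance();
            } catch (Exception e) {
                // fall through to the pure-Java implementation
            }
        }
        return new JavaBlas();
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3};
        double[] y = {10, 10, 10};
        getInstance().daxpy(3, 2.0, x, y);   // y := 2*x + y
        System.out.println(y[0] + " " + y[1] + " " + y[2]);
    }
}
```

This keeps the self-contained, pure-Java guarantee by default while leaving the door open to a machine-optimised library for users who opt in.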

In reply to this post by Luc Maisonobe
I think netlib-java might actually be using the CLAPACK version of LAPACK... the biggest problem with mixing C/Fortran and Java is that the array indexing is different: Fortran stores matrices in column-major order, while a Java double[][] is effectively row-major. CLAPACK addresses this.
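The indexing mismatch is easy to show concretely: Fortran-style LAPACK expects a column-major flat array, whereas a Java double[][] is a row-major array of rows. These hypothetical helpers (not part of netlib-java) convert between the two layouts; `lda` here is simply the row count m.

```java
public class ColumnMajor {

    /** Flatten a rectangular double[][] into column-major order,
     *  i.e. element (i,j) lands at index i + j*m. */
    static double[] toColumnMajor(double[][] a) {
        int m = a.length, n = a[0].length;
        double[] flat = new double[m * n];
        for (int j = 0; j < n; j++)
            for (int i = 0; i < m; i++)
                flat[i + j * m] = a[i][j];
        return flat;
    }

    /** Rebuild an m-by-n double[][] from a column-major flat array. */
    static double[][] fromColumnMajor(double[] flat, int m, int n) {
        double[][] a = new double[m][n];
        for (int j = 0; j < n; j++)
            for (int i = 0; i < m; i++)
                a[i][j] = flat[i + j * m];
        return a;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2, 3}, {4, 5, 6}};   // 2x3, row-major view
        double[] flat = toColumnMajor(a);         // columns: (1,4),(2,5),(3,6)
        StringBuilder sb = new StringBuilder();
        for (double v : flat) sb.append((int) v).append(' ');
        System.out.println(sb.toString().trim());
        double[][] back = fromColumnMajor(flat, 2, 3);
        System.out.println(back[1][2] == 6.0);
    }
}
```

The copy is cheap relative to an O(n^3) factorisation, but it is exactly the kind of glue that either CLAPACK or a Java-side wrapper has to provide.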
LAPACK is still heavily used in reference implementations of standard algorithms, although admittedly it is not as *core* as BLAS. The ARPACK API is also worth considering for inclusion (it is part of netlib-java and f2j's translations).
