[math] redesigning the optimization/estimation packages

Luc Maisonobe
The optimization/estimation packages extracted from Mantissa raise many
usability issues. Several questions in recent months have asked for
examples of how to use them. The fact is that these packages were
originally written for a task that was probably too specific, and the
public API is not easy to understand.

Jira issue MATH-177 (http://issues.apache.org/jira/browse/MATH-177) is a
reminder of one aspect of the problem, and a recent contribution from
Gilles Sadowski, related to another aspect, was attached to it (Gilles
provided an implementation of the Brent minimization algorithm).

I would like to solve this globally for 2.0.

The first step would be to rearrange slightly the analysis package. We
could have the main interfaces in the analysis package
(UnivariateRealFunction, DifferentiableUnivariateRealFunction ...) and
spread the other interfaces and classes in several subpackages:
 - analysis.solving
 - analysis.integration
 - analysis.interpolation

The second step would be to add an analysis.minimization package. The
contributed Brent minimization would go there, probably with a simple
golden section minimizer too. I think a dedicated interface
UnivariateRealMinimizer would be needed instead of reusing the
UnivariateRealSolver one, since the two interfaces have different goals.
They probably share many methods but not the underlying semantics.
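
As a rough illustration of what a simple golden section minimizer could
look like, here is a minimal sketch. The class name GoldenSectionSketch,
the findMinimum signature and the use of java.util.function.DoubleUnaryOperator
in place of UnivariateRealFunction are assumptions made for the example,
not existing Commons Math API.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch of a golden section minimizer; not Commons Math API. */
final class GoldenSectionSketch {

    /** 1 / phi = (sqrt(5) - 1) / 2, roughly 0.618. */
    private static final double INV_PHI = (Math.sqrt(5.0) - 1.0) / 2.0;

    /**
     * Shrinks the bracket [lo, hi] around a minimum of f, assuming f has a
     * single minimum on that interval. Each iteration costs one evaluation.
     */
    static double findMinimum(DoubleUnaryOperator f, double lo, double hi,
                              double tolerance, int maxIterations) {
        double a = Math.min(lo, hi);
        double b = Math.max(lo, hi);
        double x1 = b - INV_PHI * (b - a);
        double x2 = a + INV_PHI * (b - a);
        double f1 = f.applyAsDouble(x1);
        double f2 = f.applyAsDouble(x2);
        for (int i = 0; i < maxIterations && (b - a) > tolerance; i++) {
            if (f1 < f2) {
                // The minimum lies in [a, x2]; the old x1 becomes the new x2.
                b = x2;
                x2 = x1;
                f2 = f1;
                x1 = b - INV_PHI * (b - a);
                f1 = f.applyAsDouble(x1);
            } else {
                // The minimum lies in [x1, b]; the old x2 becomes the new x1.
                a = x1;
                x1 = x2;
                f1 = f2;
                x2 = a + INV_PHI * (b - a);
                f2 = f.applyAsDouble(x2);
            }
        }
        return 0.5 * (a + b);
    }

    public static void main(String[] args) {
        double xMin = findMinimum(x -> (x - 2.0) * (x - 2.0) + 1.0, 0.0, 5.0, 1e-8, 200);
        System.out.println("minimum located near x = " + xMin);
    }
}
```

Note that the returned value is a location of a minimum, not a point
where the function vanishes, which is the semantic difference with a
solver mentioned above.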

The third step would be to handle multivariate or multivalued functions.
Should separate parallel packages be created or not? The minimization
part for these kinds of functions would come from the low-level classes
of the estimation package and from the optimization package. The former
correspond to algorithms using derivatives and the latter to algorithms
using only function values. The interfaces should be simplified to
remain low-level and deal directly with function values, targets and
residuals.
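
To make "function values, target, residuals" concrete, here is a small
sketch of what such a low-level view could look like. Everything in it
(ResidualsSketch, VectorialFunction, the residuals and cost helpers) is
a hypothetical illustration, not the current or planned API.

```java
/** Hypothetical sketch; the names are illustrative, not Commons Math API. */
final class ResidualsSketch {

    /** A multivalued function of several variables. */
    interface VectorialFunction {
        double[] value(double[] point);
    }

    /** Computes residuals[i] = target[i] - f(point)[i]. */
    static double[] residuals(VectorialFunction f, double[] point, double[] target) {
        double[] values = f.value(point);
        if (values.length != target.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double[] r = new double[target.length];
        for (int i = 0; i < r.length; i++) {
            r[i] = target[i] - values[i];
        }
        return r;
    }

    /** Sum of squared residuals, the scalar a least-squares method reduces. */
    static double cost(double[] residuals) {
        double sum = 0.0;
        for (double r : residuals) {
            sum += r * r;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Model y = a * t + b sampled at t = 0, 1, 2, with point = {a, b}.
        VectorialFunction line = p -> new double[] { p[1], p[0] + p[1], 2.0 * p[0] + p[1] };
        double[] target = { 1.0, 3.0, 5.0 };
        System.out.println("cost at (2, 1) = " + cost(residuals(line, new double[] { 2.0, 1.0 }, target)));
        System.out.println("cost at (1, 1) = " + cost(residuals(line, new double[] { 1.0, 1.0 }, target)));
    }
}
```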

A fourth step would be to build higher-level abstractions on top of the
previous ones, providing the concepts of measurements, parameters and
linear/non-linear models. These are mainly the existing
WeightedMeasurement, EstimatedParameter and EstimationProblem classes
and interfaces.

Does this make sense?
Luc

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [math] redesigning the optimization/estimation packages

Gilles Sadowski
Hi.

> The first step would be to rearrange slightly the analysis package. We
> could have the main interfaces in the analysis package
> (UnivariateRealFunction, DifferentiableUnivariateRealFunction ...) and
> spread the other interfaces and classes in several subpackages:
>  - analysis.solving
>  - analysis.integration
>  - analysis.interpolation
>
> The second step would be to add an analysis.minimization package. The
> contributed Brent minimization would get there, probably with a simple
> golden section minimizer too.

Could we have "root" instead of "solving"?
Also we could have "minimum" instead of "minimization".

I think that "root" is clearer (e.g. for a new user browsing the package
list javadoc).
[And both are a little bit shorter than the alternatives. ;-)]

> I think a dedicated interface
> UnivariateRealMinimizer would be needed instead of reusing the
> UnivariateRealSolver one, since the two interfaces have different goals. They
> probably share many methods but not the underlying semantics.

I'd rather re-use as much code as possible.
Which methods do not have the same semantics?  The "getResult" method, for
example, can return a root location as well as a minimum location.

What about creating new interfaces and classes in the future "minimization"
(or "minimum") package that would extend those in "solving" (or "root")?

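public interface UnivariateRealMinimizer extends UnivariateRealSolver {
  // Function value at the minimum location returned by getResult().
  public double getValueAtMinimum();
}

public class UnivariateRealMinimizerImpl extends UnivariateRealSolverImpl
  implements UnivariateRealMinimizer {
  private double valueAtMinimum;

  // Stores the location through the inherited two-argument setResult and
  // additionally records the function value there. The iteration count
  // is an integer.
  protected void setResult(double x,
                           double fx,
                           int iterationCount) {
    setResult(x, iterationCount);
    valueAtMinimum = fx;
  }

  public double getValueAtMinimum() {
    return valueAtMinimum;
  }
}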

> The third step would be to handle multivariate or multivalued functions.
> Should separate parallel packages be created or not ?

From a practical point of view, I'd think so: I'd guess that it is more
common to use the univariate algorithms, and so, it would be convenient not
to clutter those packages with classes that are useful only in the
multi-dimensional case.
Also, from what you say below, it seems that it will be much more work to
refactor the framework for the multi-dimensional case, whereas the packages
in the one-dimensional case could be "stabilized" sooner.

> The minimization
> part for these kinds of functions would come from the low-level classes
> of the estimation package and from the optimization package. The former
> ones correspond to algorithms using derivatives and the latter
> correspond to algorithms using only function values. The interfaces
> should be simplified to remain low-level and deal directly with function
> values, target, residuals.
>
> A fourth step would be to build higher level abstractions on top of the
> previous ones, providing the concepts of measurements, parameters  and
> linear/non-linear model ... These are mainly the existing
> WeightedMeasurement, EstimatedParameter and EstimationProblem classes
> and interfaces.

We already use several libraries: "jMinuit" in Java and "Opt++" in C++
(using JNI). Surely I'd prefer everything to be in Commons-Math...

Best,
Gilles


Re: [math] redesigning the optimization/estimation packages

Luc Maisonobe
Gilles Sadowski wrote:

> Hi.
>
>> The first step would be to rearrange slightly the analysis package. We
>> could have the main interfaces in the analysis package
>> (UnivariateRealFunction, DifferentiableUnivariateRealFunction ...) and
>> spread the other interfaces and classes in several subpackages:
>>  - analysis.solving
>>  - analysis.integration
>>  - analysis.interpolation
>>
>> The second step would be to add an analysis.minimization package. The
>> contributed Brent minimization would get there, probably with a simple
>> golden section minimizer too.
>
> Could we have "root" instead of "solving"?
> Also we could have "minimum" instead of "minimization".
>
> I think that "root" is clearer (e.g. for a new user browsing the package
> list javadoc).
> [And both are a little bit shorter than the alternatives. ;-)]

I am OK with any suggestion like this. Since I'm not a native English
speaker, I will leave some time for other people to express their views
on this.

>
>> I think a dedicated interface
>> UnivariateRealMinimizer would be needed instead of reusing the
>> UnivariateRealSolver one, since the two interfaces have different goals. They
>> probably share many methods but not the underlying semantics.
>
> I'd rather re-use as much code as possible.
> Which methods do not have the same semantics?  The "getResult" method, for
> example, can return a root location as well as a minimum location.

Yes, but a root is not a minimum location. For me, this is a sufficient
reason to separate the interfaces. From a user's point of view, knowing
that SolverXyz implements UnivariateRealSolver is sufficient to
understand what "solve" means and what the properties of the returned
value are. They would not be the same (even if we reused the name
"solve") if the class implemented UnivariateRealMinimizer.

This is especially true for the Brent solver. Two algorithms exist with
the same name, one for roots and one for minima. I guess some users may
get confused if we link the two interfaces too tightly. They could mix
things up and compute a root when they really want to compute a minimum,
or the other way round.

I agree with code reuse, though. Perhaps we could have a very generic
superinterface for the shared parts like convergence settings or max
iterations and things like that. But the "solve" methods should be
separate, and even probably have different names to stay on the safe side.

>
> What about creating new interfaces and classes in the future "minimization"
> (or "minimum") package that would extend those in "solving" (or "root")?
>
> public interface UnivariateRealMinimizer extends UnivariateRealSolver {
>   public double getValueAtMinimum();
> }
>
> public class UnivariateRealMinimizerImpl extends UnivariateRealSolverImpl
>   implements UnivariateRealMinimizer {
>   private double valueAtMinimum;
>  
>   protected void setResult(double x,
>                            double fx,
>                            int iterationCount) {
>     setResult(x, iterationCount);
>     valueAtMinimum = fx;
>   }
>
>   public double getValueAtMinimum() {
>     return valueAtMinimum;
>   }
> }

I would prefer something like:

 public interface ConvergenceAlgorithm {
    void setMaximalIterationCount(int count);
    ...
 }

 public interface UnivariateRealSolver extends ConvergenceAlgorithm {
    double solve(UnivariateRealFunction f, ...);
 }

 public interface UnivariateRealMinimizer extends ConvergenceAlgorithm {
    double findMinimum(UnivariateRealFunction f, ...);
 }
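
To show how the shared settings can be reused while the two operations
stay separate, here is a self-contained sketch of that layout. The
interface names come from the proposal above; the (min, max) bracket
arguments, the abstract base class, the bisection solver and the ternary
search minimizer are assumptions added only to make the example run, and
UnivariateRealFunction is a simplified stand-in declared locally.

```java
/** Hypothetical, self-contained sketch of the split; not the Commons Math API. */
final class SeparateInterfacesSketch {

    /** Simplified stand-in for the real UnivariateRealFunction interface. */
    interface UnivariateRealFunction {
        double value(double x);
    }

    /** Settings shared by every iterative algorithm. */
    interface ConvergenceAlgorithm {
        void setMaximalIterationCount(int count);
    }

    interface UnivariateRealSolver extends ConvergenceAlgorithm {
        /** Returns a point of [min, max] where f crosses zero (assumed signature). */
        double solve(UnivariateRealFunction f, double min, double max);
    }

    interface UnivariateRealMinimizer extends ConvergenceAlgorithm {
        /** Returns a point of [min, max] where f is smallest (assumed signature). */
        double findMinimum(UnivariateRealFunction f, double min, double max);
    }

    /** Code reuse happens here: the shared settings are implemented once. */
    abstract static class AbstractConvergenceAlgorithm implements ConvergenceAlgorithm {
        protected int maximalIterationCount = 100;

        public void setMaximalIterationCount(int count) {
            maximalIterationCount = count;
        }
    }

    /** Plain bisection; requires f(min) and f(max) to have opposite signs. */
    static final class BisectionSolver extends AbstractConvergenceAlgorithm
            implements UnivariateRealSolver {
        public double solve(UnivariateRealFunction f, double min, double max) {
            double lo = min;
            double hi = max;
            double fLo = f.value(lo);
            if (fLo * f.value(hi) > 0.0) {
                throw new IllegalArgumentException("no sign change on the interval");
            }
            for (int i = 0; i < maximalIterationCount; i++) {
                double mid = 0.5 * (lo + hi);
                double fMid = f.value(mid);
                if (fLo * fMid <= 0.0) {
                    hi = mid;          // the sign change is in [lo, mid]
                } else {
                    lo = mid;          // the sign change is in [mid, hi]
                    fLo = fMid;
                }
            }
            return 0.5 * (lo + hi);
        }
    }

    /** Ternary search; assumes f has a single minimum on [min, max]. */
    static final class TernarySearchMinimizer extends AbstractConvergenceAlgorithm
            implements UnivariateRealMinimizer {
        public double findMinimum(UnivariateRealFunction f, double min, double max) {
            double lo = min;
            double hi = max;
            for (int i = 0; i < maximalIterationCount; i++) {
                double third = (hi - lo) / 3.0;
                double x1 = lo + third;
                double x2 = hi - third;
                if (f.value(x1) < f.value(x2)) {
                    hi = x2;           // the minimum cannot lie in (x2, hi]
                } else {
                    lo = x1;           // the minimum cannot lie in [lo, x1)
                }
            }
            return 0.5 * (lo + hi);
        }
    }

    public static void main(String[] args) {
        // f(x) = (x - 1)^2 - 4 has a root at x = 3 and its minimum at x = 1.
        UnivariateRealFunction f = x -> (x - 1.0) * (x - 1.0) - 4.0;
        System.out.println("root    = " + new BisectionSolver().solve(f, 0.0, 5.0));
        System.out.println("minimum = " + new TernarySearchMinimizer().findMinimum(f, 0.0, 5.0));
    }
}
```

On the same function and the same interval the two calls return
different points, which is why distinct method names seem safer to me.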

>
>> The third step would be to handle multivariate or multivalued functions.
>> Should separate parallel packages be created or not ?
>
> From a practical point of view, I'd think so: I'd guess that it is more
> common to use the univariate algorithms, and so, it would be convenient not
> to clutter those packages with classes that are useful only in the
> multi-dimensional case.
> Also, from what you say below, it seems that it will be much more work to
> refactor the framework for the multi-dimensional case, whereas the packages
> in the one-dimensional case could be "stabilized" sooner.

You are right, but we cannot postpone major changes until after 2.0.
Once 2.0 is out, we will try as much as possible to preserve
compatibility for users, so later major changes would have to wait for
3.0. Now is the proper time to revamp everything we need.

I expect to have almost one month off work next month due to forced rest
after minor surgery. I will have plenty of time to kill then.

>
>> The minimization
>> part for these kinds of functions would come from the low-level classes
>> of the estimation package and from the optimization package. The former
>> ones correspond to algorithms using derivatives and the latter
>> correspond to algorithms using only function values. The interfaces
>> should be simplified to remain low-level and deal directly with function
>> values, target, residuals.
>>
>> A fourth step would be to build higher level abstractions on top of the
>> previous ones, providing the concepts of measurements, parameters  and
>> linear/non-linear model ... These are mainly the existing
>> WeightedMeasurement, EstimatedParameter and EstimationProblem classes
>> and interfaces.
>
> We already use several libraries: "jMinuit" in Java and "Opt++" in C++
> (using JNI). Surely I'd prefer everything to be in Commons-Math...

Our current optimization part is still fairly basic, so for now it would
be safe to stay with those libraries. What is implemented does work and
works well, but there are not many algorithms and the API organization
is really awkward for now. Improving the organization to have a stable
framework is a goal for 2.0. Adding new algorithms and features to
become a decent alternative to these libraries is a goal for 2.x, unless
new motivated contributors step in ...

Luc


Re: [math] redesigning the optimization/estimation packages

Ted Dunning
The standard term in mathematical software is solver for the object that
does it, solve for the method it does and root or result for the final
output.

(and I am a native English speaker)

On Wed, Jan 14, 2009 at 12:08 PM, Luc Maisonobe <[hidden email]> wrote:

> > I think that "root" is clearer (e.g. for a new user browsing the package
> > list javadoc).
> > [And both are a little bit shorter than the alternatives. ;-)]
>
> I am OK with any suggestion like this. Since I'm not a native English
> speaker, I will leave some time for other people to express their views
> on this.




--
Ted Dunning, CTO
DeepDyve
4600 Bohannon Drive, Suite 220
Menlo Park, CA 94025
www.deepdyve.com
650-324-0110, ext. 738
858-414-0013 (m)

Re: [math] redesigning the optimization/estimation packages

Luc Maisonobe
Ted Dunning wrote:
> The standard term in mathematical software is solver for the object that
> does it, solve for the method it does and root or result for the final
> output.
Thanks for the hint, Ted.
Hence I guess for the package name, "solvers", "solving", "root"/"roots"
are all acceptable.
For the other package "minimizers" is perhaps awkward but "minimization"
or "minimum"/"minima" are also acceptable.

Let's go for "root" and "minimum" as suggested by Gilles.

Luc


Re: [math] redesigning the optimization/estimation packages

Ted Dunning
Actually, not quite.  I think minimizers or minimization are good for that
package, minimum is not.

Likewise, solvers for the package, but not root.

A minimum or a root is something you find, not software.

On Wed, Jan 14, 2009 at 2:01 PM, Luc Maisonobe <[hidden email]> wrote:

> Ted Dunning wrote:
> > The standard term in mathematical software is solver for the object that
> > does it, solve for the method it does and root or result for the final
> > output.
> Thanks for the hint, Ted.
> Hence I guess for the package name, "solvers", "solving", "root"/"roots"
> are all acceptable.
> For the other package "minimizers" is perhaps awkward but "minimization"
> or "minimum"/"minima" are also acceptable.
>
> Let's go for "root" and "minimum" as suggested by Gilles.
>
> Luc


Re: [math] redesigning the optimization/estimation packages

Gilles Sadowski
Hello.

> > > The standard term in mathematical software is solver for the object that
> > > does it, solve for the method it does and root or result for the final
> > > output.

In my opinion, "rootfinders" and "minimumfinders" are clearer; hence the
suggestion "root" and "minimum" for a short but unambiguous Java package
name.

Also, I would think that, in mathematics (perhaps even more in software),
there are many things called "solver". ;-)
The package naming is supposed to help users find their way to the
classes they are looking for.

From another point of view, I always wonder how helpful it is to have a
package named "solver" (for example) and find that it contains classes
like "FooSolver" and "BarSolver". Why not just have "Foo" and "Bar" then?
I know that this could lead to name clashes; hence I don't actually
suggest removing the "Solver" suffix from the classes in that package.
But since the classes and interfaces already carry that (mathematically
correct) suffix, the developer can provide an *additional* hint to the
user through the package name.


Best,
Gilles

P.S. I wouldn't fight over these name changes but since the refactoring was
     already suggested, I thought that this argument might have some value.


Re: [math] redesigning the optimization/estimation packages

Gilles Sadowski
Hi.

> In my opinion, "rootfinders" and "minimumfinders" are clearer; hence the
> suggestion "root" and [...]
              ^^^^
An alternative name could also be "zero".

Best,
Gilles


Re: [math] redesigning the optimization/estimation packages

Gilles Sadowski
In reply to this post by Luc Maisonobe
Hello.

> I agree with code reuse, though. Perhaps we could have a very generic
> superinterface for the shared parts like convergence settings or max
> iterations and things like that. But the "solve" methods should be
> separate, and even probably have different names to stay on the safe side.
>
> > [...]
>
> I would prefer something like:
>
>  public interface ConvergenceAlgorithm {
>     void setMaximalIterationCount(int count);
>     ...
>  }
>
>  public interface UnivariateRealSolver extends ConvergenceAlgorithm {
>     double solve(UnivariateRealFunction f, ...);
>  }
>
>  public interface UnivariateRealMinimizer extends ConvergenceAlgorithm {
>     double findMinimum(UnivariateRealFunction f, ...);
>  }

Agreed.

> > [...] it will be much more work to
> > refactor the framework for the multi-dimensional case, whereas the packages
> > in the one-dimensional case could be "stabilized" sooner.
>
> You are right, but we cannot postpone major changes until after 2.0.
> Once 2.0 is out, we will try as much as possible to preserve
> compatibility for users, so later major changes would have to wait for
> 3.0. Now is the proper time to revamp everything we need.

That's certainly fine with me.
[I didn't mean to delay the refactoring, just that the packages for the one-
and for the multi-dimensional cases should preferably be separate.]

> I expect to have almost one month off work next month due to forced rest
> after minor surgery. I will have plenty of time to kill then.

Good luck, and best wishes for the recovery.

> > We already use several libraries: "jMinuit" in Java and "Opt++" in C++
> > (using JNI). Surely I'd prefer everything to be in Commons-Math...
>
> Our current optimization part is still fairly basic, so for now it would
> be safe to stay with those libraries. What is implemented does work and
> works well, but there are not many algorithms and the API organization
> is really awkward for now. Improving the organization to have a stable
> framework is a goal for 2.0. Adding new algorithms and features to
> become a decent alternative to these libraries is a goal for 2.x, except
> if new motivated contributors step in ...

Our project needs an efficient (most importantly with a minimal number of
calls to the evaluation function) multi-dimensional optimizer, and several
alternatives must be tested, so you can count me in for helping with the
Java implementation.
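
To compare candidates on that criterion, one option is to wrap the
objective function so that every call is counted. The sketch below is
only an illustration: CountingObjective is a hypothetical helper, it uses
java.util.function.ToDoubleFunction rather than any Commons-Math
interface, and the grid scan merely stands in for a real optimizer.

```java
import java.util.function.ToDoubleFunction;

/** Hypothetical helper (not Commons Math API): counts calls to an objective function. */
final class CountingObjective implements ToDoubleFunction<double[]> {

    private final ToDoubleFunction<double[]> delegate;
    private int evaluations;

    CountingObjective(ToDoubleFunction<double[]> delegate) {
        this.delegate = delegate;
    }

    @Override
    public double applyAsDouble(double[] point) {
        evaluations++;                       // one more call to the costly function
        return delegate.applyAsDouble(point);
    }

    int getEvaluations() {
        return evaluations;
    }

    public static void main(String[] args) {
        CountingObjective f = new CountingObjective(
            p -> (p[0] - 1.0) * (p[0] - 1.0) + (p[1] + 2.0) * (p[1] + 2.0));

        // Stand-in for a real optimizer: a crude scan over a 7 x 7 integer grid.
        double[] best = null;
        double bestValue = Double.POSITIVE_INFINITY;
        for (int i = -3; i <= 3; i++) {
            for (int j = -3; j <= 3; j++) {
                double[] p = { i, j };
                double v = f.applyAsDouble(p);
                if (v < bestValue) {
                    bestValue = v;
                    best = p;
                }
            }
        }
        System.out.println("best point = (" + best[0] + ", " + best[1] + ") after "
            + f.getEvaluations() + " evaluations");
    }
}
```

Running each optimizer against the same wrapped function would give
directly comparable evaluation counts.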

Best,
Gilles


Re: [math] redesigning the optimization/estimation packages

Phil Steitz
In reply to this post by Ted Dunning
Ted Dunning wrote:
> Actually, not quite.  I think minimizers or minimization are good for that
> package, minimum is not.
>
> Likewise, solvers for the package, but not root.
>
> A minimum or a root is something you find, not software.
>  
+1 for solvers.FooSolver.solve -> root,
minimization.FooMinimizer.minimize -> minimum
by Ted's reasoning, with "Foo" as descriptive as possible and
class/interface hierarchy as simple as possible.

Phil




Re: [math] redesigning the optimization/estimation packages

Ted Dunning
It is probably implied and probably bad form to say explicitly, but +1 for
my own suggestion.

On Sat, Jan 17, 2009 at 5:56 AM, Phil Steitz <[hidden email]> wrote:

> Ted Dunning wrote:
>
>> Actually, not quite.  I think minimizers or minimization are good for that
>> package, minimum is not.
>>
>> Likewise, solvers for the package, but not root.
>>
>> A minimum or a root is something you find, not software.
>>
>>
> +1 for solvers.FooSolver.solve -> root,
> minimization.FooMinimizer.minimize -> minimum
> by Ted's reasoning, with "Foo" as descriptive as possible and
> class/interface hierarchy as simple as possible.
>
> Phil
