[Math] LeastSquaresOptimizer Design

[Math] LeastSquaresOptimizer Design

ole ersoy
I wanted to float some ideas for the LeastSquaresOptimizer (possibly general optimizer) design.  For example, with the LevenbergMarquardtOptimizer we would do:
`LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`

Rough optimize() outline:
public static void optimize(OptimizationContext c) {
    // Perform the optimization.
    // If successful:
    c.notify(LevenbergMarquardtResultsEnum.SUCCESS, solution);
    // If not successful:
    c.notify(LevenbergMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE, diagnostic);
    // or
    c.notify(LevenbergMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE, diagnostic);
    // etc.
}

The diagnostic, when turned on, will contain a trace of the last N iterations leading up to the failure.  When turned off, the Diagnostic instance only contains the parameters used to detect failure.  The diagnostic could be viewed as an indirect way to log optimizer iterations.
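
A minimal sketch of how a "last N iterations" trace could be held is given here. The class name IterationTrace and its methods are hypothetical, not part of Commons Math; a capacity of 0 stands in for "diagnostic turned off".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical diagnostic that remembers only the most recent iterates. */
final class IterationTrace {
    private final int capacity;                       // N; 0 means tracing is off
    private final Deque<double[]> last = new ArrayDeque<>();

    IterationTrace(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("capacity must be >= 0");
        }
        this.capacity = capacity;
    }

    /** Records one iterate, evicting the oldest once N entries are held. */
    void record(double[] point) {
        if (capacity == 0) {
            return;                                   // tracing turned off
        }
        if (last.size() == capacity) {
            last.removeFirst();
        }
        last.addLast(point.clone());                  // defensive copy
    }

    /** Returns copies of the retained iterates, oldest first. */
    List<double[]> snapshot() {
        List<double[]> copy = new ArrayList<>();
        for (double[] p : last) {
            copy.add(p.clone());
        }
        return copy;
    }
}
```

An optimizer would call record() once per iteration and hand the snapshot to the observer only on failure, so the cost when tracing is off is a single comparison.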

WDYT?

Cheers,
- Ole


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [Math] LeastSquaresOptimizer Design

Gilles Sadowski
On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:

> Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
> General Optimizer) design.  For example with the
> LevenbergMarquardtOptimizer we would do:
> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>
> Rough optimize() outline:
> public static void optimise() {
> //perform the optimization
> //If successful
>     c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
> //If not successful
>
> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
> diagnostic);
> //or
>
> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
> diagnostic)
> //etc
> }
>
> The diagnostic, when turned on, will contain a trace of the last N
> iterations leading up to the failure.  When turned off, the
> Diagnostic
> instance only contains the parameters used to detect failure.  The
> diagnostic could be viewed as an indirect way to log optimizer
> iterations.
>
> WDYT?

I'm wary of having several different ways to convey information to the
caller. It seems that the reporting interfaces could quickly overwhelm
the "actual" code (one type of context per algorithm).

The current reporting is based on exceptions, and assumes that if no
exception was thrown, then the user's request completed successfully.
I totally agree that in some circumstances, more information on the
inner working of an algorithm would be quite useful.

But I don't see the point in devoting resources to reinvent the wheel:
I longed several times for the use of a logging library.
The only show-stopper has been the informal "no-dependency" policy...

Best regards,
Gilles



Re: [Math] LeastSquaresOptimizer Design

ole ersoy


On 09/20/2015 05:51 AM, Gilles wrote:

> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>> Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
>> General Optimizer) design.  For example with the
>> LevenbergMarquardtOptimizer we would do:
>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>
>> Rough optimize() outline:
>> public static void optimise() {
>> //perform the optimization
>> //If successful
>>     c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
>> //If not successful
>>
>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
>> diagnostic);
>> //or
>>
>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
>> diagnostic)
>> //etc
>> }
>>
>> The diagnostic, when turned on, will contain a trace of the last N
>> iterations leading up to the failure.  When turned off, the Diagnostic
>> instance only contains the parameters used to detect failure. The
>> diagnostic could be viewed as an indirect way to log optimizer
>> iterations.
>>
>> WDYT?
>
> I'm wary of having several different ways to convey information to the
> caller.
It would be just one way.  The caller need not be the receiver (though it could be).  The receiver would be an observer attached to the OptimizationContext, implementing an interface that lets it observe the optimization.
> It seems that the reporting interfaces could quickly overwhelm
> the "actual" code (one type of context per algorithm).
There would be one type of Observer interface per algorithm.  It would act on the solution and on what are currently exceptions, although these would be translated into enums.
> The current reporting is based on exceptions, and assumes that if no
> exception was thrown, then the user's request completed successfully.
Sure.  Personally I'd much rather deal with something similar to an HTTP status code in a callback than an exception.  I think the code is cleaner, and the callback makes it more elegant to apply an adaptive approach to handling the response, like slightly relaxing constraints, convergence parameters, etc.  Also, by getting rid of the exceptions, we no longer depend on the I18N layer that they are tied to, and the messages can be more informative, since they target the root cause.  The observer can also run in the 'main' thread while the optimization runs asynchronously.  Also, WRT JDK9 and modules, losing the exceptions would mean one less dependency when the library is broken up into JDK9 modules, which would be more in line with this philosophy:
https://github.com/substack/browserify-handbook#module-philosophy
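
The adaptive handling described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the Status enum, the CallbackSolver interface and the tenfold relaxation factor are hypothetical and do not exist in Commons Math.

```java
import java.util.function.BiConsumer;

/** Hypothetical status codes reported instead of exceptions. */
enum Status { SUCCESS, TOO_SMALL_COST_RELATIVE_TOLERANCE }

/** Stand-in solver: reports its outcome through a callback, never throws. */
interface CallbackSolver {
    void solve(double costTolerance, BiConsumer<Status, Double> callback);
}

/** Retries with a relaxed tolerance whenever the solver reports a failure. */
final class AdaptiveCaller {
    private final CallbackSolver solver;
    private final int maxAttempts;
    private int attempts;
    private double acceptedTolerance = Double.NaN;

    AdaptiveCaller(CallbackSolver solver, int maxAttempts) {
        this.solver = solver;
        this.maxAttempts = maxAttempts;
    }

    void start(double tolerance) {
        attempts++;
        solver.solve(tolerance, (status, tol) -> {
            if (status == Status.SUCCESS) {
                acceptedTolerance = tol;
            } else if (attempts < maxAttempts) {
                start(tol * 10.0);        // relax the tolerance and try again
            }
        });
    }

    /** Tolerance that finally succeeded, or NaN if every attempt failed. */
    double acceptedTolerance() { return acceptedTolerance; }

    int attempts() { return attempts; }
}
```

The retry policy lives entirely in the callback, so the solver itself stays unaware of how its caller reacts to each status.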

> I totally agree that in some circumstances, more information on the
> inner working of an algorithm would be quite useful.
... Algorithm iterations become unit testable.
>
> But I don't see the point in devoting resources to reinvent the wheel:
You mean pimping the wheel?  Big pimpin.
>
> I longed several times for the use of a logging library.
> The only show-stopper has been the informal "no-dependency" policy...
JDK9 Jigsaw should solve dependency hell, so the less coupling between commons math classes the better.  Anyway, I'm obviously interested in playing with this stuff, so when I get something up into a repository I'll do a callback :).

Cheers,
Ole



Re: [Math] LeastSquaresOptimizer Design

Gilles Sadowski
Hi.

On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:

> On 09/20/2015 05:51 AM, Gilles wrote:
>> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>>> Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
>>> General Optimizer) design.  For example with the
>>> LevenbergMarquardtOptimizer we would do:
>>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>>
>>> Rough optimize() outline:
>>> public static void optimise() {
>>> //perform the optimization
>>> //If successful
>>>     c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
>>> //If not successful
>>>
>>>
>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
>>> diagnostic);
>>> //or
>>>
>>>
>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
>>> diagnostic)
>>> //etc
>>> }
>>>
>>> The diagnostic, when turned on, will contain a trace of the last N
>>> iterations leading up to the failure.  When turned off, the
>>> Diagnostic
>>> instance only contains the parameters used to detect failure. The
>>> diagnostic could be viewed as an indirect way to log optimizer
>>> iterations.
>>>
>>> WDYT?
>>
>> I'm wary of having several different ways to convey information to
>> the
>> caller.
> It would just be one way.

One way for optimizer, one way for solvers, one way for ...

> But the caller may not be the receiver
> (It could be).  The receiver would be an observer attached to the
> OptimizationContext that implements an interface allowing it to
> observe
> the optimization.

I'm afraid that it will add to the questions of what to put in the
code and how.  [We already had sometimes heated discussions just for
the IMHO obvious (e.g. code formatting, documentation, exception...).]

>> It seems that the reporting interfaces could quickly overwhelm
>> the "actual" code (one type of context per algorithm).
> There would one type of Observer interface per algorithm.  It would
> act on the solution and what are currently exceptions, although these
> would be translated into enums.

Unless I'm mistaken, the most common use-case for codes implemented
in a library such as CM is to provide a correct answer or bail out
in a non-equivocal way.

It would make the code more involved to handle a minority of
(undefined) cases. [Actual examples would be welcome in order to
focus the discussion.]

>> The current reporting is based on exceptions, and assumes that if no
>> exception was thrown, then the user's request completed
>> successfully.
> Sure - personally I'd much rather deal with something similar to an
> HTTP status code in a callback, than an exception .  I think the code
> is cleaner and the calback makes it more elegant to apply an adaptive
> approach to handling the response, like slightly relaxing
> constraints,
> convergence parameters, etc.  Also by getting rid of the exceptions,
> we no longer depend on the I18N layer that they are tied to and now
> the messages can be more informative, since they target the root
> cause.  The observer can also run in the 'main' thread' while the
> optimization can run asynchronously.  Also WRT JDK9 and modules,
> loosing the exceptions would mean one less dependency when the
> library
> is up into JDK9 modules...which would be more in line with this
> philosophy:
> https://github.com/substack/browserify-handbook#module-philosophy

I'm not sure I fully understood the philosophy from the text in this
short paragraph.
But I do not agree with the idea that the possibility to quickly find
some code is more important than standards and best practices.

>> I totally agree that in some circumstances, more information on the
>> inner working of an algorithm would be quite useful.
> ... Algorithm iterations become unit testable.
>>
>> But I don't see the point in devoting resources to reinvent the
>> wheel:
> You mean pimping the wheel?  Big pimpin.

I think that logging statements are easy to add, not disruptive at all,
and come in handy to understand a code's unexpected behaviour.
Assuming that a "logging" feature is useful, it can be added *now* using
a dependency towards a weight-less (!) framework such as "slf4j".
IMO, it would be a waste of time to implement a new communication layer
that can do that, and more, if it would be used for logging only in 99%
of the cases.
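
To illustrate how small such logging statements are, here is a sketch on a toy fixed-point iteration. Note the substitution: it uses java.util.logging from the JDK rather than slf4j, purely so that the example runs without any external dependency. The class LoggedIteration and the iteration itself are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class LoggedIteration {
    private static final Logger LOG =
        Logger.getLogger(LoggedIteration.class.getName());

    /** Toy iteration x <- cos(x); each step is logged at FINE level. */
    static double iterate(double start, int maxIterations, double tolerance) {
        double x = start;
        for (int i = 0; i < maxIterations; i++) {
            double next = Math.cos(x);
            // Guard so that no message is built when FINE is disabled.
            if (LOG.isLoggable(Level.FINE)) {
                LOG.log(Level.FINE, "iteration {0}: x = {1}", new Object[] {i, next});
            }
            if (Math.abs(next - x) < tolerance) {
                return next;              // converged
            }
            x = next;
        }
        return x;                         // iteration budget exhausted
    }
}
```

With the default logging configuration nothing is printed; raising the logger level to FINE makes every iteration visible without touching the algorithm.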

>>
>> I longed several times for the use of a logging library.
>> The only show-stopper has been the informal "no-dependency"
>> policy...
> JDK9 Jigsaw should solve dependency hell, so the less coupling
> between commons math classes the better.

I wouldn't call "coupling" the dependency towards exception classes:
they are little utilities that can make sense in various parts of the
library.

[Unless one wants to embark on yet another discussion about exceptions;
whether there should be one class for each of the "messages" that exist
in "LocalizedFormats"; whether localization should be done in CM;
etc.]

> Anyways I'm obviously
> interested in playing with this stuff, so when I get something up
> into
> a repository I'll to do a callback :).

If you are interested in big overhauls, there is one that gathered
relative consensus: rewrite the algorithms in a "multithread-friendly"
way.
Some ideas were floated (cf. ML archive) but no implementation or
experiment...  Perhaps with a well-defined goal such as performance
improvement, your design suggestions will become clearer to more
people.

AFAIK, only the classes in the "o.a.c.m.neuralnet" package are
currently ready to be used with the "java.util.concurrent" framework.


Best regards,
Gilles

>
> Cheers,
> Ole
>



Re: [Math] LeastSquaresOptimizer Design

ole ersoy
Hola,

On 09/21/2015 04:15 PM, Gilles wrote:

> Hi.
>
> On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:
>> On 09/20/2015 05:51 AM, Gilles wrote:
>>> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>>>> Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
>>>> General Optimizer) design.  For example with the
>>>> LevenbergMarquardtOptimizer we would do:
>>>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>>>
>>>> Rough optimize() outline:
>>>> public static void optimise() {
>>>> //perform the optimization
>>>> //If successful
>>>>     c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
>>>> //If not successful
>>>>
>>>>
>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
>>>> diagnostic);
>>>> //or
>>>>
>>>>
>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
>>>> diagnostic)
>>>> //etc
>>>> }
>>>>
>>>> The diagnostic, when turned on, will contain a trace of the last N
>>>> iterations leading up to the failure.  When turned off, the Diagnostic
>>>> instance only contains the parameters used to detect failure. The
>>>> diagnostic could be viewed as an indirect way to log optimizer
>>>> iterations.
>>>>
>>>> WDYT?
>>>
>>> I'm wary of having several different ways to convey information to the
>>> caller.
>> It would just be one way.
>
> One way for optimizer, one way for solvers, one way for ...

Yes, I see what you mean, but I think on the whole it will be worth it to add additional sugar code that removes the need for exceptions.

>
>> But the caller may not be the receiver
>> (It could be).  The receiver would be an observer attached to the
>> OptimizationContext that implements an interface allowing it to observe
>> the optimization.
>
> I'm afraid that it will add to the questions of what to put in the
> code and how.  [We already had sometimes heated discussions just for
> the IMHO obvious (e.g. code formatting, documentation, exception...).]

Hehe.  Yes, I remember some of these discussions.  I wonder how much time was spent debating the exceptions alone?  Surely everyone must have had this feeling in the pit of their stomach that there's got to be a better way.  On the exception topic, these are some of the issues:

I18N
===================
If you are new to commons math and thinking about designing a commons math compatible exception, you should probably understand the I18N stuff that's bound to the exceptions (and wonder why it's bound to them).  Grab a coffee and spend a few hours, unless you are fairly new to Java, like some of the people posting for help.  In that case, when the exception occurs, there is going to be a lot of tutoring going on on the users list.

Number of Exceptions
===================
Before you actually design a new exception, you should probably see if there is an exception that already fits the category of what you are doing.  So you start reading.  Exception1... nope.  Exception2... nope.  Exception3... Exception999... But I think I'm getting warmer.  OK, did not find it, but I'm fairly certain that there is an elegant place for it somewhere in the exception hierarchy...


Handling of Exceptions
===================
If our app uses several of the commons math classes (that throw exceptions of the same type), and one of those classes throws an exception, what is the app supposed to do?

I think most developers would find that question somewhat challenging.  There are numerous strategies.  Catch all exceptions and log what happened, etc.  But what if the requirement is that if an exception is thrown, the organization that receives it has 0 seconds to get to the root cause of it and understand the dynamics. Is this doable?  (Yes obviously, but how hard is it...?).


>>> It seems that the reporting interfaces could quickly overwhelm
>>> the "actual" code (one type of context per algorithm).
>> There would one type of Observer interface per algorithm.  It would
>> act on the solution and what are currently exceptions, although these
>> would be translated into enums.
>
> Unless I'm mistaken, the most common use-case for codes implemented
> in a library such as CM is to provide a correct answer or bail out
> in a non-equivocal way.
Most Java developers are used to synchronous coding: call the method, get the response, catch the exception if needed.  This is changing with JDK8, and as we evolve and start using lambdas, we become more accustomed to the functional callback style of programming.  Personally I want to be able to use an API that gives me what I need when everything works as expected, allows me to resolve unexpected issues with minimal effort, and is as simple, fluid, and lightweight as possible.

>
> It would make the code more involved to handle a minority of
> (undefined) cases. [Actual examples would be welcome in order to
> focus the discussion.]

Rough Outline (I've evolved the concept and moved away from the OptimizationContext in the process of writing):

interface LevenbergMarquardtObserver {

    void hola(Solution s);
    void sugarHoneyIceTea(ResultType rt, Diagnostic d);
}

public class LMObserver implements LevenbergMarquardtObserver {

    private final Application application;

    public LMObserver(Application application) {
        this.application = application;
    }

    @Override
    public void hola(Solution s) {
        application.next(s);
    }

    @Override
    public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
        if (rt == ResultType.I_GOT_THIS_ONE) {
            // I looked at the commons unit tests for this algorithm, evaluating
            // the diagnostics that show how this failure can occur.
            // I'm totally fixing this!  Steps aside!
        } else if (rt == ResultType.REALLY_COMPLICATED_STUFF) {
            // We need our best engineers...call India.
        }
    }
}

public class Application {

    public void run() {
        // Note nothing is returned.
        LevenbergMarquardtOptimizer
            .setObserver(new LMObserver(this))
            .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
            .start();
    }

    public void next(Solution solution) {
        // Do cool stuff.
    }
}

Or an asynchronous variation:

public class Application {

    public void run() {
        // This call will not block because async is true.
        LevenbergMarquardtOptimizer
            .setAsync(true)
            .setObserver(new LMObserver(this))
            .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
            .start();

        // Do more stuff right away.
    }

    public void next(Solution solution) {
        // When the thread running the optimization is done, this method is called back.
        // Do whatever comes next.
    }
}

The above would start the optimization in a separate thread that does not / SHOULD NOT share data with the main thread.
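
For readers who want something that compiles today, the asynchronous variation can be approximated with the JDK's CompletableFuture. This is only a sketch under assumptions: AsyncRunner is a hypothetical helper, and the Supplier stands in for the optimization.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.Supplier;

final class AsyncRunner {
    /**
     * Starts the computation on a thread of the common pool and passes its
     * result to the observer once it completes. The call returns at once;
     * the future is returned so that callers can still wait if they need to.
     */
    static <T> CompletableFuture<Void> start(Supplier<T> computation,
                                             Consumer<T> observer) {
        return CompletableFuture.supplyAsync(computation).thenAccept(observer);
    }
}
```

Usage would look like `AsyncRunner.<Double>start(() -> expensiveFit(), app::next)`, where expensiveFit and app are placeholders. The observer is invoked on whichever thread completes the stage, so it must not assume it runs on the main thread.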

>
>>> The current reporting is based on exceptions, and assumes that if no
>>> exception was thrown, then the user's request completed successfully.
>> Sure - personally I'd much rather deal with something similar to an
>> HTTP status code in a callback, than an exception .  I think the code
>> is cleaner and the calback makes it more elegant to apply an adaptive
>> approach to handling the response, like slightly relaxing constraints,
>> convergence parameters, etc.  Also by getting rid of the exceptions,
>> we no longer depend on the I18N layer that they are tied to and now
>> the messages can be more informative, since they target the root
>> cause.  The observer can also run in the 'main' thread' while the
>> optimization can run asynchronously.  Also WRT JDK9 and modules,
>> loosing the exceptions would mean one less dependency when the library
>> is up into JDK9 modules...which would be more in line with this
>> philosophy:
>> https://github.com/substack/browserify-handbook#module-philosophy
>
> I'm not sure I fully understood the philosophy from the text in this
> short paragraph.
> But I do not agree with the idea that the possibility to quickly find
> some code is more important than standards and best practices.

If you go to npmjs.org and type in Neural Network you will get 56 results all linked to github repositories.

In addition there's metadata indicating the number of downloads in the last day, last month, etc.  Try typing in cosine.  Odds are you will find a package that does just what you want and nothing else.  This is the opposite of overwhelming, and refreshing in terms of cloning off of github and getting familiar with the tests, etc.  Also eye opening.  How many of us knew that we could do that much stuff with cosine! :).

>
>>> I totally agree that in some circumstances, more information on the
>>> inner working of an algorithm would be quite useful.
>> ... Algorithm iterations become unit testable.
>>>
>>> But I don't see the point in devoting resources to reinvent the wheel:
>> You mean pimping the wheel?  Big pimpin.
>
> I think that logging statements are easy to add, not disruptive at all,
> and come in handy to understand a code's unexpected behaviour.
> Assuming that a "logging" feature is useful, it can be added *now* using
> a dependency towards a weight-less (!) framework such as "slf4j".
> IMO, it would be a waste of time to implement a new communication layer
> that can do that, and more, if it would be used for logging only in 99%
> of the cases.
SLF4J is used by almost every other framework, so why not use it? Logging and the diagnostic could be used together.  The primary purpose of the diagnostic though is to collect data that will be useful in `sugarHoneyIceTea`.

>
>>>
>>> I longed several times for the use of a logging library.
>>> The only show-stopper has been the informal "no-dependency" policy...
>> JDK9 Jigsaw should solve dependency hell, so the less coupling
>> between commons math classes the better.
>
> I wouldn't call "coupling" the dependency towards exception classes:
> they are little utilities that can make sense in various parts of the
> library.

If, for example, the Simplex solver is broken off into its own module, then it has to be coupled to the exceptions, unless it is exception free.

>
> [Unless one wants to embark on yet another discussion about exceptions;
> whether there should be one class for each of the "messages" that exist
> in "LocalizedFormats"; whether localization should be done in CM;
> etc.]

I think it would be best to just eliminate the exceptions.

>
>> Anyways I'm obviously
>> interested in playing with this stuff, so when I get something up into
>> a repository I'll to do a callback :).
>
> If you are interested in big overhauls, there is one that gathered
> relative consensus: rewrite the algorithms in a "multithread-friendly"
> way.
I think that's a tall order that will take us into JDK88 :).  But using callbacks and making potentially long-running computations asynchronous could be a middle ground that would allow simple multithreaded use without fiddling around under the hood...

>
> Some ideas were floated (cf. ML archive) but no implementation or
> experiment...  Perhaps with a well-defined goal such as performance
> improvement, your design suggestions will become clearer to more people.
>
> AFAIK, only the classes in the "o.a.c.m.neuralnet" package are currently
> ready to be used with the "java.util.concurrent" framework.
FWIU, neural nets are a great fit for concurrency.  I think for the others we will end up having discussions around how users would control the number of threads, etc., which again makes some of us nervous.  An asynchronous operation that runs in one separate thread is easier to reason about.  If we want to test 10 neural net configurations, and we have 10 cores, then we can start each one by itself by doing something like:

Network.setAsync(true).addNeurons().connectNeurons().addObserver(observer).start();
//Now do 10 more
//If the observer is shared then notifications should be thread safe.
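
A rough sketch of the "one thread per configuration, one shared observer" idea with plain java.util.concurrent follows. ParallelConfigurations and the train function are hypothetical stand-ins for the network API outlined above; the shared observer is reduced to a thread-safe queue.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.IntToDoubleFunction;

final class ParallelConfigurations {
    /**
     * Runs one task per configuration (configurations must be > 0), each on
     * its own thread, and collects the scores in a queue that is safe to
     * share between the worker threads.
     */
    static Queue<Double> runAll(int configurations, IntToDoubleFunction train) {
        Queue<Double> observed = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(configurations);
        try {
            for (int i = 0; i < configurations; i++) {
                final int id = i;
                pool.execute(() -> observed.add(train.applyAsDouble(id)));
            }
            pool.shutdown();                          // accept no new tasks
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdownNow();                       // no-op if already done
        }
        return observed;
    }
}
```

Results arrive in completion order, not submission order, which is why the shared collection has to be thread safe.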

Cheers,
- Ole

P.S. Dang that was a long email.  If I write one more of these, ban me :)

>
>
> Best regards,
> Gilles
>
>>
>> Cheers,
>> Ole
>>
>

Re: [Math] LeastSquaresOptimizer Design

Gilles Sadowski
Hi.

On Mon, 21 Sep 2015 19:55:15 -0500, Ole Ersoy wrote:

> Hola,
>
> On 09/21/2015 04:15 PM, Gilles wrote:
>> Hi.
>>
>> On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:
>>> On 09/20/2015 05:51 AM, Gilles wrote:
>>>> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>>>>> Wanted to float some ideas for the LeastSquaresOptimizer
>>>>> (Possibly
>>>>> General Optimizer) design.  For example with the
>>>>> LevenbergMarquardtOptimizer we would do:
>>>>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>>>>
>>>>> Rough optimize() outline:
>>>>> public static void optimise() {
>>>>> //perform the optimization
>>>>> //If successful
>>>>>     c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
>>>>> //If not successful
>>>>>
>>>>>
>>>>>
>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
>>>>> diagnostic);
>>>>> //or
>>>>>
>>>>>
>>>>>
>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
>>>>> diagnostic)
>>>>> //etc
>>>>> }
>>>>>
>>>>> The diagnostic, when turned on, will contain a trace of the last
>>>>> N
>>>>> iterations leading up to the failure.  When turned off, the
>>>>> Diagnostic
>>>>> instance only contains the parameters used to detect failure. The
>>>>> diagnostic could be viewed as an indirect way to log optimizer
>>>>> iterations.
>>>>>
>>>>> WDYT?
>>>>
>>>> I'm wary of having several different ways to convey information to
>>>> the
>>>> caller.
>>> It would just be one way.
>>
>> One way for optimizer, one way for solvers, one way for ...
>
> Yes I see what you mean, but I think on a whole it will be worth it
> to add additional sugar code that removes the need for exceptions.

Isn't it always possible to wrap exception-generating code so that upper
layers do not see the exceptions?
The interface would have to know how to handle them and propagate the
information in some other form (callback).
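
Such a wrapper could be sketched as below. ExceptionBridge is a hypothetical name, and the Supplier stands in for any exception-throwing library call; nothing here is an existing Commons Math API.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

final class ExceptionBridge {
    /**
     * Runs the exception-throwing computation and reports the outcome through
     * exactly one of the two callbacks, so that layers above this one never
     * see the exception being thrown.
     */
    static <T> void run(Supplier<T> computation,
                        Consumer<T> onSuccess,
                        Consumer<RuntimeException> onFailure) {
        T result;
        try {
            result = computation.get();
        } catch (RuntimeException e) {
            onFailure.accept(e);          // propagate as data, not as a throw
            return;
        }
        // Called outside the try block: a bug in the success handler is not
        // misreported as a failure of the computation.
        onSuccess.accept(result);
    }
}
```

The library code underneath stays exception-based; only this thin layer decides how failures are presented to its callers.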

>>
>>> But the caller may not be the receiver
>>> (It could be).  The receiver would be an observer attached to the
>>> OptimizationContext that implements an interface allowing it to
>>> observe
>>> the optimization.
>>
>> I'm afraid that it will add to the questions of what to put in the
>> code and how.  [We already had sometimes heated discussions just for
>> the IMHO obvious (e.g. code formatting, documentation,
>> exception...).]
>
> Hehe.  Yes I remember some of these discussions.  I wonder how much
> time was spent debating the exceptions alone?  Surely everyone must
> have had this feeling in pit of their stomach that there's got to be
> a
> better way.  On the exception topic, these are some of the issues:
>
> I18N
> ===================
> If you are new to commons math and thinking about designing a commons
> math compatible exception you should probably understand the I18N
> stuff that's bound to exception (and wonder why it's bound the the
> exception).  Grab a coffee and spend a few hours, unless you are
> obviously fairly new to Java like some ofthe people posting for help.
> In this case when the exception occurs, there is going to be a lot of
> tutoring going on on the users list.

I already said all I had to say about this; it's in the archive.
Summary: I agree that it shouldn't be here.

> Number of Exceptions
> ===================
> Before you do actually design a new exception, you should probably
> see if there is an exception that already fits the category of what
> you are doing.  So you start reading.  Exception1...nop
> Exception2...nop...Exception3...Exception999..But I think I'm getting
> warmer.  OK - Did not find it ... but I'm fairly certain that there
> is
> a elegant place for it somewhere in the exception hierarchy...

On this, I also explained at length my views (assuming that exceptions
are part of the design).
Summary: an exception indicates that something went wrong, and the
caller should not hope to get anything good out of the call that raised
the exception (i.e. he _must_ craft another call that meets the
requirements of the code).

> Handling of Exceptions
> ===================
> If our app uses several of the commons math classes (That throw
> exceptions of the same type), and one of those classes throws an
> exception,what is the app supposed to do?

Cf. previous paragraph.

> I think most developers would find that question somewhat
> challenging.  There are numerous strategies.  Catch all exceptions
> and
> log what happened, etc.  But what if the requirement is that if an
> exception is thrown, the organization that receives it has 0 seconds
> to get to the root cause of it and understand the dynamics. Is this
> doable?  (Yes obviously, but how hard is it...?).

Cf. previous paragraph.
In effect, you describe an upper layer's requirement (handling an
expected "unexpected(!) failure"). IMHO, it's out of CM's realm (CM
raises the exception, end of story).

>>>> It seems that the reporting interfaces could quickly overwhelm
>>>> the "actual" code (one type of context per algorithm).
>>> There would one type of Observer interface per algorithm.  It would
>>> act on the solution and what are currently exceptions, although
>>> these
>>> would be translated into enums.
>>
>> Unless I'm mistaken, the most common use-case for codes implemented
>> in a library such as CM is to provide a correct answer or bail out
>> in a non-equivocal way.
> Most java developers are used to synchronous coding...call the method
> get the response...catch the exception if needed.  This is changing
> with JDK8, and as we evolve and start using lambdas, we become more
> accustomed to the functional callback style of programming.
> Personally I want to be able to use an API that gives me what I need
> when everything works as expected, allows me to resolve unexpected
> issues with minimal effort, and is as simple, fluid, and lightweight
> as possible.

I've not yet used Java 8; I would have if we were allowed to use it in
CM...

However, I'm not convinced that asynchronicity should be dealt with
at the CM level, beyond making its algorithms multi-thread friendly.
IMO, this is the important change (that can make a big difference,
performance-wise, on machines with multiple cores).
Then developers can use the standard tools in "java.util.concurrent"
to select a runtime policy (single/multi-thread and/or (a)synchronous).
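
As a sketch of what "selecting a runtime policy" with the standard tools might look like: the algorithm is an ordinary function, and the caller picks the Executor. RuntimePolicy and SAME_THREAD are hypothetical names; Executor and CompletableFuture are the JDK's.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Supplier;

final class RuntimePolicy {
    /** Runs each task on the calling thread: the synchronous policy. */
    static final Executor SAME_THREAD = Runnable::run;

    /**
     * The caller, not the algorithm, decides where the work runs: pass
     * SAME_THREAD for a blocking call, or a thread pool for a background one.
     */
    static <T> CompletableFuture<T> submit(Supplier<T> algorithm, Executor policy) {
        return CompletableFuture.supplyAsync(algorithm, policy);
    }
}
```

With SAME_THREAD the returned future is already complete when submit returns; with a pool from java.util.concurrent.Executors the same call returns immediately and the result is fetched later with join().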

>> It would make the code more involved to handle a minority of
>> (undefined) cases. [Actual examples would be welcome in order to
>> focus the discussion.]
>
> Rough Outline (I've evolved the concept and moved away from the
> OptimizationContext in the process of writing):
>
> interface LevenbergMarquardtObserver {
>
>     void hola(Solution s);
>     void sugarHoneyIceTea(ResultType rt, Diagnostic d);
> }
>
> public class LMObserver implements LevenbergMarquardtObserver {
>
>    private final Application application;
>
>    public LMObserver(Application application) {
>        this.application = application;
>    }
>
>    public void hola(Solution s) {
>        application.next(s);
>    }
>
>    public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
>        if (rt == ResultType.I_GOT_THIS_ONE) {
>             //I looked at the commons unit tests for this algorithm,
>             //evaluating the diagnostics that show how this failure can occur.
>             //I'm totally fixing this!  Step aside!
>        }
>        else if (rt == ResultType.REALLY_COMPLICATED_STUFF) {
>            //We need our best engineers...call India.
>        }
>    }
> }
>
>
>
> public class Application {
>     public void run() {
>         //Note nothing is returned.
>         LevenbergMarquardtOptimizer.setObserver(new LMObserver(this))
>             .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>             .start();
>     }
>
>     public void next(Solution solution) {
>         //Do cool stuff.
>     }
> }
>
> Or an asynchronous variation:
>
> public class Application {
>     public void run() {
>         //This call will not block because async is true
>         LevenbergMarquardtOptimizer.setAsync(true)
>             .setObserver(new LMObserver(this))
>             .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>             .start();
>
>         //Do more stuff right away.
>     }
>
>     public void next(Solution solution) {
>         //When the thread running the optimization is done, this
>         //method is called back.
>         //Do whatever comes next
>     }
> }
>
> The above would start the optimization in a separate thread that does
> not / SHOULD NOT share data with the main thread.

Cf. previous paragraph: I think that can be done in a layer above CM.

>>>> The current reporting is based on exceptions, and assumes that if no
>>>> exception was thrown, then the user's request completed successfully.
>>> Sure - personally I'd much rather deal with something similar to an
>>> HTTP status code in a callback than an exception.  I think the code
>>> is cleaner and the callback makes it more elegant to apply an adaptive
>>> approach to handling the response, like slightly relaxing constraints,
>>> convergence parameters, etc.  Also by getting rid of the exceptions,
>>> we no longer depend on the I18N layer that they are tied to and now
>>> the messages can be more informative, since they target the root
>>> cause.  The observer can also run in the 'main' thread while the
>>> optimization can run asynchronously.  Also WRT JDK9 and modules,
>>> losing the exceptions would mean one less dependency when the library
>>> is broken up into JDK9 modules...which would be more in line with this
>>> philosophy:
>>> https://github.com/substack/browserify-handbook#module-philosophy
>>
>> I'm not sure I fully understood the philosophy from the text in this
>> short paragraph.
>> But I do not agree with the idea that the possibility to quickly find
>> some code is more important than standards and best practices.
>
> If you go to npmjs.org and type in Neural Network you will get 56
> results all linked to github repositories.
>
> In addition there's meta data indicating number of downloads in the
> last day, last month, etc.  Try typing in cosine.  Odds are you will
> find a package that does just what you want and nothing else.  This is
> very underwhelming and refreshing in terms of cloning off of github
> and getting familiar with tests etc.  Also eye opening.  How many of us
> knew that we could do that much stuff with cosine! :).

I really don't mean to question the quality of any of those implementations,
but the issue is there: How to choose?
That there are so many of them sort of defeats the purpose of "quickly
find what you need".

It seems (?) that the consequence of this modularity (?) is to encourage
the creation of many independent/competing/duplicate projects of small
teams (I'd guess, a 1-person-team, in most cases).

>>>> I totally agree that in some circumstances, more information on the
>>>> inner working of an algorithm would be quite useful.
>>> ... Algorithm iterations become unit testable.
>>>>
>>>> But I don't see the point in devoting resources to reinvent the wheel:
>>> You mean pimping the wheel?  Big pimpin.
>>
>> I think that logging statements are easy to add, not disruptive at all,
>> and come in handy to understand a code's unexpected behaviour.
>> Assuming that a "logging" feature is useful, it can be added *now* using
>> a dependency towards a weight-less (!) framework such as "slf4j".
>> IMO, it would be a waste of time to implement a new communication layer
>> that can do that, and more, if it would be used for logging only in 99%
>> of the cases.
> SLF4J is used by almost every other framework, so why not use it?

Good question: I also asked it quite some time ago.
Didn't get a satisfying answer. Boiled down to "no dependency" policy.

> Logging and the diagnostic could be used together.  The primary
> purpose of the diagnostic though is to collect data that will be
> useful in `sugarHoneyIceTea`.

I'm not sure I understand correctly the purpose: if the "Solution" is
found, do you ever need more "context" (i.e. "Result", "Diagnostics")?

If it is only necessary in case of failure, CM's exception can already
carry context information.  As I wrote above, such an exception could
be caught by a wrapper (not necessarily part of the CM "core") and
translated into whatever the upper layer expects (e.g. "Diagnostics").

>>
>>>>
>>>> I longed several times for the use of a logging library.
>>>> The only show-stopper has been the informal "no-dependency"
>>>> policy...
>>> JDK9 Jigsaw should solve dependency hell, so the less coupling
>>> between commons math classes the better.
>>
>> I wouldn't call "coupling" the dependency towards exception classes:
>> they are little utilities that can make sense in various parts of the
>> library.
>
> If for example the Simplex solver is broken off into its own module,
> then it has to be coupled to the exceptions, unless it is exception
> free.

Why is it a problem to be coupled with a few tiny exception classes?

Then if it is really a problem, we can indeed define "local" exceptions
for each package.

>>
>> [Unless one wants to embark on yet another discussion about exceptions;
>> whether there should be one class for each of the "messages" that exist
>> in "LocalizedFormats"; whether localization should be done in CM;
>> etc.]
>
> I think it would be best to just eliminate the exceptions.

I'd think that most users of CM would deem that dangerous.
An exception is relatively difficult to ignore unknowingly (and was
rightfully a better alternative to the old "check the return value").

>>
>>> Anyways I'm obviously
>>> interested in playing with this stuff, so when I get something up into
>>> a repository I'll do a callback :).
>>
>> If you are interested in big overhauls, there is one that gathered
>> relative consensus: rewrite the algorithms in a "multithread-friendly"
>> way.
> I think that's a tall order that will take us into JDK88 :).

That would be a real pity.
I recall a nit-picking discussion about how to initialize the "FastMath"
class in order to gain a few _milliseconds_. :-/

> But
> using callbacks and making potentially long running computations
> asynchronous could be a middle ground that would allow simple multi
> threaded use without fiddling around under the hood...

Cf. above (this does not need ad-hoc CM code, beyond the relevant classes
implementing "Runnable" and/or "Callable").

>> Some ideas were floated (cf. ML archive) but no implementation or
>> experiment...  Perhaps with a well-defined goal such as performance
>> improvement, your design suggestions will become clearer to more
>> people.
>>
>> AFAIK, only the classes in the "o.a.c.m.neuralnet" package are currently
>> ready to be used with the "java.util.concurrent" framework.
> FWIU Neural Nets are a great fit for concurrency.

Quite true.

But even the optimizers could benefit from just being able to use
more threads: It is often (always?) necessary to evaluate the objective
function "N" times per iteration. So, the computation could be about
   min(N, numCores)
times faster.

> I think for the
> others we will end up having discussions around how users would
> control the number of threads, etc. again that makes some of us
> nervous.

One additional parameter: numCores.

> An asynchronous operation that runs in one separate thread
> is easier to reason about.

Sure.

But then we should stop talking about performance on this list. ;-}

> If we want to test 10 neural net
> configurations, and we have 10 cores, then we can start each by itself
> by doing something like:
>
>
> Network.setAsync(true).addNeurons().connectNeurons().addObserver(observer).start();
> //Now do 10 more
> //If the observer is shared then notifications should be thread safe.

I had a similar argument for not making "FastMath" initialization faster
(at the cost of a lot of additional code):  It was rejected...

Regards,
Gilles


P.S. I think that several issues raised in this thread could warrant opening
     their own thread, to gather more opinions on actual actions to be taken.


> Cheers,
> - Ole
>
> P.S. Dang that was a long email.  If I write one more of these, ban
> me :)

My fault: I should not keep answering! ;-)



Re: [Math] LeastSquaresOptimizer Design

ole ersoy
On 09/22/2015 06:46 AM, Gilles wrote:

> Hi.
>
> On Mon, 21 Sep 2015 19:55:15 -0500, Ole Ersoy wrote:
>> Hola,
>>
>> On 09/21/2015 04:15 PM, Gilles wrote:
>>> Hi.
>>>
>>> On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:
>>>> On 09/20/2015 05:51 AM, Gilles wrote:
>>>>> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>>>>>> Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
>>>>>> General Optimizer) design.  For example with the
>>>>>> LevenbergMarquardtOptimizer we would do:
>>>>>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>>>>>
>>>>>> Rough optimize() outline:
>>>>>> public static void optimise() {
>>>>>> //perform the optimization
>>>>>> //If successful
>>>>>>     c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
>>>>>> //If not successful
>>>>>>
>>>>>>
>>>>>>
>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
>>>>>> diagnostic);
>>>>>> //or
>>>>>>
>>>>>>
>>>>>>
>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
>>>>>> diagnostic)
>>>>>> //etc
>>>>>> }
>>>>>>
>>>>>> The diagnostic, when turned on, will contain a trace of the last N
>>>>>> iterations leading up to the failure.  When turned off, the Diagnostic
>>>>>> instance only contains the parameters used to detect failure. The
>>>>>> diagnostic could be viewed as an indirect way to log optimizer
>>>>>> iterations.
>>>>>>
>>>>>> WDYT?
>>>>>
>>>>> I'm wary of having several different ways to convey information to the
>>>>> caller.
>>>> It would just be one way.
>>>
>>> One way for optimizer, one way for solvers, one way for ...
>>
>> Yes I see what you mean, but I think on a whole it will be worth it
>> to add additional sugar code that removes the need for exceptions.
>
> Isn't it always possible to wrap exception-generating code so that upper
> layers do not see them?
The layer that calls the commons math function will see errors through the callback interface.  They are not exceptions though. The error is encoded as an Enum and is specific to the calling code...not generic across multiple classes.

>
> The interface would have to know how to handle them and propagate the
> information in some other form (callback).
Yes I think now we are saying the same thing.

>
>>>
>>>> But the caller may not be the receiver
>>>> (It could be).  The receiver would be an observer attached to the
>>>> OptimizationContext that implements an interface allowing it to observe
>>>> the optimization.
>>>
>>> I'm afraid that it will add to the questions of what to put in the
>>> code and how.  [We already had sometimes heated discussions just for
>>> the IMHO obvious (e.g. code formatting, documentation, exception...).]
>>
>> Hehe.  Yes I remember some of these discussions.  I wonder how much
>> time was spent debating the exceptions alone?  Surely everyone must
>> have had this feeling in pit of their stomach that there's got to be a
>> better way.  On the exception topic, these are some of the issues:
>>
>> I18N
>> ===================
>> If you are new to commons math and thinking about designing a commons
>> math compatible exception you should probably understand the I18N
>> stuff that's bound to the exception (and wonder why it's bound to the
>> exception).  Grab a coffee and spend a few hours, unless you are
>> obviously fairly new to Java like some of the people posting for help.
>> In this case when the exception occurs, there is going to be a lot of
>> tutoring going on on the users list.
>
> I already said all I had to say about this; it's in the archive.
> Summary: I agree that it shouldn't be here.
>
>> Number of Exceptions
>> ===================
>> Before you do actually design a new exception, you should probably
>> see if there is an exception that already fits the category of what
>> you are doing.  So you start reading.  Exception1...nop
>> Exception2...nop...Exception3...Exception999..But I think I'm getting
>> warmer.  OK - Did not find it ... but I'm fairly certain that there is
>> an elegant place for it somewhere in the exception hierarchy...
>
> On this, I also explained at length my views (assuming that exceptions
> are part of the design).
> Summary: an exception indicates that something went wrong, and the caller
> should not hope to get anything good out of the call that raised the
> exception (i.e. he _must_ craft another call that meets the requirements
> of the code).

And further down it is noted that if the caller wants to deal with the exceptions directly based on the call then the caller can create a wrapper for each commons math function throwing the exception.  So the most elegant way of doing this is probably one wrapper per class.  And the interface for the wrapper is left up to the designer.

Or we could get rid of the exceptions and design a callback interface for each solver / optimizer / etc.  These interfaces should be pretty similar across commons math.  In general they have two methods:

success(solution);
error(Enum.CODE, [Diagnostics])

Enum.CODE would be used for I18N.

The Diagnostics are optional, but I would say that at a minimum they should propagate the same information that exceptions are propagating now.
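
To make the two-method idea a bit more concrete, here is a minimal sketch.  The names `SolverCallback`, `LMError` and `Diagnostics` are made up for illustration (they are not existing CM types), and the tolerance check is only a placeholder, not the real Levenberg-Marquardt criterion.

```java
import java.util.Optional;

public class CallbackSketch {

    /** Hypothetical error codes, one per detectable failure. */
    enum LMError {
        TOO_SMALL_COST_RELATIVE_TOLERANCE,
        TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE
    }

    /** Hypothetical payload carrying what the exception would have carried. */
    static final class Diagnostics {
        final String detail;
        Diagnostics(String detail) { this.detail = detail; }
    }

    /** The two-method callback: one for success, one for failure. */
    interface SolverCallback {
        void success(double[] solution);
        void error(LMError code, Optional<Diagnostics> diagnostics);
    }

    /** Stand-in for an optimizer: reports through the callback instead of throwing. */
    static void optimize(double costRelativeTolerance, SolverCallback cb) {
        if (costRelativeTolerance < 1e-15) { // placeholder check, not the real LM criterion
            cb.error(LMError.TOO_SMALL_COST_RELATIVE_TOLERANCE,
                     Optional.of(new Diagnostics("costRelativeTolerance=" + costRelativeTolerance)));
            return;
        }
        cb.success(new double[] {1.0, 2.0}); // placeholder solution
    }

    /** Runs optimize() and renders whichever callback method fired as text. */
    static String describe(double costRelativeTolerance) {
        final StringBuilder out = new StringBuilder();
        optimize(costRelativeTolerance, new SolverCallback() {
            @Override public void success(double[] solution) {
                out.append("solved, dimension ").append(solution.length);
            }
            @Override public void error(LMError code, Optional<Diagnostics> diagnostics) {
                out.append("failed: ").append(code)
                   .append(diagnostics.map(d -> " (" + d.detail + ")").orElse(""));
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(1e-10));
        System.out.println(describe(1e-20));
    }
}
```

The enum constant is also what an I18N layer could key its messages on, without the optimizer itself knowing anything about localization.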

>> Handling of Exceptions
>> ===================
>> If our app uses several of the commons math classes (That throw
>> exceptions of the same type), and one of those classes throws an
>> exception,what is the app supposed to do?
>
> Cf. previous paragraph.
Ditto.

>
>> I think most developers would find that question somewhat
>> challenging.  There are numerous strategies.  Catch all exceptions and
>> log what happened, etc.  But what if the requirement is that if an
>> exception is thrown, the organization that receives it has 0 seconds
>> to get to the root cause of it and understand the dynamics. Is this
>> doable?  (Yes obviously, but how hard is it...?).
>
> Cf. previous paragraph.
> In effect, you describe an upper layer's requirement (handling an expected
> "unexpected(!) failure"). IMHO, it's out of CM's realm (CM raises the exception,
> end of story).

Or CM detects that it cannot provide a solution and sends the message, coded as an Enum, via the callback interface.

>
>>>>> It seems that the reporting interfaces could quickly overwhelm
>>>>> the "actual" code (one type of context per algorithm).
>>>> There would be one type of Observer interface per algorithm. It would
>>>> act on the solution and what are currently exceptions, although these
>>>> would be translated into enums.
>>>
>>> Unless I'm mistaken, the most common use-case for codes implemented
>>> in a library such as CM is to provide a correct answer or bail out
>>> in a non-equivocal way.
>> Most java developers are used to synchronous coding...call the method
>> get the response...catch the exception if needed.  This is changing
>> with JDK8, and as we evolve and start using lambdas, we become more
>> accustomed to the functional callback style of programming.
>> Personally I want to be able to use an API that gives me what I need
>> when everything works as expected, allows me to resolve unexpected
>> issues with minimal effort, and is as simple, fluid, and lightweight
>> as possible.
>
> I've not yet used Java 8; I would have if we were allowed to use it in
> CM...

I think it's pretty sweet :).  Spring recommends that everyone upgrade.

>
>
> However, I'm not convinced that asynchronicity should be dealt with
> at the CM level, beyond making its algorithms multi-thread friendly.
> IMO, this is the important change (that can make a big difference,
> performance-wise, on machines with multiple cores).
> Then developers can use the standard tools in "java.util.concurrent"
> to select a runtime policy (single/multi-thread and/or (a)synchronous).
That sounds good.  For the asynchronous option the user is probably going to want to be notified via a callback...unless there's another option?

>
>
>>> It would make the code more involved to handle a minority of
>>> (undefined) cases. [Actual examples would be welcome in order to
>>> focus the discussion.]
>>
>> Rough Outline (I've evolved the concept and moved away from the
>> OptimizationContext in the process of writing):
>>
>> interface LevenbergMarquardtObserver {
>>
>>     void hola(Solution s);
>>     void sugarHoneyIceTea(ResultType rt, Diagnostic d);
>> }
>>
>> public class LMObserver implements LevenbergMarquardtObserver {
>>
>>    private final Application application;
>>
>>    public LMObserver(Application application) {
>>        this.application = application;
>>    }
>>
>>    public void hola(Solution s) {
>>        application.next(s);
>>    }
>>
>>    public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
>>        if (rt == ResultType.I_GOT_THIS_ONE) {
>>             //I looked at the commons unit tests for this algorithm,
>>             //evaluating the diagnostics that show how this failure can occur.
>>             //I'm totally fixing this!  Step aside!
>>        }
>>        else if (rt == ResultType.REALLY_COMPLICATED_STUFF) {
>>            //We need our best engineers...call India.
>>        }
>>    }
>> }
>>
>>
>> public class Application {
>>     public void run() {
>>         //Note nothing is returned.
>>         LevenbergMarquardtOptimizer.setObserver(new LMObserver(this))
>>             .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>>             .start();
>>     }
>>
>>     public void next(Solution solution) {
>>         //Do cool stuff.
>>     }
>> }
>>
>> Or an asynchronous variation:
>>
>> public class Application {
>>     public void run() {
>>         //This call will not block because async is true
>>         LevenbergMarquardtOptimizer.setAsync(true)
>>             .setObserver(new LMObserver(this))
>>             .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>>             .start();
>>
>>         //Do more stuff right away.
>>     }
>>
>>     public void next(Solution solution) {
>>         //When the thread running the optimization is done, this
>>         //method is called back.
>>         //Do whatever comes next
>>     }
>> }
>>
>> The above would start the optimization in a separate thread that does
>> not / SHOULD NOT share data with the main thread.
>
> Cf. previous paragraph: I think that can be done in a layer above CM.
But CM will be better if it dumps the exceptions and takes a more direct approach.

>
>>>>> The current reporting is based on exceptions, and assumes that if no
>>>>> exception was thrown, then the user's request completed successfully.
>>>> Sure - personally I'd much rather deal with something similar to an
>>>> HTTP status code in a callback than an exception.  I think the code
>>>> is cleaner and the callback makes it more elegant to apply an adaptive
>>>> approach to handling the response, like slightly relaxing constraints,
>>>> convergence parameters, etc.  Also by getting rid of the exceptions,
>>>> we no longer depend on the I18N layer that they are tied to and now
>>>> the messages can be more informative, since they target the root
>>>> cause.  The observer can also run in the 'main' thread while the
>>>> optimization can run asynchronously.  Also WRT JDK9 and modules,
>>>> losing the exceptions would mean one less dependency when the library
>>>> is broken up into JDK9 modules...which would be more in line with this
>>>> philosophy:
>>>> https://github.com/substack/browserify-handbook#module-philosophy
>>>
>>> I'm not sure I fully understood the philosophy from the text in this
>>> short paragraph.
>>> But I do not agree with the idea that the possibility to quickly find
>>> some code is more important than standards and best practices.
>>
>> If you go to npmjs.org and type in Neural Network you will get 56
>> results all linked to github repositories.
>>
>> In addition there's meta data indicating number of downloads in the
>> last day, last month, etc.  Try typing in cosine.  Odds are you will
>> find a package that does just what you want and nothing else. This is
>> very underwhelming and refreshing in terms of cloning off of github
>> and getting familiar with tests etc.  Also eye opening.  How many of us
>> knew that we could do that much stuff with cosine! :).
>
> I really don't mean to question the quality of any of those implementations,
> but the issue is there: How to choose?
For math ATM it's more obscure.  For other libraries like superagent:
https://www.npmjs.com/package/superagent

With 30K downloads in the last day, it's much easier to be assured that it's the way to go.  It's also a fantastic way to learn REST.

>
> That there are so many of them sort of defeats the purpose of "quickly
> find what you need".
Yes - NodeJS is relatively new, so some of the packages have not had much time to 'simmer'.

>
> It seems (?) that the consequence of this modularity (?) is to encourage
> the creation of many independent/competing/duplicate projects of small
> teams (I'd guess, a 1-person-team, in most cases).
Well if you look at Superagent for example, the team size is significant.  But it's not so much the team size.  It's what superagent does.  It does one thing, and does it really well.

>
>>>>> I totally agree that in some circumstances, more information on the
>>>>> inner working of an algorithm would be quite useful.
>>>> ... Algorithm iterations become unit testable.
>>>>>
>>>>> But I don't see the point in devoting resources to reinvent the wheel:
>>>> You mean pimping the wheel?  Big pimpin.
>>>
>>> I think that logging statements are easy to add, not disruptive at all,
>>> and come in handy to understand a code's unexpected behaviour.
>>> Assuming that a "logging" feature is useful, it can be added *now* using
>>> a dependency towards a weight-less (!) framework such as "slf4j".
>>> IMO, it would be a waste of time to implement a new communication layer
>>> that can do that, and more, if it would be used for logging only in 99%
>>> of the cases.
>> SLF4J is used by almost every other framework, so why not use it?
>
> Good question: I also asked it quite some time ago.
> Didn't get a satisfying answer. Boiled down to "no dependency" policy.
I think it would be worth examining breaking up CM into JDK9 modules and then utilizing SLF4J when it adds value.

>
>> Logging and the diagnostic could be used together.  The primary
>> purpose of the diagnostic though is to collect data that will be
>> useful in `sugarHoneyIceTea`.
>
> I'm not sure I understand correctly the purpose: if the "Solution" is
> found, do you ever need more "context" (i.e. "Result", "Diagnostics")?

If the solution is found, then the method calls cb.notify(solution).  The diagnostic is never created.
If the solution is not found, then the method calls cb.error(ErrorCodeEnum.THE_CODE_WITH_REALLY_GOOD_JAVADOC_DESCRIBING_THE_ERROR, Diagnostic);

interface Callback {
    void notify(Solution solution);
    void error(ErrorCodeEnum code, Diagnostic diagnostic);
}

The big difference here between exceptions and an ErrorCodeEnum is that the ErrorCodeEnum is tied to the root of what happened.  For example if on line 335 the code detects that it will not converge, then an Enum code is created for this specific case, unit tested for this specific case, and documented for this specific case.

Suppose that something similar occurs at three different points in the code.  The code could throw the same exception three times, for three different reasons, but it's still the same exception.  So if you are the API user - which would you rather have?  Codes that tell you precisely what happened, or digging through the exceptions so that you can derive what happened in your wrapper code?

>
> If it is only necessary in case of failure, CM's exception can already
> carry context information.  As I wrote above, such an exception could
> be caught by a wrapper (not necessarily part of the CM "core") and
> translated into whatever the upper layer expects (e.g. "Diagnostics").

I think CM needs to examine the amount of overhead that exceptions cause with respect to developers and API users.
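
For comparison, the wrapper approach from the quoted paragraph could look roughly like the sketch below.  Everything here is hypothetical: `solve` stands in for a library call that throws, and `Failure` / `Callback` are illustrative names, not CM types.  Note that the wrapper can only distinguish failures as finely as the exception types (or messages) allow, which is the limitation I'm getting at.

```java
public class ExceptionWrapperSketch {

    /** Hypothetical codes the upper layer wants to see. */
    enum Failure { DID_NOT_CONVERGE, INVALID_INPUT }

    interface Callback {
        void notifySolution(double solution);
        void error(Failure code, String diagnostic);
    }

    /** Hypothetical library call that reports failure by throwing. */
    static double solve(double start) {
        if (Double.isNaN(start)) {
            throw new IllegalArgumentException("start is NaN");
        }
        if (start < 0) {
            throw new IllegalStateException("no convergence from " + start);
        }
        return Math.sqrt(start);
    }

    /** Wrapper layer: catches the exceptions and maps each type to a code. */
    static void solveWithCallback(double start, Callback cb) {
        double result;
        try {
            result = solve(start);
        } catch (IllegalArgumentException e) {
            cb.error(Failure.INVALID_INPUT, e.getMessage());
            return;
        } catch (IllegalStateException e) {
            cb.error(Failure.DID_NOT_CONVERGE, e.getMessage());
            return;
        }
        // Called outside the try block so exceptions from the callback are not swallowed.
        cb.notifySolution(result);
    }

    /** Renders the outcome as text, for demonstration. */
    static String describe(double start) {
        final StringBuilder out = new StringBuilder();
        solveWithCallback(start, new Callback() {
            @Override public void notifySolution(double solution) {
                out.append("solution ").append(solution);
            }
            @Override public void error(Failure code, String diagnostic) {
                out.append(code).append(": ").append(diagnostic);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(4.0));
        System.out.println(describe(-1.0));
        System.out.println(describe(Double.NaN));
    }
}
```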

>
>>>
>>>>>
>>>>> I longed several times for the use of a logging library.
>>>>> The only show-stopper has been the informal "no-dependency" policy...
>>>> JDK9 Jigsaw should solve dependency hell, so the less coupling
>>>> between commons math classes the better.
>>>
>>> I wouldn't call "coupling" the dependency towards exception classes:
>>> they are little utilities that can make sense in various parts of the
>>> library.
>>
>> If for example the Simplex solver is broken off into its own module,
>> then it has to be coupled to the exceptions, unless it is exception
>> free.
>
> Why is it a problem to be coupled with a few tiny exception classes?
If it's necessary then it's necessary.  But put an exception free design next to the one with exceptions and look at both through the lens of:
- Which is best for developer productivity?
- Which is best in terms of API user productivity?

> Then if it is really a problem, we can indeed define "local" exceptions
> for each package.
Or not.

>
>>>
>>> [Unless one wants to embark on yet another discussion about exceptions;
>>> whether there should be one class for each of the "messages" that exist
>>> in "LocalizedFormats"; whether localization should be done in CM;
>>> etc.]
>>
>> I think it would be best to just eliminate the exceptions.
>
> I'd think that most users of CM would deem that dangerous.
Or refreshing once they see the alternative.

> An exception is relatively difficult to ignore unknowingly (and was
> rightfully a better alternative to the old "check the return value").

This is another issue.  Runtime exceptions can be ignored.

>
>>>
>>>> Anyways I'm obviously
>>>> interested in playing with this stuff, so when I get something up into
>>>> a repository I'll do a callback :).
>>>
>>> If you are interested in big overhauls, there is one that gathered
>>> relative consensus: rewrite the algorithms in a "multithread-friendly"
>>> way.
>> I think that's a tall order that will take us into JDK88 :).
>
> That would be a real pity.
> I recall a nit-picking discussion about how to initialize the "FastMath"
> class in order to gain a few _milliseconds_. :-/
>
>> But
>> using callbacks and making potentially long running computations
>> asynchronous could be a middle ground that would allow simple multi
>> threaded use without fiddling around under the hood...
>
> Cf. above (this does not need ad-hoc CM code, beyond the relevant classes
> implementing "Runnable" and/or "Callable").
That's what I like about it.  It's simple.  So if CM algorithms get an `async` option, combined with a callback interface, then the implementation is pretty straightforward.
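
As a rough illustration of how little is needed once the work is packaged as a `Callable`: the sketch below submits a task to a standard `ExecutorService` and collects the result through a `Future`.  `OptimizationTask` is a made-up stand-in (a Newton iteration for a square root, assuming a positive target), not a CM class.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncSketch {

    /** Hypothetical stand-in for a long-running optimization (target must be > 0). */
    static final class OptimizationTask implements Callable<Double> {
        private final double target;
        OptimizationTask(double target) { this.target = target; }

        @Override public Double call() {
            // Newton iteration for sqrt(target), used only as a placeholder workload.
            double x = target;
            for (int i = 0; i < 50; i++) {
                x = 0.5 * (x + target / x);
            }
            return x;
        }
    }

    /** Runs the task on a separate thread and waits for its result. */
    static double runAsync(double target) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Double> pending = executor.submit(new OptimizationTask(target));
            // The calling thread is free to do other work here.
            return pending.get(); // blocks until the task has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAsync(9.0));
    }
}
```

Whether the `async` switch lives inside CM or in the caller's code, the moving parts are the same; the callback would simply be invoked at the point where `pending.get()` returns.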

>
>>> Some ideas were floated (cf. ML archive) but no implementation or
>>> experiment...  Perhaps with a well-defined goal such as performance
>>> improvement, your design suggestions will become clearer to more people.
>>>
>>> AFAIK, only the classes in the "o.a.c.m.neuralnet" package are currently
>>> ready to be used with the "java.util.concurrent" framework.
>> FWIU Neural Nets are a great fit for concurrency.
>
> Quite true.
>
> But even the optimizers could benefit from just being able to use
> more threads: It is often (always?) necessary to evaluate the objective
> function "N" times per iteration. So, the computation could be about
>   min(N, numCores)
> times faster.

JDK8 streams make this really simple.  The part that is not so simple is specifying the number of threads or perhaps a percentage of the number of cores.  If we could do:

Solve.optimize(problem, config.numberOfThreads);
Solve.optimize(problem, targetCoreUtilization);

I think more of us would find JDK8 and streams attractive.
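
Parallel streams run on the common ForkJoinPool by default, and its parallelism is a JVM-wide setting rather than a per-call parameter, which is why the thread count is the awkward part.  A sketch of the explicit alternative, using a fixed-size pool from `java.util.concurrent`, is below.  `evaluateAll` and its parameters are hypothetical names; the objective function is whatever the caller passes in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.DoubleUnaryOperator;

public class ParallelEvaluationSketch {

    /** Evaluates the objective at every point using at most numberOfThreads threads. */
    static double[] evaluateAll(DoubleUnaryOperator objective, double[] points, int numberOfThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(numberOfThreads);
        try {
            List<Callable<Double>> tasks = new ArrayList<>();
            for (double p : points) {
                tasks.add(() -> objective.applyAsDouble(p));
            }
            // invokeAll returns the futures in the same order as the task list.
            List<Future<Double>> futures = pool.invokeAll(tasks);
            double[] values = new double[points.length];
            for (int i = 0; i < values.length; i++) {
                values[i] = futures.get(i).get();
            }
            return values;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        double[] values = evaluateAll(x -> x * x, new double[] {1.0, 2.0, 3.0}, 2);
        for (double v : values) {
            System.out.println(v);
        }
    }
}
```

This only pays off if the objective function is thread safe and expensive enough to outweigh the scheduling overhead.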


>
>> I think for the
>> others we will end up having discussions around how users would
>> control the number of threads, etc. again that makes some of us
>> nervous.
>
> One additional parameter: numCores.
>
>> An asynchronous operation that runs in one separate thread
>> is easier to reason about.
>
> Sure.
>
> But then we should stop talking about performance on this list. ;-}
Well - code that is single threaded is easier to reason about. There are two different ways to utilize the cores.  Create 10 solvers solving 10 problems using asynchronous invocations, or run the 10 problems synchronously using a concurrent algorithm.

If there's only one problem to solve, then obviously we need the concurrency, but the asynchronous option is the low hanging fruit.
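
The "many independent problems" case can be sketched with `CompletableFuture`.  The `solve` method is a placeholder for a single-threaded solver, and the shared queue plays the role of a thread-safe observer; both are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

public class IndependentSolvesSketch {

    /** Placeholder for a single-threaded solver working on one problem. */
    static double solve(int problemId) {
        return problemId * 2.0;
    }

    /** Starts one asynchronous solve per problem and gathers the results. */
    static List<Double> solveAll(int count) {
        // Shared observer; a concurrent queue keeps notifications thread safe.
        ConcurrentLinkedQueue<Double> observed = new ConcurrentLinkedQueue<>();
        List<CompletableFuture<Void>> running = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            final int id = i;
            running.add(CompletableFuture.supplyAsync(() -> solve(id))
                                         .thenAccept(observed::add));
        }
        // Wait for every solve (and its notification) to complete.
        CompletableFuture.allOf(running.toArray(new CompletableFuture[0])).join();
        List<Double> results = new ArrayList<>(observed);
        Collections.sort(results); // completion order is not deterministic
        return results;
    }

    public static void main(String[] args) {
        System.out.println(solveAll(10));
    }
}
```

Each solver stays single threaded and easy to reason about; only the observer has to cope with concurrent notifications.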

>
>> If we want to test 10 neural net
>> configurations, and we have 10 cores, then we can start each by itself
>> by doing something like:
>>
>>
>> Network.setAsync(true).addNeurons().connectNeurons().addObserver(observer).start();
>> //Now do 10 more
>> //If the observer is shared then notifications should be thread safe.
>
> I had a similar argument for not making "FastMath" initialization faster
> (at the cost of a lot of additional code):  It was rejected...
Perhaps if FastMath was more modular, the pill would have been easier to swallow?

Cheers,
- Ole

>
> Regards,
> Gilles
>
>
> P.S. I think that several issues evoked in this thread could warrant opening
>      their own thread, to gather more opinions on actual actions to be taken.
I think that would be good.  Some of these topics could definitely use a different subject heading.
- Exception removal / Enum coding of errors
- I18N messages corresponding to Enums
- Callbacks
- Asynchronous option
- JDK9 Modularity
- JDK8 Stream API (At least if streams are used, then CM is one step closer to simple concurrency).

There were a few more cans that I did not open yet.  One is wrapping method arguments in a self validating context, eliminating the need for CM method calls to check and throw exceptions for things like nulls, etc.

Instead the client would construct the context with all parameters. The code would call:

context.valid(CB);

If the context is invalid, then CB.notify(ErrorEnum code) is called.  This removes exceptions due to invalid arguments.
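
A minimal sketch of such a self-validating context follows.  The fields, the `ArgumentError` codes and the `check` helper are all invented for the example; a real context would hold whatever the algorithm needs.

```java
public class ValidatingContextSketch {

    /** Hypothetical codes for invalid arguments. */
    enum ArgumentError { NULL_START_POINT, NON_POSITIVE_MAX_ITERATIONS }

    interface Callback {
        void notify(ArgumentError code);
    }

    /** Holds all parameters and knows how to validate them. */
    static final class Context {
        final double[] startPoint;
        final int maxIterations;

        Context(double[] startPoint, int maxIterations) {
            this.startPoint = startPoint;
            this.maxIterations = maxIterations;
        }

        /** Returns true if valid; otherwise reports the first problem found and returns false. */
        boolean valid(Callback cb) {
            if (startPoint == null) {
                cb.notify(ArgumentError.NULL_START_POINT);
                return false;
            }
            if (maxIterations <= 0) {
                cb.notify(ArgumentError.NON_POSITIVE_MAX_ITERATIONS);
                return false;
            }
            return true;
        }
    }

    /** Returns "OK" or the name of the error code reported by the context. */
    static String check(double[] startPoint, int maxIterations) {
        final StringBuilder out = new StringBuilder("OK");
        new Context(startPoint, maxIterations).valid(code -> {
            out.setLength(0);
            out.append(code);
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(check(new double[] {1.0}, 100));
        System.out.println(check(null, 100));
    }
}
```

The algorithm would call `context.valid(cb)` first and simply return if it is false, so no argument-checking exceptions are needed inside the algorithm itself.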

>
>
>> Cheers,
>> - Ole
>>
>> P.S. Dang that was a long email.  If I write one more of these, ban me :)
>
> My fault: I should not keep answering! ;-)
I'm banning myself :)


Re: [Math] LeastSquaresOptimizer Design

ole ersoy
In reply to this post by Gilles Sadowski
One more thing - this is separate from the other stuff.  The LMOptimizer has several configuration properties, with corresponding getters and a corresponding with() API.  It would be good if these lived in their own class that used Lombok (https://projectlombok.org/) to generate (as byte code) the fluent API and the getters.  That would eliminate an estimated 40% of the source code.  Lombok annotations, such as @NonNull, could be used to generate null checks in the configuration instance.  This way CM does not have to check for nulls anymore: the NPE will be thrown when the configuration instance is constructed.

This is one reason I was thinking it might be useful to have a context holding things like:
- configuration
- callback
- problem

But I think it's fine if these are passed in separately as well. It's cleaner to call callback.notify() than context.callback.notify().
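
To make the idea concrete, here is roughly what such a configuration class and a null-checking context look like when written by hand in plain Java. The names Config and Context and the two tolerance fields are assumptions for illustration only; with Lombok, annotations such as @Getter and @NonNull would generate most of this boilerplate instead.

```java
import java.util.Objects;

public class OptimizerConfigSketch {

    /** Hypothetical immutable configuration with hand-written getters and with-ers. */
    static final class Config {
        private final double costRelativeTolerance;
        private final double parRelativeTolerance;

        Config(double costRelativeTolerance, double parRelativeTolerance) {
            this.costRelativeTolerance = costRelativeTolerance;
            this.parRelativeTolerance = parRelativeTolerance;
        }

        double getCostRelativeTolerance() { return costRelativeTolerance; }

        double getParRelativeTolerance() { return parRelativeTolerance; }

        /** Each with-er returns a modified copy and leaves this instance untouched. */
        Config withCostRelativeTolerance(double value) {
            return new Config(value, parRelativeTolerance);
        }

        Config withParRelativeTolerance(double value) {
            return new Config(costRelativeTolerance, value);
        }
    }

    /** Hypothetical holder; null arguments fail here, at construction time. */
    static final class Context {
        final Config configuration;
        final Object problem;

        Context(Config configuration, Object problem) {
            this.configuration = Objects.requireNonNull(configuration, "configuration");
            this.problem = Objects.requireNonNull(problem, "problem");
        }
    }

    /** Returns true if constructing a context without a configuration throws an NPE. */
    static boolean rejectsNullConfiguration() {
        try {
            new Context(null, new Object());
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Config base = new Config(1e-10, 1e-10);
        Config relaxed = base.withCostRelativeTolerance(1e-6);
        System.out.println(relaxed.getCostRelativeTolerance());
        System.out.println(rejectsNullConfiguration());
    }
}
```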

Cheers,
- Ole


On 09/22/2015 06:46 AM, Gilles wrote:

> Hi.
>
> On Mon, 21 Sep 2015 19:55:15 -0500, Ole Ersoy wrote:
>> Hola,
>>
>> On 09/21/2015 04:15 PM, Gilles wrote:
>>> Hi.
>>>
>>> On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:
>>>> On 09/20/2015 05:51 AM, Gilles wrote:
>>>>> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>>>>>> Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
>>>>>> General Optimizer) design.  For example with the
>>>>>> LevenbergMarquardtOptimizer we would do:
>>>>>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>>>>>
>>>>>> Rough optimize() outline:
>>>>>> public static void optimise() {
>>>>>> //perform the optimization
>>>>>> //If successful
>>>>>>     c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
>>>>>> //If not successful
>>>>>>
>>>>>>
>>>>>>
>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
>>>>>> diagnostic);
>>>>>> //or
>>>>>>
>>>>>>
>>>>>>
>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
>>>>>> diagnostic)
>>>>>> //etc
>>>>>> }
>>>>>>
>>>>>> The diagnostic, when turned on, will contain a trace of the last N
>>>>>> iterations leading up to the failure.  When turned off, the Diagnostic
>>>>>> instance only contains the parameters used to detect failure. The
>>>>>> diagnostic could be viewed as an indirect way to log optimizer
>>>>>> iterations.
>>>>>>
>>>>>> WDYT?
>>>>>
>>>>> I'm wary of having several different ways to convey information to the
>>>>> caller.
>>>> It would just be one way.
>>>
>>> One way for optimizer, one way for solvers, one way for ...
>>
>> Yes I see what you mean, but I think on a whole it will be worth it
>> to add additional sugar code that removes the need for exceptions.
>
> Isn't it always possible to wrap exception-generating code so that upper
> layers do not see them?
> The interface would have to know how to handle them and propagate the
> information in some other form (callback).
>
>>>
>>>> But the caller may not be the receiver
>>>> (It could be).  The receiver would be an observer attached to the
>>>> OptimizationContext that implements an interface allowing it to observe
>>>> the optimization.
>>>
>>> I'm afraid that it will add to the questions of what to put in the
>>> code and how.  [We already had sometimes heated discussions just for
>>> the IMHO obvious (e.g. code formatting, documentation, exception...).]
>>
>> Hehe.  Yes I remember some of these discussions.  I wonder how much
>> time was spent debating the exceptions alone?  Surely everyone must
>> have had this feeling in the pit of their stomach that there's got to be a
>> better way.  On the exception topic, these are some of the issues:
>>
>> I18N
>> ===================
>> If you are new to commons math and thinking about designing a commons
>> math compatible exception you should probably understand the I18N
>> stuff that's bound to the exception (and wonder why it's bound to the
>> exception).  Grab a coffee and spend a few hours, unless you are
>> obviously fairly new to Java like some of the people posting for help.
>> In this case when the exception occurs, there is going to be a lot of
>> tutoring going on on the users list.
>
> I already said all I had to say about this; it's in the archive.
> Summary: I agree that it shouldn't be here.
>
>> Number of Exceptions
>> ===================
>> Before you do actually design a new exception, you should probably
>> see if there is an exception that already fits the category of what
>> you are doing.  So you start reading.  Exception1...nop
>> Exception2...nop...Exception3...Exception999..But I think I'm getting
>> warmer.  OK - Did not find it ... but I'm fairly certain that there is
>> an elegant place for it somewhere in the exception hierarchy...
>
> On this, I also explained at length my views (assuming that exceptions
> are part of the design).
> Summary: an exception indicates that something went wrong, and the caller
> should not hope to get anything good out of the call that raised the
> exception (i.e. he _must_ craft another call that meets the requirements
> of the code).
>
>> Handling of Exceptions
>> ===================
>> If our app uses several of the commons math classes (That throw
>> exceptions of the same type), and one of those classes throws an
>> exception, what is the app supposed to do?
>
> Cf. previous paragraph.
>
>> I think most developers would find that question somewhat
>> challenging.  There are numerous strategies.  Catch all exceptions and
>> log what happened, etc.  But what if the requirement is that if an
>> exception is thrown, the organization that receives it has 0 seconds
>> to get to the root cause of it and understand the dynamics. Is this
>> doable?  (Yes obviously, but how hard is it...?).
>
> Cf. previous paragraph.
> In effect, you describe an upper layer's requirement (handling an expected
> "unexpected(!) failure"). IMHO, it's out of CM's realm (CM raises the exception,
> end of story).
>
>>>>> It seems that the reporting interfaces could quickly overwhelm
>>>>> the "actual" code (one type of context per algorithm).
>>>> There would be one type of Observer interface per algorithm. It would
>>>> act on the solution and what are currently exceptions, although these
>>>> would be translated into enums.
>>>
>>> Unless I'm mistaken, the most common use-case for codes implemented
>>> in a library such as CM is to provide a correct answer or bail out
>>> in a non-equivocal way.
>> Most java developers are used to synchronous coding...call the method
>> get the response...catch the exception if needed.  This is changing
>> with JDK8, and as we evolve and start using lambdas, we become more
>> accustomed to the functional callback style of programming.
>> Personally I want to be able to use an API that gives me what I need
>> when everything works as expected, allows me to resolve unexpected
>> issues with minimal effort, and is as simple, fluid, and lightweight
>> as possible.
>
> I've not yet used Java 8; I would have if we were allowed to use it in
> CM...
>
> However, I'm not convinced that asynchronicity should be dealt with
> at the CM level, beyond making its algorithms multi-thread friendly.
> IMO, this is the important change (that can make a big difference,
> performance-wise, on machines with multiple cores).
> Then developers can use the standard tools in "java.util.concurrent"
> to select a runtime policy (single/multi-thread and/or (a)synchronous).
>
>>> It would make the code more involved to handle a minority of
>>> (undefined) cases. [Actual examples would be welcome in order to
>>> focus the discussion.]
>>
>> Rough Outline (I've evolved the concept and moved away from the
>> OptimizationContext in the process of writing):
>>
>> interface LevenbergMarquardtObserver {
>>
>>     public void hola(Solution s);
>>     public void sugarHoneyIceTea(ResultType rt, Diagnostic d);
>> }
>>
>> public class LMObserver implements LevenbergMarquardtObserver {
>>
>>    private Application application;
>>
>>    public LMObserver(Application application) {
>>        this.application = application;
>>    }
>>
>>    public void hola(Solution s) {
>>        application.next(s);
>>    }
>>
>>    public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
>>        if (rt == ResultType.I_GOT_THIS_ONE) {
>>             //I looked at the commons unit tests for this algorithm,
>>             //evaluating the diagnostics that show how this failure can occur.
>>             //I'm totally fixing this!  Step aside!
>>        }
>>        else if (rt == ResultType.REALLY_COMPLICATED_STUFF)
>>        {
>>            //We need our best engineers...call India.
>>        }
>>    }
>> }
>>
>>
>> public class Application {
>>     //Note nothing is returned.
>>     LevenbergMarquardtOptimizer.setObserver(new LMObserver(this))
>>         .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>>         .start();
>>
>>     public void next(Solution solution) {
>>
>>         //Do cool stuff.
>>
>>     }
>> }
>>
>> Or an asynchronous variation:
>>
>> public class Application {
>> //This call will not block because async is true
>>     LevenbergMarquardtOptimizer.setAsync(true)
>>         .setObserver(new LMObserver(this))
>>         .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>>         .start();
>>
>>     //Do more stuff right away.
>>
>>     public void next(Solution solution) {
>>         //When the thread running the optimization is done,
>>         //this method is called back.
>>         //Do whatever comes next
>>     }
>> }
>>
>> The above would start the optimization in a separate thread that does
>> not / SHOULD NOT share data with the main thread.
>
> Cf. previous paragraph: I think that can be done in a layer above CM.
>
>>>>> The current reporting is based on exceptions, and assumes that if no
>>>>> exception was thrown, then the user's request completed successfully.
>>>> Sure - personally I'd much rather deal with something similar to an
>>>> HTTP status code in a callback, than an exception .  I think the code
>>>> is cleaner and the callback makes it more elegant to apply an adaptive
>>>> approach to handling the response, like slightly relaxing constraints,
>>>> convergence parameters, etc.  Also by getting rid of the exceptions,
>>>> we no longer depend on the I18N layer that they are tied to and now
>>>> the messages can be more informative, since they target the root
>>>> cause.  The observer can also run in the 'main' thread while the
>>>> optimization can run asynchronously.  Also WRT JDK9 and modules,
>>>> losing the exceptions would mean one less dependency when the library
>>>> is split up into JDK9 modules...which would be more in line with this
>>>> philosophy:
>>>> https://github.com/substack/browserify-handbook#module-philosophy
>>>
>>> I'm not sure I fully understood the philosophy from the text in this
>>> short paragraph.
>>> But I do not agree with the idea that the possibility to quickly find
>>> some code is more important than standards and best practices.
>>
>> If you go to npmjs.org and type in Neural Network you will get 56
>> results all linked to github repositories.
>>
>> In addition there's meta data indicating number of downloads in the
>> last day, last month, etc.  Try typing in cosine.  Odds are you will
>> find a package that does just what you want and nothing else. This is
>> very underwhelming and refreshing in terms of cloning off of github
>> and getting familiar with tests etc.  Also eye opening.  How many of us
>> knew that we could do that much stuff with cosine! :).
>
> I really don't mean to question the quality of any of those implementations,
> but the issue is there: How to choose?
> That there are so many of them sort of defeats the purpose of "quickly
> find what you need".
>
> It seems (?) that the consequence of this modularity (?) is to encourage
> the creation of many independent/competing/duplicate projects of small
> teams (I'd guess, a 1-person-team, in most cases).
>
>>>>> I totally agree that in some circumstances, more information on the
>>>>> inner working of an algorithm would be quite useful.
>>>> ... Algorithm iterations become unit testable.
>>>>>
>>>>> But I don't see the point in devoting resources to reinvent the wheel:
>>>> You mean pimping the wheel?  Big pimpin.
>>>
>>> I think that logging statements are easy to add, not disruptive at all,
>>> and come in handy to understand a code's unexpected behaviour.
>>> Assuming that a "logging" feature is useful, it can be added *now* using
>>> a dependency towards a weight-less (!) framework such as "slf4j".
>>> IMO, it would be a waste of time to implement a new communication layer
>>> that can do that, and more, if it would be used for logging only in 99%
>>> of the cases.
>> SLF4J is used by almost every other framework, so why not use it?
>
> Good question: I also asked it quite some time ago.
> Didn't get a satisfying answer. Boiled down to "no dependency" policy.
>
>> Logging and the diagnostic could be used together.  The primary
>> purpose of the diagnostic though is to collect data that will be
>> useful in `sugarHoneyIceTea`.
>
> I'm not sure I understand correctly the purpose: if the "Solution" is
> found, do you ever need more "context" (i.e. "Result", "Diagnostics")?
>
> If it is only necessary in case of failure, CM's exception can already
> carry context information.  As I wrote above, such an exception could
> be caught by a wrapper (not necessarily part of the CM "core") and
> translated into whatever the upper layer expect (e.g. "Diagnostics").
>
>>>
>>>>>
>>>>> I longed several times for the use of a logging library.
>>>>> The only show-stopper has been the informal "no-dependency" policy...
>>>> JDK9 Jigsaw should solve dependency hell, so the less coupling
>>>> between commons math classes the better.
>>>
>>> I wouldn't call "coupling" the dependency towards exception classes:
>>> they are little utilities that can make sense in various parts of the
>>> library.
>>
>> If for example the Simplex solver is broken off into its own module,
>> then it has to be coupled to the exceptions, unless it is exception
>> free.
>
> Why is it a problem to be coupled with a few tiny exception classes?
>
> Then if it is really a problem, we can indeed define "local" exceptions
> for each package.
>
>>>
>>> [Unless one wants to embark on yet another discussion about exceptions;
>>> whether there should be one class for each of the "messages" that exist
>>> in "LocalizedFormats"; whether localization should be done in CM;
>>> etc.]
>>
>> I think it would be best to just eliminate the exceptions.
>
> I'd think that most users of CM should deem that dangerous.
> An exception is relatively difficult to ignore unknowingly (and was
> rightfully a better alternative than the old "check the return value").
>
>>>
>>>> Anyways I'm obviously
>>>> interested in playing with this stuff, so when I get something up into
>>>> a repository I'll do a callback :).
>>>
>>> If you are interested in big overhauls, there is one that gathered
>>> relative consensus: rewrite the algorithms in a "multithread-friendly"
>>> way.
>> I think that's a tall order that will take us into JDK88 :).
>
> That would be a real pity.
> I recall a nit-picking discussion about how to initialize the "FastMath"
> class in order to gain a few _milliseconds_. :-/
>
>> But
>> using callbacks and making potentially long running computations
>> asynchronous could be a middle ground that would allow simple multi
>> threaded use without fiddling around under the hood...
>
> Cf. above (this does not need ad-hoc CM code, beyond the relevant classes
> implementing "Runnable" and/or "Callable").
>
>>> Some ideas were floated (cf. ML archive) but no implementation or
>>> experiment...  Perhaps with a well-defined goal such as performance
>>> improvement, your design suggestions will become clearer to more people.
>>>
>>> AFAIK, only the classes in the "o.a.c.m.neuralnet" package are currently
>>> ready to be used with the "java.util.concurrent" framework.
>> FWIU Neural Nets are a great fit for concurrency.
>
> Quite true.
>
> But even the optimizers could benefit from just being able to use
> more threads: It is often (always?) necessary to evaluate the objective
> function "N" times per iteration. So, the computation could be about
>   min(N, numCores)
> times faster.
>
>> I think for the
>> others we will end up having discussions around how users would
>> control the number of threads, etc. again that makes some of us
>> nervous.
>
> One additional parameter: numCores.
>
>> An asynchronous operation that runs in one separate thread
>> is easier to reason about.
>
> Sure.
>
> But then we should stop talking about performance on this list. ;-}
>
>> If we want to test 10 neural net
>> configurations, and we have 10 cores, then we can start each by itself
>> by doing something like:
>>
>>
>> Network.setAsync(true).addNeurons().connectNeurons().addObserver(observer).start();
>> //Now do 10 more
>> //If the observer is shared then notifications should be thread safe.
>
> I had a similar argument for not making "FastMath" initialization faster
> (at the cost of a lot of additional code):  It was rejected...
>
> Regards,
> Gilles
>
>
> P.S. I think that several issues evoked in this thread could warrant opening
>      their own thread, to gather more opinions on actual actions to be taken.
>
>
>> Cheers,
>> - Ole
>>
>> P.S. Dang that was a long email.  If I write one more of these, ban me :)
>
> My fault: I should not keep answering! ;-)
>

Re: [Math] LeastSquaresOptimizer Design

Luc Maisonobe-2
In reply to this post by ole ersoy
Hi,

On 2015-09-22 02:55, Ole Ersoy wrote:

> Hola,
>
> On 09/21/2015 04:15 PM, Gilles wrote:
>> Hi.
>>
>> On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:
>>> On 09/20/2015 05:51 AM, Gilles wrote:
>>>> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>>>>> Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
>>>>> General Optimizer) design.  For example with the
>>>>> LevenbergMarquardtOptimizer we would do:
>>>>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>>>>
>>>>> Rough optimize() outline:
>>>>> public static void optimise() {
>>>>> //perform the optimization
>>>>> //If successful
>>>>>     c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
>>>>> //If not successful
>>>>>
>>>>>
>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
>>>>> diagnostic);
>>>>> //or
>>>>>
>>>>>
>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
>>>>> diagnostic)
>>>>> //etc
>>>>> }
>>>>>
>>>>> The diagnostic, when turned on, will contain a trace of the last N
>>>>> iterations leading up to the failure.  When turned off, the
>>>>> Diagnostic
>>>>> instance only contains the parameters used to detect failure. The
>>>>> diagnostic could be viewed as an indirect way to log optimizer
>>>>> iterations.
>>>>>
>>>>> WDYT?
>>>>
>>>> I'm wary of having several different ways to convey information to
>>>> the
>>>> caller.
>>> It would just be one way.
>>
>> One way for optimizer, one way for solvers, one way for ...
>
> Yes I see what you mean, but I think on a whole it will be worth it to
> add additional sugar code that removes the need for exceptions.
>
>>
>>> But the caller may not be the receiver
>>> (It could be).  The receiver would be an observer attached to the
>>> OptimizationContext that implements an interface allowing it to
>>> observe
>>> the optimization.
>>
>> I'm afraid that it will add to the questions of what to put in the
>> code and how.  [We already had sometimes heated discussions just for
>> the IMHO obvious (e.g. code formatting, documentation, exception...).]
>
> Hehe.  Yes I remember some of these discussions.  I wonder how much
> time was spent debating the exceptions alone?  Surely everyone must
> have had this feeling in the pit of their stomach that there's got to be a
> better way.  On the exception topic, these are some of the issues:
>
> I18N
> ===================
> If you are new to commons math and thinking about designing a commons
> math compatible exception you should probably understand the I18N
> stuff that's bound to the exception (and wonder why it's bound to the
> exception).

Not really true. The I18N was really simple at the start. See the one
from the Orekit project <https://www.orekit.org/forge/projects/orekit/>,
which is still in this state, and you will see that adding a new message
can be done really simply, without a lot of stuff. I18N here is
basically one method (getLocalizedString) that never changes, in one
enum class (OrekitMessages), and adding a message is only adding one
entry to the enum, that's all.

The huge pile we have now is only partly related to I18N. The context
stuff was introduced as an attempt to solve some programmatic retrieval
of information at catch time, not related to I18N. The huge hierarchy
was introduced to go in a direction where one exception = one type. The
ArgUtils was introduced due to some serialization issues. None of this
is I18N.

Yes, our exception is crap. No, I18N is not the only one responsible.
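
As a rough illustration of the "one enum, one method" approach described above: this is not the actual OrekitMessages code. The entries, the inlined French translations and the format() helper are assumptions made for the sketch, and a real implementation would more likely load its translations from resource files.

```java
import java.text.MessageFormat;
import java.util.Locale;

public class MessagesSketch {

    /** Hypothetical message enum; entries and translations are illustrative only. */
    enum Messages {
        ZERO_NORM("vector norm is zero",
                  "la norme du vecteur est nulle"),
        NOT_ENOUGH_POINTS("{0} points provided, at least {1} needed",
                          "{0} points fournis, au moins {1} requis");

        private final String english;
        private final String french;

        Messages(String english, String french) {
            this.english = english;
            this.french = french;
        }

        /** The single lookup method: French if requested, English otherwise. */
        String getLocalizedString(Locale locale) {
            return "fr".equals(locale.getLanguage()) ? french : english;
        }

        /** Fills the placeholders of the localized pattern with the given arguments. */
        String format(Locale locale, Object... args) {
            return new MessageFormat(getLocalizedString(locale), locale).format(args);
        }
    }

    public static void main(String[] args) {
        System.out.println(Messages.NOT_ENOUGH_POINTS.format(Locale.ENGLISH, 2, 3));
        System.out.println(Messages.ZERO_NORM.format(Locale.FRENCH));
    }
}
```

Adding a message in this scheme means adding one enum entry; the lookup method itself stays untouched.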

> Grab a coffee and spend a few hours, unless you are
> obviously fairly new to Java like some of the people posting for help.
> In this case when the exception occurs, there is going to be a lot of
> tutoring going on on the users list.
>
> Number of Exceptions
> ===================
> Before you do actually design a new exception, you should probably see
> if there is an exception that already fits the category of what you

I don't agree. The large list in the enum is only a list of messages
for user display. There would really be nothing wrong with adding even
more messages, even if they are close to existing ones. In fact, I even
think we should avoid trying to reuse messages by merging them all into
something not meaningful to users (remember, messages are for display
only). Currently we too often get a message like:

   Number is too large, 3 > 2

Nobody understands this at user level. The reused message has lost its
meaning. I would really prefer different messages for different cases,
where the fixed part of the format would at least provide some hint to
the user.

> are doing.  So you start reading.  Exception1...nop
> Exception2...nop...Exception3...Exception999..But I think I'm getting
> warmer.  OK - Did not find it ... but I'm fairly certain that there is
> an elegant place for it somewhere in the exception hierarchy...

I agree. The exception hierarchy is a mess.

>
>
> Handling of Exceptions
> ===================
> If our app uses several of the commons math classes (That throw
> exceptions of the same type), and one of those classes throws an
> exception, what is the app supposed to do?

It depends on the application. Apache Commons Math is a low level
library; it is used in many different contexts and there is no single
answer.

>
> I think most developers would find that question somewhat challenging.
>  There are numerous strategies.  Catch all exceptions and log what
> happened, etc.  But what if the requirement is that if an exception is
> thrown, the organization that receives it has 0 seconds to get to the
> root cause of it and understand the dynamics. Is this doable?  (Yes
> obviously, but how hard is it...?).

This is not a decision to be made at the Apache Commons Math level. We
provide the exception and users can catch it fast. What users do with it
is up to them.

>
>
>>>> It seems that the reporting interfaces could quickly overwhelm
>>>> the "actual" code (one type of context per algorithm).
>>> There would be one type of Observer interface per algorithm.  It would
>>> act on the solution and what are currently exceptions, although these
>>> would be translated into enums.
>>
>> Unless I'm mistaken, the most common use-case for codes implemented
>> in a library such as CM is to provide a correct answer or bail out
>> in a non-equivocal way.
> Most java developers are used to synchronous coding...call the method
> get the response...catch the exception if needed.  This is changing
> with JDK8, and as we evolve and start using lambdas, we become more
> accustomed to the functional callback style of programming.
> Personally I want to be able to use an API that gives me what I need
> when everything works as expected, allows me to resolve unexpected
> issues with minimal effort, and is as simple, fluid, and lightweight
> as possible.
>
>>
>> It would make the code more involved to handle a minority of
>> (undefined) cases. [Actual examples would be welcome in order to
>> focus the discussion.]
>
> Rough Outline (I've evolved the concept and moved away from the
> OptimizationContext in the process of writing):
>
> interface LevenbergMarquardtObserver {
>
>     public void hola(Solution s);
>     public void sugarHoneyIceTea(ResultType rt, Diagnostic d);
> }
>
> public class LMObserver implements LevenbergMarquardtObserver {
>
>    private Application application;
>
>    public LMObserver(Application application) {
>        this.application = application;
>    }
>
>    public void hola(Solution s) {
>        application.next(s);
>    }
>
>    public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
>        if (rt == ResultType.I_GOT_THIS_ONE) {
>             //I looked at the commons unit tests for this algorithm,
>             //evaluating the diagnostics that show how this failure can occur.
>             //I'm totally fixing this!  Step aside!
>        }
>        else if (rt == ResultType.REALLY_COMPLICATED_STUFF)
>        {
>            //We need our best engineers...call India.
>        }
>    }
> }
>
>
> public class Application {
>     //Note nothing is returned.
>     LevenbergMarquardtOptimizer.setObserver(new LMObserver(this))
>         .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>         .start();
>
>     public void next(Solution solution) {
>
>         //Do cool stuff.
>
>     }
> }
>
> Or an asynchronous variation:
>
> public class Application {
> //This call will not block because async is true
>     LevenbergMarquardtOptimizer.setAsync(true)
>         .setObserver(new LMObserver(this))
>         .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>         .start();
>
>     //Do more stuff right away.
>
>     public void next(Solution solution) {
>         //When the thread running the optimization is done,
>         //this method is called back.
>         //Do whatever comes next
>     }
> }
>
> The above would start the optimization in a separate thread that does
> not / SHOULD NOT share data with the main thread.
>
>>
>>>> The current reporting is based on exceptions, and assumes that if no
>>>> exception was thrown, then the user's request completed
>>>> successfully.
>>> Sure - personally I'd much rather deal with something similar to an
>>> HTTP status code in a callback, than an exception .  I think the code
>>> is cleaner and the callback makes it more elegant to apply an adaptive
>>> approach to handling the response, like slightly relaxing
>>> constraints,
>>> convergence parameters, etc.  Also by getting rid of the exceptions,
>>> we no longer depend on the I18N layer that they are tied to and now
>>> the messages can be more informative, since they target the root
>>> cause.  The observer can also run in the 'main' thread while the
>>> optimization can run asynchronously.  Also WRT JDK9 and modules,
>>> losing the exceptions would mean one less dependency when the
>>> library is split up into JDK9 modules...which would be more in line
>>> with this
>>> philosophy:
>>> https://github.com/substack/browserify-handbook#module-philosophy
>>
>> I'm not sure I fully understood the philosophy from the text in this
>> short paragraph.
>> But I do not agree with the idea that the possibility to quickly find
>> some code is more important than standards and best practices.
>
> If you go to npmjs.org and type in Neural Network you will get 56
> results all linked to github repositories.
>
> In addition there's meta data indicating number of downloads in the
> last day, last month, etc.  Try typing in cosine.  Odds are you will
> find a package that does just what you want and nothing else.  This is
> very underwhelming and refreshing in terms of cloning off of github
> and getting familiar with tests etc.  Also eye opening.  How many of us
> knew that we could do that much stuff with cosine! :).
>
>>
>>>> I totally agree that in some circumstances, more information on the
>>>> inner working of an algorithm would be quite useful.
>>> ... Algorithm iterations become unit testable.
>>>>
>>>> But I don't see the point in devoting resources to reinvent the
>>>> wheel:
>>> You mean pimping the wheel?  Big pimpin.
>>
>> I think that logging statements are easy to add, not disruptive at
>> all,
>> and come in handy to understand a code's unexpected behaviour.
>> Assuming that a "logging" feature is useful, it can be added *now*
>> using
>> a dependency towards a weight-less (!) framework such as "slf4j".
>> IMO, it would be a waste of time to implement a new communication
>> layer
>> that can do that, and more, if it would be used for logging only in
>> 99%
>> of the cases.
> SLF4J is used by almost every other framework, so why not use it?
> Logging and the diagnostic could be used together.  The primary
> purpose of the diagnostic though is to collect data that will be
> useful in `sugarHoneyIceTea`.
>
>>
>>>>
>>>> I longed several times for the use of a logging library.
>>>> The only show-stopper has been the informal "no-dependency"
>>>> policy...
>>> JDK9 Jigsaw should solve dependency hell, so the less coupling
>>> between commons math classes the better.
>>
>> I wouldn't call "coupling" the dependency towards exception classes:
>> they are little utilities that can make sense in various parts of the
>> library.
>
> If for example the Simplex solver is broken off into its own module,
> then it has to be coupled to the exceptions, unless it is exception
> free.
>
>>
>> [Unless one wants to embark on yet another discussion about
>> exceptions;
>> whether there should be one class for each of the "messages" that
>> exist
>> in "LocalizedFormats"; whether localization should be done in CM;
>> etc.]
>
> I think it would be best to just eliminate the exceptions.

NO! A big no!

Apache Commons Math is a very low level library. There are use cases
where you have huge code that relies on math almost everywhere. Look at
the Orekit library for example. We use Vector3D everywhere, we use
Rotation
everywhere, we use ode in many places, we use some linear algebra, we
use
optimizers, we use a few statistics, we use root solvers, we use
derivatives,
we use BSP ... Some of these uses looks as large scale call/return
pattern
where you could ask users to check the result afterwards, but many, many
of
the calls are much lower levels and you have literally several thousands
of
calls to math. Just think about forcing user to check a vector can be
normalized
(i.e. has not 0 norm) everywhere a vector is used. We would end up with
something like:

   Rotation r = new Rotation(a, alpha);
   if (!r.isValid()) {
     // a vector was null
     return error;
  }

Repeat the above 4325 times in your code ... welcome back to the 80's.

Exceptions are meant for, well, exceptional situations. They spare
people from crippling the calling code with if(error) statements. They
ensure (and this is the most important part) that as soon as an error is
detected at a very low level, it is identified and reported back to the
upper level where it is caught and displayed, without forcing *all* the
intermediate levels to handle it. When you have complex code with
complex algorithms and several nested levels of calls in different
libraries, maintained by different teams and making tens of thousands
of calls to math, you don't want the call/return/check-error type of
programming we used to do in the 80's.
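
The propagation argument above can be sketched as follows. This is only
an illustration: ZeroNormException, PropagationSketch and all method
names are hypothetical and are not Commons Math or Orekit API. The
low-level method throws, the two intermediate levels contain no error
handling at all, and only the top level catches.

```java
// Hypothetical names throughout; this is not Commons Math or Orekit API.
final class ZeroNormException extends RuntimeException {
    ZeroNormException(String message) {
        super(message);
    }
}

final class PropagationSketch {

    // Lowest level: detects the problem and throws immediately.
    static double[] normalize(double[] v) {
        double norm = Math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
        if (norm == 0.0) {
            throw new ZeroNormException("cannot normalize a zero-norm vector");
        }
        return new double[] {v[0] / norm, v[1] / norm, v[2] / norm};
    }

    // Intermediate levels: no if(error) checks, no error return codes.
    static double[] pointingDirection(double[] lineOfSight) {
        return normalize(lineOfSight);
    }

    static double firstComponent(double[] lineOfSight) {
        return pointingDirection(lineOfSight)[0];
    }

    // Upper level: the single place where the error is caught and reported.
    static String run(double[] lineOfSight) {
        try {
            return "ok: " + firstComponent(lineOfSight);
        } catch (ZeroNormException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(new double[] {3.0, 0.0, 4.0}));
        System.out.println(run(new double[] {0.0, 0.0, 0.0}));
    }
}
```

With return codes instead, both pointingDirection and firstComponent
would each need their own check and their own way to signal failure.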

best regards,
Luc

>
>>
>>> Anyways I'm obviously
>>> interested in playing with this stuff, so when I get something up
>>> into
>>> a repository I'll to do a callback :).
>>
>> If you are interested in big overhauls, there is one that gathered
>> relative consensus: rewrite the algorithms in a "multithread-friendly"
>> way.
> I think that's a tall order that will take us into JDK88 :).  But
> using callbacks and making potentially long running computations
> asynchronous could be a middle ground that would allow simple multi
> threaded use without fiddling around under the hood...
>
>>
>> Some ideas were floated (cf. ML archive) but no implementation or
>> experiment...  Perhaps with a well-defined goal such as performance
>> improvement, your design suggestions will become clearer to more
>> people.
>>
>> AFAIK, only the classes in the "o.a.c.m.neuralnet" package are
>> currently
>> ready to be used with the "java.util.concurrent" framework.
> FWIU Neural Nets are a great fit for concurrency.  I think for the
> others we will end up having discussions around how users would
> control the number of threads, etc. again that makes some of us
> nervous.  An asynchronous operation that runs in one separate thread
> is easier to reason about.  If we want to test 10 neural net
> configurations, and we have 10 cores, then we can start each by itself
> by doing something like:
>
> Nework.setAsync(true).addNeurons().connectNeurons().addObserver(observer).start().
> //Now do 10 more
> //If the observer is shared then notifications should be thread safe.
>
> Cheers,
> - Ole
>
> P.S. Dang that was a long email.  If I write one more of these, ban me
> :)
>
>>
>>
>> Best regards,
>> Gilles
>>
>>>
>>> Cheers,
>>> Ole
>>>

Re: [Math] LeastSquaresOptimizer Design

Gilles Sadowski
Hello Luc.

I obviously agree with your main conclusion that exceptions are still
a better alternative to what (we think) we understood from Ole's
proposal.

However, I don't agree about what is a "mess" on the "exception front"
and what is not, and which part of the library is more to blame for
that. :-)

The terse, composite messages were a middle ground between two
situations considered wrong, by one side or the other:
  (1) one exception class per exceptional condition
  (2) one message (to be localized!) per exceptional condition, conveyed
      by the same exception class

In the same way that some of Ole's proposals can be handled in a layer
above CM, so could the localization of exception messages.

There is no "huge pile".  There was, before, a lot of duplicated code.
The primary rationale of the "ExceptionContext" was localization.
It was then reused to avoid (1).

My main point was that an exception type is much more flexible than a
"String".
Assuming that all applications should terminate with the console
displaying the error generated by the low-level library is also fairly
80ish. ;-)
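
As a rough illustration of "more flexible than a String" (a sketch
only: IterationBudgetExceededException and TypedExceptionSketch are
hypothetical names, not classes from CM), a typed exception lets the
caller react programmatically instead of just printing a message:

```java
// Hypothetical names; this only illustrates the idea, it is not CM code.
final class IterationBudgetExceededException extends RuntimeException {
    private final int max;

    IterationBudgetExceededException(int max) {
        super("iteration budget (" + max + ") exceeded");
        this.max = max;
    }

    int getMax() {
        return max;
    }
}

final class TypedExceptionSketch {

    // Newton iteration for sqrt(x), x > 0, with an iteration budget.
    static double sqrt(double x, int maxIterations) {
        double guess = x;
        for (int i = 0; i < maxIterations; i++) {
            double next = 0.5 * (guess + x / guess);
            if (Math.abs(next - guess) <= 1e-12 * next) {
                return next;
            }
            guess = next;
        }
        throw new IterationBudgetExceededException(maxIterations);
    }

    // The caller uses the exception's type and data, not its text:
    // on failure it retries with a budget derived from the one that failed.
    static double sqrtWithRetry(double x) {
        try {
            return sqrt(x, 3);
        } catch (IterationBudgetExceededException e) {
            return sqrt(x, 100 * e.getMax());
        }
    }

    public static void main(String[] args) {
        System.out.println(sqrtWithRetry(1.0e6));
    }
}
```

Nothing is displayed on the console here; the application decides what
the failure means for it.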


Best regards,
Gilles

On Wed, 23 Sep 2015 10:02:30 +0200, luc wrote:

> Hi,
>
> Le 2015-09-22 02:55, Ole Ersoy a écrit :
>> Hola,
>> On 09/21/2015 04:15 PM, Gilles wrote:
>>> Hi.
>>> On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:
>>>> On 09/20/2015 05:51 AM, Gilles wrote:
>>>>> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>>>>>> Wanted to float some ideas for the LeastSquaresOptimizer
>>>>>> (Possibly
>>>>>> General Optimizer) design.  For example with the
>>>>>> LevenbergMarquardtOptimizer we would do:
>>>>>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>>>>> Rough optimize() outline:
>>>>>> public static void optimise() {
>>>>>> //perform the optimization
>>>>>> //If successful
>>>>>>     c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
>>>>>> //If not successful
>>>>>>
>>>>>>
>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
>>>>>> diagnostic);
>>>>>> //or
>>>>>>
>>>>>>
>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
>>>>>> diagnostic)
>>>>>> //etc
>>>>>> }
>>>>>> The diagnostic, when turned on, will contain a trace of the last
>>>>>> N
>>>>>> iterations leading up to the failure.  When turned off, the
>>>>>> Diagnostic
>>>>>> instance only contains the parameters used to detect failure.
>>>>>> The
>>>>>> diagnostic could be viewed as an indirect way to log optimizer
>>>>>> iterations.
>>>>>> WDYT?
>>>>> I'm wary of having several different ways to convey information
>>>>> to the
>>>>> caller.
>>>> It would just be one way.
>>> One way for optimizer, one way for solvers, one way for ...
>> Yes I see what you mean, but I think on a whole it will be worth it
>> to
>> add additional sugar code that removes the need for exceptions.
>>
>>>
>>>> But the caller may not be the receiver
>>>> (It could be).  The receiver would be an observer attached to the
>>>> OptimizationContext that implements an interface allowing it to
>>>> observe
>>>> the optimization.
>>> I'm afraid that it will add to the questions of what to put in the
>>> code and how.  [We already had sometimes heated discussions just
>>> for
>>> the IMHO obvious (e.g. code formatting, documentation,
>>> exception...).]
>> Hehe.  Yes I remember some of these discussions.  I wonder how much
>> time was spent debating the exceptions alone?  Surely everyone must
>> have had this feeling in pit of their stomach that there's got to be
>> a
>> better way.  On the exception topic, these are some of the issues:
>> I18N
>> ===================
>> If you are new to commons math and thinking about designing a
>> commons
>> math compatible exception you should probably understand the I18N
>> stuff that's bound to exception (and wonder why it's bound the the
>> exception).
>
> Not really true. The I18N was really simple at start. See the one
> from
> the Orekit project <https://www.orekit.org/forge/projects/orekit/>
> which is
> still in this state, and you will see adding a new message can be
> done really
> simply without a lot of stuff. I18N here is basically one method
> (getLocalizedString)
> that is never changed in one enumerate class (OrekitMessages), and
> adding a message
> is only adding one entry to the enumerate, that's all.
>
> The huge pile we have now is only partly related to I18N. The context
> stuff
> was introduced as an attempt to solve some programmatic retrieval of
> information
> at catch time, not related to 18N. The huge hierarchy was introduced
> to
> go in a direction were one exception = one type. The ArgUtils was
> introduced
> due to some Serialization issues. None of this is 18N.
>
> Yes our exception is crap. No I18N is not the only responsible.
>
>> Grab a coffee and spend a few hours, unless you are
>> obviously fairly new to Java like some ofthe people posting for
>> help.
>> In this case when the exception occurs, there is going to be a lot
>> of
>> tutoring going on on the users list.
>> Number of Exceptions
>> ===================
>> Before you do actually design a new exception, you should probably
>> see
>> if there is an exception that already fits the category of what you
>
> I don't agree. The large list in the enumerate is only a list of
> messages for user display. There would really be nothing wrong to add
> even more messages, even if they are close to existing ones. In fact,
> I
> even think we should avoid trying to reuse messages by merging them
> all
> in something not meaningful to users (remember, messages are for
> display
> only). Currently we too often get a message like:
>
>   Number is too large, 3 > 2
>
> Nobody understands this at user level. The reused message has lost
> its
> signification. I would really prefer different messages for different
> cases
> where the fixed part of the format would at least provide some hint
> to the
> user.
>
>> are doing.  So you start reading.  Exception1...nop
>> Exception2...nop...Exception3...Exception999..But I think I'm
>> getting
>> warmer.  OK - Did not find it ... but I'm fairly certain that there
>> is
>> a elegant place for it somewhere in the exception hierarchy...
>
> I agree. The exception hierarchy is a mess.
>
>>
>> Handling of Exceptions
>> ===================
>> If our app uses several of the commons math classes (That throw
>> exceptions of the same type), and one of those classes throws an
>> exception,what is the app supposed to do?
>
> It depends on the application. Apache Commons Math is a low level
> library it is used in many different contexts and there is no single
> answer.
>
>> I think most developers would find that question somewhat
>> challenging.
>>  There are numerous strategies.  Catch all exceptions and log what
>> happened, etc.  But what if the requirement is that if an exception
>> is
>> thrown, the organization that receives it has 0 seconds to get to
>> the
>> root cause of it and understand the dynamics. Is this doable?  (Yes
>> obviously, but how hard is it...?).
>
> It is not Apache Commons Math level of decision. We provide the
> exception
> and users can catch it fast. What users do with it is up to them.
>
>>
>>
>>>>> It seems that the reporting interfaces could quickly overwhelm
>>>>> the "actual" code (one type of context per algorithm).
>>>> There would one type of Observer interface per algorithm.  It
>>>> would
>>>> act on the solution and what are currently exceptions, although
>>>> these
>>>> would be translated into enums.
>>> Unless I'm mistaken, the most common use-case for codes implemented
>>> in a library such as CM is to provide a correct answer or bail out
>>> in a non-equivocal way.
>> Most java developers are used to synchronous coding...call the
>> method
>> get the response...catch the exception if needed.  This is changing
>> with JDK8, and as we evolve and start using lambdas, we become more
>> accustomed to the functional callback style of programming.
>> Personally I want to be able to use an API that gives me what I need
>> when everything works as expected, allows me to resolve unexpected
>> issues with minimal effort, and is as simple, fluid, and lightweight
>> as possible.
>>
>>> It would make the code more involved to handle a minority of
>>> (undefined) cases. [Actual examples would be welcome in order to
>>> focus the discussion.]
>> Rough Outline (I've evolved the concept and moved away from the
>> OptimizationContext in the process of writing):
>> interface LevenbergMarquardtObserver {
>> public void hola(Solution s);
>>     public void sugarHoneyIceTea(ResultType rt, Dianostics d);
>> }
>> public class LMObserver implements LevenbergMarquardtObserver {
>> private Application application;
>> public LMObserver(Application application) {
>>        this.application = application;
>>    }
>> public void hola(ResultType rt, Solution s) {
>>                 application.next(solution);
>>    }
>> public void sugarHoneyIceTea(ResultType rt, Diagnostic s)
>>        if (rt == ResultType.I_GOT_THIS_ONE) {
>>             //I looked at the commons unit tests for this algorithm
>> evaluating
>>             //the diagnostics that shows how this failure can occur
>>             //I'm totally fixing this!  Steps aside!
>>        }
>>        else if (rt == ResultType.REALLY_COMPLICATED_STUFF)
>>        {
>>            //We need our best engineers...call India.
>>        }
>>   )
>>
>> public class Application {
>>     //Note nothing is returned.
>>     LevenberMarquardtOptimizer.setOberver(new
>> LMObserver(this)).setLeastSquaresProblem(new
>> ClassThatImplementsTheProblem())).start();
>> public void next(Solution solution) {
>> //Do cool stuff.
>> }
>> }
>> Or an asynchronous variation:
>> public class Application {
>> //This call will not block because async is true
>>     LevenberMarquardtOptimizer.setAsync(true).setOberver(new
>> LMObserver()).setLeastSquaresProblem(new
>> ClassThatImplementsTheProblem())).start();
>> //Do more stuff right away.
>> public void next(Solution solution) {
>>         //When the thread running the optimization is done, this
>> method is called back.
>>         //Do whatever comes next
>>     }
>> }
>> The above would start the optimization in a separate thread that
>> does
>> not / SHOULD NOT share data with the main thread.
>>
>>>
>>>>> The current reporting is based on exceptions, and assumes that if
>>>>> no
>>>>> exception was thrown, then the user's request completed
>>>>> successfully.
>>>> Sure - personally I'd much rather deal with something similar to
>>>> an
>>>> HTTP status code in a callback, than an exception .  I think the
>>>> code
>>>> is cleaner and the calback makes it more elegant to apply an
>>>> adaptive
>>>> approach to handling the response, like slightly relaxing
>>>> constraints,
>>>> convergence parameters, etc.  Also by getting rid of the
>>>> exceptions,
>>>> we no longer depend on the I18N layer that they are tied to and
>>>> now
>>>> the messages can be more informative, since they target the root
>>>> cause.  The observer can also run in the 'main' thread' while the
>>>> optimization can run asynchronously.  Also WRT JDK9 and modules,
>>>> loosing the exceptions would mean one less dependency when the
>>>> library
>>>> is up into JDK9 modules...which would be more in line with this
>>>> philosophy:
>>>> https://github.com/substack/browserify-handbook#module-philosophy
>>> I'm not sure I fully understood the philosophy from the text in
>>> this
>>> short paragraph.
>>> But I do not agree with the idea that the possibility to quickly
>>> find
>>> some code is more important than standards and best practices.
>> If you go to npmjs.org and type in Neural Network you will get 56
>> results all linked to github repositories.
>> In addition there's meta data indicating number of downloads in the
>> last day, last month, etc.  Try typing in cosine.  Odds are you will
>> find a package that does just want you want and nothing else.  This
>> is
>> very underwhelming and refreshing in terms of cloning off of github
>> and getting familar with tests etc.  Also eye opening.  How many of
>> us
>> knew that we could do that much stuff with cosine! :).
>>
>>>
>>>>> I totally agree that in some circumstances, more information on
>>>>> the
>>>>> inner working of an algorithm would be quite useful.
>>>> ... Algorithm iterations become unit testable.
>>>>> But I don't see the point in devoting resources to reinvent the
>>>>> wheel:
>>>> You mean pimping the wheel?  Big pimpin.
>>> I think that logging statements are easy to add, not disruptive at
>>> all,
>>> and come in handy to understand a code's unexpected behaviour.
>>> Assuming that a "logging" feature is useful, it can be added *now*
>>> using
>>> a dependency towards a weight-less (!) framework such as "slf4j".
>>> IMO, it would be a waste of time to implement a new communication
>>> layer
>>> that can do that, and more, if it would be used for logging only in
>>> 99%
>>> of the cases.
>> SLF4J is used by almost every other framework, so why not use it?
>> Logging and the diagnostic could be used together.  The primary
>> purpose of the diagnostic though is to collect data that will be
>> useful in `sugarHoneyIceTea`.
>>
>>>
>>>>> I longed several times for the use of a logging library.
>>>>> The only show-stopper has been the informal "no-dependency"
>>>>> policy...
>>>> JDK9 Jigsaw should solve dependency hell, so the less coupling
>>>> between commons math classes the better.
>>> I wouldn't call "coupling" the dependency towards exception
>>> classes:
>>> they are little utilities that can make sense in various parts of
>>> the
>>> library.
>> If for example the Simplex solver is broken off into it's own
>> module,
>> then it has to be coupled to the exceptions, unless it is exception
>> free.
>>
>>> [Unless one wants to embark on yet another discussion about
>>> exceptions;
>>> whether there should be one class for each of the "messages" that
>>> exist
>>> in "LocalizedFormats"; whether localization should be done in CM;
>>> etc.]
>> I think it would be best to just eliminate the exceptions.
>
> NO! A big no!
>
> Apache Commons Math is a very low level library. There are use cases
> where you have huge code that relies on math almost everywhere. Look
> at
> the Orekit library for example. We use Vector3D everywhere, we use
> Rotation
> everywhere, we use ode in many places, we use some linear algebra, we
> use
> optimizers, we use a few statistics, we use root solvers, we use
> derivatives,
> we use BSP ... Some of these uses looks as large scale call/return
> pattern
> where you could ask users to check the result afterwards, but many,
> many of
> the calls are much lower levels and you have literally several
> thousands of
> calls to math. Just think about forcing user to check a vector can be
> normalized
> (i.e. has not 0 norm) everywhere a vector is used. We would end up
> with
> something like:
>
>   Rotation r = new Rotation(a, alpha);
>   if (!r.isValid()) {
>     // a vector was null
>     return error;
>  }
>
> Repeat the above 4325 times in your code ... welcome back to the
> 80's.
>
> Exceptions are meant for, well, exceptional situations. They avoid
> people
> to cripple the calling code with if(error) statements. They ensure
> (and this
> is the most important part) that as soon as the error is detected at
> very
> low level it will be identified and reported back to the upper level
> where
> it is caught and displayed, without forcing *all* the intermediate
> levels
> to handle it. When you have complex code with complex algorithms and
> several
> nested levels of calls in different libraries, maintained by
> different teams
> and usings tens of thousands calls to math, you don't want a
> call/return/check error
> type of programming like we used to do in the 80's.
>
> best regards,
> Luc
>
>>
>>>
>>>> Anyways I'm obviously
>>>> interested in playing with this stuff, so when I get something up
>>>> into
>>>> a repository I'll to do a callback :).
>>> If you are interested in big overhauls, there is one that gathered
>>> relative consensus: rewrite the algorithms in a
>>> "multithread-friendly"
>>> way.
>> I think that's a tall order that will take us into JDK88 :).  But
>> using callbacks and making potentially long running computations
>> asynchronous could be a middle ground that would allow simple multi
>> threaded use without fiddling around under the hood...
>>
>>> Some ideas were floated (cf. ML archive) but no implementation or
>>> experiment...  Perhaps with a well-defined goal such as performance
>>> improvement, your design suggestions will become clearer to more
>>> people.
>>> AFAIK, only the classes in the "o.a.c.m.neuralnet" package are
>>> currently
>>> ready to be used with the "java.util.concurrent" framework.
>> FWIU Neural Nets are a great fit for concurrency.  I think for the
>> others we will end up having discussions around how users would
>> control the number of threads, etc. again that makes some of us
>> nervous.  An asynchronous operation that runs in one separate thread
>> is easier to reason about.  If we want to test 10 neural net
>> configurations, and we have 10 cores, then we can start each by
>> itself
>> by doing something like:
>>
>> Nework.setAsync(true).addNeurons().connectNeurons().addObserver(observer).start().
>> //Now do 10 more
>> //If the observer is shared then notifications should be thread
>> safe.
>> Cheers,
>> - Ole
>> P.S. Dang that was a long email.  If I write one more of these, ban
>> me :)
>>
>>>
>>> Best regards,
>>> Gilles
>>>
>>>> Cheers,
>>>> Ole



Re: [Math] LeastSquaresOptimizer Design

ole ersoy
In reply to this post by Luc Maisonobe-2
Hi Luc,

On 09/23/2015 03:02 AM, luc wrote:

> Hi,
>
> Le 2015-09-22 02:55, Ole Ersoy a écrit :
>> Hola,
>>
>> On 09/21/2015 04:15 PM, Gilles wrote:
>>> Hi.
>>>
>>> On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:
>>>> On 09/20/2015 05:51 AM, Gilles wrote:
>>>>> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>>>>>> Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
>>>>>> General Optimizer) design.  For example with the
>>>>>> LevenbergMarquardtOptimizer we would do:
>>>>>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>>>>>
>>>>>> Rough optimize() outline:
>>>>>> public static void optimise() {
>>>>>> //perform the optimization
>>>>>> //If successful
>>>>>>     c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
>>>>>> //If not successful
>>>>>>
>>>>>>
>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
>>>>>> diagnostic);
>>>>>> //or
>>>>>>
>>>>>>
>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
>>>>>> diagnostic)
>>>>>> //etc
>>>>>> }
>>>>>>
>>>>>> The diagnostic, when turned on, will contain a trace of the last N
>>>>>> iterations leading up to the failure.  When turned off, the Diagnostic
>>>>>> instance only contains the parameters used to detect failure. The
>>>>>> diagnostic could be viewed as an indirect way to log optimizer
>>>>>> iterations.
>>>>>>
>>>>>> WDYT?
>>>>>
>>>>> I'm wary of having several different ways to convey information to the
>>>>> caller.
>>>> It would just be one way.
>>>
>>> One way for optimizer, one way for solvers, one way for ...
>>
>> Yes I see what you mean, but I think on a whole it will be worth it to
>> add additional sugar code that removes the need for exceptions.
>>
>>>
>>>> But the caller may not be the receiver
>>>> (It could be).  The receiver would be an observer attached to the
>>>> OptimizationContext that implements an interface allowing it to observe
>>>> the optimization.
>>>
>>> I'm afraid that it will add to the questions of what to put in the
>>> code and how.  [We already had sometimes heated discussions just for
>>> the IMHO obvious (e.g. code formatting, documentation, exception...).]
>>
>> Hehe.  Yes I remember some of these discussions.  I wonder how much
>> time was spent debating the exceptions alone?  Surely everyone must
>> have had this feeling in pit of their stomach that there's got to be a
>> better way.  On the exception topic, these are some of the issues:
>>
>> I18N
>> ===================
>> If you are new to commons math and thinking about designing a commons
>> math compatible exception you should probably understand the I18N
>> stuff that's bound to exception (and wonder why it's bound the the
>> exception).
>
> Not really true.
Well a lot of things are gray.  Personally if I'm dealing with an API, I like to understand it, so that there are no surprises.  And I understand that the I18N coupling might not force me to use it, but if I want to be smart about my architecture, and simplify my design, then I should look at it.  Maybe it is a good idea.  Maybe I should just gloss over it?  Am I being sloppy if I just gloss over it?

Or is there an alternative that provides the same functionality, or maybe something better, that does not come with any of these side effects?

> The I18N was really simple at start.

Yup, I reviewed it and thought it was probably no big deal, but as I started looking into reusing the CM exceptions, I decided that it was not worth the complexity.  I think that if I throw an exception, it is my responsibility to make sure that whoever receives it has a very simple, 0 minute or minimal, resolution time for dealing with it, so that when the code is handed over the client feels confident.  It should work like a fridge.  We leave the door open too long, a bell goes off, we close the door.

> See the one from
> the Orekit project <https://www.orekit.org/forge/projects/orekit/> which is
> still in this state, and you will see adding a new message can be done really
> simply without a lot of stuff.

Simple is a relative term.  But really simple is different from simple.  Really simple would be the LevenbergMarquardtOptimizer in its own module, separated from everything else, with a minimal set of dependencies.  If it's not this simple, then it quickly grows more complex as we scale.

> I18N here is basically one method (getLocalizedString)
> that is never changed in one enumerate class (OrekitMessages), and adding a message
> is only adding one entry to the enumerate, that's all.
Sure and when a developer sees that one method, and they know how sharp the Apache people are (Sincerely), they start looking into reusing this design.  So if we are going to send them down that path, then that should be the best path.

>
> The huge pile we have now is only partly related to I18N.
That's what I mean by scaling.  It's like the princess with the pea below the bottom mattress.  She keeps throwing mattresses on top, and the pea is still bothering her.  So after the 10th mattress, she decides they all have to come down until she can find out what's going on.  All 10 mattresses have to come back off.

> The context stuff
> was introduced as an attempt to solve some programmatic retrieval of information
> at catch time, not related to 18N.
A callback would:
- Provide the same context through a diagnostic
- Decouple from I18N, but still provide the means to retrieve a message.
- Be more tightly coupled to the root cause of the exception / error
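
A minimal sketch of the three points above, assuming a toy Newton square-root solver in place of a real optimizer.  All names (ResultType, Diagnostic, SqrtObserver, RecordingObserver, CallbackSketch) are hypothetical and not part of CM:

```java
import java.util.ArrayList;
import java.util.List;

// All names below are hypothetical; none of this is existing Commons Math API.
enum ResultType { TOO_MANY_ITERATIONS }

// Context handed to the observer: the iterates leading up to the failure.
final class Diagnostic {
    final List<Double> trace = new ArrayList<>();
}

interface SqrtObserver {
    void onSolution(double solution);

    void onFailure(ResultType type, Diagnostic diagnostic);
}

// Observer that simply records what it was told.
final class RecordingObserver implements SqrtObserver {
    Double solution;
    ResultType failure;
    Diagnostic diagnostic;

    @Override
    public void onSolution(double s) {
        solution = s;
    }

    @Override
    public void onFailure(ResultType type, Diagnostic d) {
        failure = type;
        diagnostic = d;
    }
}

final class CallbackSketch {

    // Newton iteration for sqrt(x), x > 0.  Reports through the observer
    // and never throws; the enum constant identifies the root cause.
    static void sqrt(double x, int maxIterations, SqrtObserver observer) {
        Diagnostic diagnostic = new Diagnostic();
        double guess = x;
        for (int i = 0; i < maxIterations; i++) {
            double next = 0.5 * (guess + x / guess);
            diagnostic.trace.add(next);
            if (Math.abs(next - guess) <= 1e-12 * next) {
                observer.onSolution(next);
                return;
            }
            guess = next;
        }
        observer.onFailure(ResultType.TOO_MANY_ITERATIONS, diagnostic);
    }

    public static void main(String[] args) {
        RecordingObserver observer = new RecordingObserver();
        sqrt(2.0, 2, observer);
        System.out.println(observer.failure + ", trace length "
                + observer.diagnostic.trace.size());
        sqrt(2.0, 50, observer);
        System.out.println("solution: " + observer.solution);
    }
}
```

The enum constant carries no text, so there is no I18N dependency here; a message can still be looked up from it by whoever needs one.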

> The huge hierarchy was introduced to
> go in a direction were one exception = one type.
I was hoping that would lead to `one type` equalling the precise root cause of the exception, but that is not the case.  An exception can be reused in multiple places with multiple different root causes.

> The ArgUtils was introduced
> due to some Serialization issues. None of this is 18N.
So if we look at this through the lens of:
- Core developer productivity
- API user productivity

Are these helping?

>
> Yes our exception is crap. No I18N is not the only responsible.
Honestly until I started reading NodeJS code (About six months ago) I was super happy with it, and thought that it was this way, because it's the only way, so that's what we get.  But now I think we can do much better.

>
>> Grab a coffee and spend a few hours, unless you are
>> obviously fairly new to Java like some ofthe people posting for help.
>> In this case when the exception occurs, there is going to be a lot of
>> tutoring going on on the users list.
>>
>> Number of Exceptions
>> ===================
>> Before you do actually design a new exception, you should probably see
>> if there is an exception that already fits the category of what you
>
> I don't agree.
So you don't think someone should look to reuse an existing exception...because that would lead to a lot more new exceptions?

> The large list in the enumerate is only a list of
> messages for user display.
The exceptions are one thing and user display is another; should we be mixing these?  On the one hand we are saying that CM is a low level library and that the developer should catch the exceptions and instruct the client on how to handle them, and now we are saying that there are messages for user display...

And this is sort of a gray area, so I don't mean to nitpick it so much, but there is a cleaner way to do this, one that does not mix the user display concept with the exception / reporting of something other than a solution.

> There would really be nothing wrong to add
> even more messages,even if they are close to existing ones.
What if they are exactly the same as the existing ones...because we were too lazy to scan?  And there's a simple programmatic solution to that, but do we really want this workflow?

> In fact, I
> even think we should avoid trying to reuse messages by merging them all
> in something not meaningful to users (remember, messages are for display
> only). Currently we too often get a message like:
>
>   Number is too large, 3 > 2
>
> Nobody understands this at user level.
If instead there's an Enum, tied to the corresponding class, that represents this condition at the line where it happens, then we can leave it up to the developer to craft a message that will serve the client best.

And I'm not saying that there should not be messages.  Just that they should be isolated to a specific context, maintained within the parameters of that context, and looked up by a single unique key only.
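
Roughly what I have in mind, as a sketch only (BoundCheckResult and MessageLookupSketch are made-up names, and the message text is just an example): the library reports an enum constant tied to the condition, and the application keeps its own messages, looked up by that single key.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical names; a sketch of the idea, not existing CM code.
enum BoundCheckResult { OK, INDEX_ABOVE_UPPER_BOUND }

final class MessageLookupSketch {

    // Library side: reports the condition as an enum constant, no text.
    static BoundCheckResult checkIndex(int index, int upperBound) {
        return index > upperBound
                ? BoundCheckResult.INDEX_ABOVE_UPPER_BOUND
                : BoundCheckResult.OK;
    }

    // Application side: messages live in one place, keyed by the enum.
    static final Map<BoundCheckResult, String> MESSAGES =
            new EnumMap<>(BoundCheckResult.class);

    static {
        MESSAGES.put(BoundCheckResult.OK, "row index %d is valid (last row is %d)");
        MESSAGES.put(BoundCheckResult.INDEX_ABOVE_UPPER_BOUND,
                "row index %d is past the last row %d");
    }

    static String describe(int index, int upperBound) {
        BoundCheckResult result = checkIndex(index, upperBound);
        return String.format(MESSAGES.get(result), index, upperBound);
    }

    public static void main(String[] args) {
        System.out.println(describe(3, 2));
    }
}
```

Compared with "Number is too large, 3 > 2", the developer who knows the context decides what the user actually reads.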


> The reused message has lost its
> signification. I would really prefer different messages for different cases
> where the fixed part of the format would at least provide some hint to the
> user.

Yes I think we are saying the same thing here.

>
>
>> are doing.  So you start reading. Exception1...nop
>> Exception2...nop...Exception3...Exception999..But I think I'm getting
>> warmer.  OK - Did not find it ... but I'm fairly certain that there is
>> a elegant place for it somewhere in the exception hierarchy...
>
> I agree. The exception hierarchy is a mess.

And in all fairness it seemed really elegant at first, and we all went with it.

>
>>
>>
>> Handling of Exceptions
>> ===================
>> If our app uses several of the commons math classes (That throw
>> exceptions of the same type), and one of those classes throws an
>> exception,what is the app supposed to do?
>
> It depends on the application.

But it's possible to architect CM so that the developer is led down one road only.  And it's like a Hyperloop.  They get in, and 7 minutes later, they are sipping a Mojito in Cancun :).

> Apache Commons Math is a low level
> library it is used in many different contexts and there is no single
> answer.

Unless there is.

>
>
>>
>> I think most developers would find that question somewhat challenging.
>>  There are numerous strategies.  Catch all exceptions and log what
>> happened, etc.  But what if the requirement is that if an exception is
>> thrown, the organization that receives it has 0 seconds to get to the
>> root cause of it and understand the dynamics. Is this doable? (Yes
>> obviously, but how hard is it...?).
>
> It is not Apache Commons Math level of decision.
Not right now, but it could be.

> We provide the exception
> and users can catch it fast. What users do with it is up to them.

Sure.  That's how it works right now, but this is causing more overhead for both CM core developers and API users than there would be if exceptions were replaced with Enums and throwing exceptions were replaced with a callback.

>
>
>>
>>
>>>>> It seems that the reporting interfaces could quickly overwhelm
>>>>> the "actual" code (one type of context per algorithm).
>>>> There would one type of Observer interface per algorithm. It would
>>>> act on the solution and what are currently exceptions, although these
>>>> would be translated into enums.
>>>
>>> Unless I'm mistaken, the most common use-case for codes implemented
>>> in a library such as CM is to provide a correct answer or bail out
>>> in a non-equivocal way.
>> Most java developers are used to synchronous coding...call the method
>> get the response...catch the exception if needed.  This is changing
>> with JDK8, and as we evolve and start using lambdas, we become more
>> accustomed to the functional callback style of programming.
>> Personally I want to be able to use an API that gives me what I need
>> when everything works as expected, allows me to resolve unexpected
>> issues with minimal effort, and is as simple, fluid, and lightweight
>> as possible.
>>
>>>
>>> It would make the code more involved to handle a minority of
>>> (undefined) cases. [Actual examples would be welcome in order to
>>> focus the discussion.]
>>
>> Rough Outline (I've evolved the concept and moved away from the
>> OptimizationContext in the process of writing):
>>
>> interface LevenbergMarquardtObserver {
>>
>>     void hola(Solution s);
>>     void sugarHoneyIceTea(ResultType rt, Diagnostic d);
>> }
>>
>> public class LMObserver implements LevenbergMarquardtObserver {
>>
>>    private final Application application;
>>
>>    public LMObserver(Application application) {
>>        this.application = application;
>>    }
>>
>>    @Override
>>    public void hola(Solution s) {
>>        application.next(s);
>>    }
>>
>>    @Override
>>    public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
>>        if (rt == ResultType.I_GOT_THIS_ONE) {
>>             //I looked at the commons unit tests for this algorithm evaluating
>>             //the diagnostics that shows how this failure can occur
>>             //I'm totally fixing this!  Steps aside!
>>        }
>>        else if (rt == ResultType.REALLY_COMPLICATED_STUFF)
>>        {
>>            //We need our best engineers...call India.
>>        }
>>    }
>> }
>>
>>
>> public class Application {
>>     //Note nothing is returned.
>>     public void run() {
>>         LevenbergMarquardtOptimizer.setObserver(new LMObserver(this))
>>             .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>>             .start();
>>     }
>>
>>     public void next(Solution solution) {
>>
>>         //Do cool stuff.
>>
>>     }
>> }
>>
>> Or an asynchronous variation:
>>
>> public class Application {
>>     public void run() {
>>         //This call will not block because async is true
>>         LevenbergMarquardtOptimizer.setAsync(true)
>>             .setObserver(new LMObserver(this))
>>             .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>>             .start();
>>
>>         //Do more stuff right away.
>>     }
>>
>>     public void next(Solution solution) {
>>         //When the thread running the optimization is done, this
>>         //method is called back.
>>         //Do whatever comes next
>>     }
>> }
>>
>> The above would start the optimization in a separate thread that does
>> not / SHOULD NOT share data with the main thread.
>>
>>>
>>>>> The current reporting is based on exceptions, and assumes that if no
>>>>> exception was thrown, then the user's request completed successfully.
>>>> Sure - personally I'd much rather deal with something similar to an
>>>> HTTP status code in a callback than an exception.  I think the code
>>>> is cleaner and the callback makes it more elegant to apply an adaptive
>>>> approach to handling the response, like slightly relaxing constraints,
>>>> convergence parameters, etc.  Also by getting rid of the exceptions,
>>>> we no longer depend on the I18N layer that they are tied to and now
>>>> the messages can be more informative, since they target the root
>>>> cause.  The observer can also run in the 'main' thread while the
>>>> optimization runs asynchronously.  Also WRT JDK9 and modules,
>>>> losing the exceptions would mean one less dependency when the library
>>>> is broken up into JDK9 modules...which would be more in line with this
>>>> philosophy:
>>>> https://github.com/substack/browserify-handbook#module-philosophy
>>>
>>> I'm not sure I fully understood the philosophy from the text in this
>>> short paragraph.
>>> But I do not agree with the idea that the possibility to quickly find
>>> some code is more important than standards and best practices.
>>
>> If you go to npmjs.org and type in Neural Network you will get 56
>> results all linked to github repositories.
>>
>> In addition there's metadata indicating the number of downloads in the
>> last day, last month, etc.  Try typing in cosine.  Odds are you will
>> find a package that does just what you want and nothing else. This is
>> far from overwhelming, and refreshing in terms of cloning off of github
>> and getting familiar with tests etc.  Also eye opening.  How many of us
>> knew that we could do that much stuff with cosine! :).
>>
>>>
>>>>> I totally agree that in some circumstances, more information on the
>>>>> inner working of an algorithm would be quite useful.
>>>> ... Algorithm iterations become unit testable.
>>>>>
>>>>> But I don't see the point in devoting resources to reinvent the wheel:
>>>> You mean pimping the wheel?  Big pimpin.
>>>
>>> I think that logging statements are easy to add, not disruptive at all,
>>> and come in handy to understand a code's unexpected behaviour.
>>> Assuming that a "logging" feature is useful, it can be added *now* using
>>> a dependency towards a weight-less (!) framework such as "slf4j".
>>> IMO, it would be a waste of time to implement a new communication layer
>>> that can do that, and more, if it would be used for logging only in 99%
>>> of the cases.
>> SLF4J is used by almost every other framework, so why not use it?
>> Logging and the diagnostic could be used together.  The primary
>> purpose of the diagnostic though is to collect data that will be
>> useful in `sugarHoneyIceTea`.
>>
>>>
>>>>>
>>>>> I longed several times for the use of a logging library.
>>>>> The only show-stopper has been the informal "no-dependency" policy...
>>>> JDK9 Jigsaw should solve dependency hell, so the less coupling
>>>> between commons math classes the better.
>>>
>>> I wouldn't call "coupling" the dependency towards exception classes:
>>> they are little utilities that can make sense in various parts of the
>>> library.
>>
>> If for example the Simplex solver is broken off into its own module,
>> then it has to be coupled to the exceptions, unless it is exception
>> free.
>>
>>>
>>> [Unless one wants to embark on yet another discussion about exceptions;
>>> whether there should be one class for each of the "messages" that exist
>>> in "LocalizedFormats"; whether localization should be done in CM;
>>> etc.]
>>
>> I think it would be best to just eliminate the exceptions.
>
> NO! A big no!
Before the NO gets that big, how about examining just a few simple cases where it's done?  You might come around to liking it.  Also it does not need to be all or nothing.  Some modules could be exception free, whereas others could throw exceptions because that might be the simplest thing to do and what people are used to.
>
> Apache Commons Math is a very low level library. There are use cases
> where you have huge code that relies on math almost everywhere.
True dat. But it's quite possible that some of the math being used does not actually need to throw an exception.  There are multiple exception categories:

- Exceptions thrown by invalid input parameters
- Exceptions that signal that something completely unexpected happened (Even we are like WTF?)
- Exceptions that could happen in certain edge conditions that we understand and know how to deal with

The first one we can eliminate by supplying a self-validating object to the routine.  The other two are the ones where it could really help to communicate them as enums via a callback.

It could be reviewed on a case-by-case basis, as part of a process that considers moving to JDK9 modules.
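
To make the categories above a bit more concrete, here is a minimal sketch of a self-validating input plus enum results delivered through a callback. Every name in it (CallbackSketch, Interval, SolverResult, SolverObserver) and the toy bisection solver are assumptions made up for illustration; none of this is existing CM API.

```java
import java.util.function.DoubleUnaryOperator;

class CallbackSketch {

    /** Outcomes reported to the observer instead of being thrown (hypothetical names). */
    enum SolverResult { NOT_BRACKETED, MAX_ITERATIONS_EXCEEDED }

    /** Self-validating input: the caller asks isValid() once, before the call. */
    static final class Interval {
        final double lo;
        final double hi;

        Interval(double lo, double hi) {
            this.lo = lo;
            this.hi = hi;
        }

        boolean isValid() {
            // Rejects NaN and infinite bounds as well as empty or reversed intervals.
            return Double.isFinite(lo) && Double.isFinite(hi) && lo < hi;
        }
    }

    /** One observer interface per algorithm: success and failure are separate callbacks. */
    interface SolverObserver {
        void solved(double root);
        void failed(SolverResult reason, String diagnostic);
    }

    /** Toy bisection root finder: reports through the observer and never throws. */
    static void solve(DoubleUnaryOperator f, Interval in, int maxIterations,
                      SolverObserver observer) {
        double lo = in.lo;
        double hi = in.hi;
        if (f.applyAsDouble(lo) * f.applyAsDouble(hi) > 0) {
            observer.failed(SolverResult.NOT_BRACKETED,
                            "f has the same sign at " + lo + " and " + hi);
            return;
        }
        for (int i = 0; i < maxIterations; i++) {
            double mid = 0.5 * (lo + hi);
            if (hi - lo < 1e-12) {
                observer.solved(mid);
                return;
            }
            // Keep the half of the interval that still brackets the root.
            if (f.applyAsDouble(lo) * f.applyAsDouble(mid) <= 0) {
                hi = mid;
            } else {
                lo = mid;
            }
        }
        observer.failed(SolverResult.MAX_ITERATIONS_EXCEEDED,
                        "remaining interval width " + (hi - lo));
    }

    public static void main(String[] args) {
        Interval interval = new Interval(0.0, 2.0);
        if (!interval.isValid()) {
            System.out.println("invalid interval, not calling the solver");
            return;
        }
        solve(x -> x * x - 2.0, interval, 100, new SolverObserver() {
            @Override
            public void solved(double root) {
                System.out.println("root = " + root);
            }

            @Override
            public void failed(SolverResult reason, String diagnostic) {
                System.out.println(reason + ": " + diagnostic);
            }
        });
    }
}
```

The first category disappears because the routine only ever sees an input the caller has already validated; the other two arrive as a SolverResult value that names the exact condition, together with whatever diagnostic the algorithm chose to collect.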

> Look at
> the Orekit library for example. We use Vector3D everywhere, we use Rotation
> everywhere, we use ode in many places, we use some linear algebra, we use
> optimizers, we use a few statistics, we use root solvers, we use derivatives,
> we use BSP

No problem, just get rid of the 3D rotation, kill the Vector3D, forget statistics (Way too complicated!), and it's all good :). Kidding obviously, but this does highlight what I'm talking about. If there are tons of CM classes being utilized within a single method, and all of them can throw generic exceptions that cut across multiple classes, the developer has to decode the exception, causing her to waste time.


> ... Some of these uses looks as large scale call/return pattern
> where you could ask users to check the result afterwards, but many, many of
> the calls are much lower levels and you have literally several thousands of
> calls to math.
That's all fine.  CM will still have the exact same workings - it will just be more efficient.

> Just think about forcing user to check a vector can be normalized
> (i.e. has not 0 norm) everywhere a vector is used. We would end up with
> something like:

>
>   Rotation r = new Rotation(a, alpha);
>   if (!r.isValid()) {
>     // a vector was null
>     return error;
>  }

In this case the client developer could have made sure the vector was not null and valid prior to passing it to a routine.  It's possible that CM will be more lightweight, efficient, and friendly if API users are told that it's their responsibility to validate parameters.  So:

RotationContext r = new RotationContext(a, alpha);
if (!r.isValid()) {
     // Dag Nab It!!!
}
else {
    //Rotate away
    Rotation rotation = new Rotation(r);
}

Or if we know that the parameters are valid (Mass validation):
Rotation rotation = new Rotation(a, alpha);
>
> Repeat the above 4325 times in your code ... welcome back to the 80's.

We only have to repeat the validation check 4325 times if there is actually a possibility that the vector is invalid.

>
> Exceptions are meant for, well, exceptional situations.
These are exceptional times :)
> They keep people
> from crippling the calling code with if(error) statements.
We need if(error) at some point.  I agree that if(error) should be minimized.

> They ensure (and this
> is the most important part) that as soon as the error is detected at very
> low level it will be identified and reported back to the upper level where
> it is caught and displayed, without forcing *all* the intermediate levels
> to handle it.

Yes but this causes the "Root cause analysis deciphering effect" I've been talking about.

> When you have complex code with complex algorithms and several
> nested levels of calls in different libraries, maintained by different teams
> and using tens of thousands of calls to math, you don't want a call/return/check error
> type of programming like we used to do in the 80's.
Ahh the Commodore 64 days...I miss those days...Anyways - no one wants to be hung up on some clown from the 80s.  Right now we are hung up on a clown from the 90s though!  We need a Kim Kardashian API. :)

Cheers,
- Ole


>
> best regards,
> Luc
>
>>
>>>
>>>> Anyways I'm obviously
>>>> interested in playing with this stuff, so when I get something up into
>>>> a repository I'll to do a callback :).
>>>
>>> If you are interested in big overhauls, there is one that gathered
>>> relative consensus: rewrite the algorithms in a "multithread-friendly"
>>> way.
>> I think that's a tall order that will take us into JDK88 :). But
>> using callbacks and making potentially long running computations
>> asynchronous could be a middle ground that would allow simple multi
>> threaded use without fiddling around under the hood...
>>
>>>
>>> Some ideas were floated (cf. ML archive) but no implementation or
>>> experiment...  Perhaps with a well-defined goal such as performance
>>> improvement, your design suggestions will become clearer to more people.
>>>
>>> AFAIK, only the classes in the "o.a.c.m.neuralnet" package are currently
>>> ready to be used with the "java.util.concurrent" framework.
>> FWIU Neural Nets are a great fit for concurrency.  I think for the
>> others we will end up having discussions around how users would
>> control the number of threads, etc. again that makes some of us
>> nervous.  An asynchronous operation that runs in one separate thread
>> is easier to reason about.  If we want to test 10 neural net
>> configurations, and we have 10 cores, then we can start each by itself
>> by doing something like:
>>
>> Network.setAsync(true).addNeurons().connectNeurons().addObserver(observer).start();
>> //Now do 10 more
>> //If the observer is shared then notifications should be thread safe.
>>
>> Cheers,
>> - Ole
>>
>> P.S. Dang that was a long email.  If I write one more of these, ban me :)
>>
>>>
>>>
>>> Best regards,
>>> Gilles
>>>
>>>>
>>>> Cheers,
>>>> Ole
>>>>

Re: [Math] LeastSquaresOptimizer Design

ole ersoy
In reply to this post by Luc Maisonobe-2
Luc,

Just wanted to mention one more thing (On top of the other 325 :) ).  The callback design does not bubble exceptions, but we can still get the same effect, and do better.  For the app we would define a global error handler and make that handler part of each callback. So if we are using:

Foo.class
Boo.class
Yao.class

If they all utilize callbacks and each has its own unique set of enums specifying the errors that the class signals, then each callback would delegate these to the global error handler. So effectively we are getting exception bubbling, without the ambiguity that comes from having exceptions shared across classes.

The global error handler:
1) Gets an error and deals with it
2) Gets an error and signals the user, possibly looking up a message using the ErrorEnum.
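
A rough sketch of that delegation, assuming made-up stand-ins for Foo and Boo (a reciprocal and a mean routine) and a hypothetical GlobalErrorHandler. The names, the message map, and the two result lists are all illustrations rather than a proposed API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GlobalHandlerSketch {

    /** Each class owns its error enum, so an error always names its source. */
    enum FooError { DIVISION_BY_ZERO }
    enum BooError { EMPTY_SAMPLE }

    interface FooObserver {
        void result(double value);
        void error(FooError error, String diagnostic);
    }

    interface BooObserver {
        void result(double value);
        void error(BooError error, String diagnostic);
    }

    /** Stand-in for Foo.class. */
    static void reciprocal(double x, FooObserver observer) {
        if (x == 0.0) {
            observer.error(FooError.DIVISION_BY_ZERO, "x = " + x);
        } else {
            observer.result(1.0 / x);
        }
    }

    /** Stand-in for Boo.class. */
    static void mean(double[] sample, BooObserver observer) {
        if (sample.length == 0) {
            observer.error(BooError.EMPTY_SAMPLE, "n = 0");
            return;
        }
        double sum = 0.0;
        for (double v : sample) {
            sum += v;
        }
        observer.result(sum / sample.length);
    }

    /** Application-wide handler that every callback delegates to. */
    static final class GlobalErrorHandler {
        final Map<Enum<?>, String> userMessages = new HashMap<>();
        final List<Enum<?>> handledQuietly = new ArrayList<>();
        final List<String> shownToUser = new ArrayList<>();

        void handle(Enum<?> error, String diagnostic) {
            String message = userMessages.get(error);
            if (message == null) {
                // 1) Deal with it here (this sketch only records it).
                handledQuietly.add(error);
            } else {
                // 2) Signal the user with the message looked up from the enum.
                shownToUser.add(message + " [" + diagnostic + "]");
            }
        }
    }

    public static void main(String[] args) {
        GlobalErrorHandler handler = new GlobalErrorHandler();
        handler.userMessages.put(BooError.EMPTY_SAMPLE, "Please supply at least one data point");

        reciprocal(0.0, new FooObserver() {
            @Override
            public void result(double value) {
                System.out.println("reciprocal = " + value);
            }

            @Override
            public void error(FooError error, String diagnostic) {
                handler.handle(error, diagnostic);
            }
        });

        mean(new double[0], new BooObserver() {
            @Override
            public void result(double value) {
                System.out.println("mean = " + value);
            }

            @Override
            public void error(BooError error, String diagnostic) {
                handler.handle(error, diagnostic);
            }
        });

        System.out.println("handled quietly: " + handler.handledQuietly);
        System.out.println("shown to user:   " + handler.shownToUser);
    }
}
```

Because the handler receives the enum constant itself, it can tell FooError from BooError without inspecting a stack trace, which is the "no ambiguity" part of the argument.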

Cheers,
- Ole

On 09/23/2015 03:02 AM, luc wrote:

> Hi,
>
> Le 2015-09-22 02:55, Ole Ersoy a écrit :
>> Hola,
>>
>> On 09/21/2015 04:15 PM, Gilles wrote:
>>> Hi.
>>>
>>> On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:
>>>> On 09/20/2015 05:51 AM, Gilles wrote:
>>>>> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>>>>>> Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
>>>>>> General Optimizer) design.  For example with the
>>>>>> LevenbergMarquardtOptimizer we would do:
>>>>>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>>>>>
>>>>>> Rough optimize() outline:
>>>>>> public static void optimise() {
>>>>>> //perform the optimization
>>>>>> //If successful
>>>>>>     c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
>>>>>> //If not successful
>>>>>>
>>>>>>
>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
>>>>>> diagnostic);
>>>>>> //or
>>>>>>
>>>>>>
>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
>>>>>> diagnostic)
>>>>>> //etc
>>>>>> }
>>>>>>
>>>>>> The diagnostic, when turned on, will contain a trace of the last N
>>>>>> iterations leading up to the failure.  When turned off, the Diagnostic
>>>>>> instance only contains the parameters used to detect failure. The
>>>>>> diagnostic could be viewed as an indirect way to log optimizer
>>>>>> iterations.
>>>>>>
>>>>>> WDYT?
>>>>>
>>>>> I'm wary of having several different ways to convey information to the
>>>>> caller.
>>>> It would just be one way.
>>>
>>> One way for optimizer, one way for solvers, one way for ...
>>
>> Yes I see what you mean, but I think on a whole it will be worth it to
>> add additional sugar code that removes the need for exceptions.
>>
>>>
>>>> But the caller may not be the receiver
>>>> (It could be).  The receiver would be an observer attached to the
>>>> OptimizationContext that implements an interface allowing it to observe
>>>> the optimization.
>>>
>>> I'm afraid that it will add to the questions of what to put in the
>>> code and how.  [We already had sometimes heated discussions just for
>>> the IMHO obvious (e.g. code formatting, documentation, exception...).]
>>
>> Hehe.  Yes I remember some of these discussions.  I wonder how much
>> time was spent debating the exceptions alone?  Surely everyone must
>> have had this feeling in pit of their stomach that there's got to be a
>> better way.  On the exception topic, these are some of the issues:
>>
>> I18N
>> ===================
>> If you are new to commons math and thinking about designing a commons
>> math compatible exception you should probably understand the I18N
>> stuff that's bound to exception (and wonder why it's bound the the
>> exception).
>
> Not really true. The I18N was really simple at start. See the one from
> the Orekit project <https://www.orekit.org/forge/projects/orekit/> which is
> still in this state, and you will see adding a new message can be done really
> simply without a lot of stuff. I18N here is basically one method (getLocalizedString)
> that is never changed in one enumerate class (OrekitMessages), and adding a message
> is only adding one entry to the enumerate, that's all.
>
> The huge pile we have now is only partly related to I18N. The context stuff
> was introduced as an attempt to solve some programmatic retrieval of information
> at catch time, not related to I18N. The huge hierarchy was introduced to
> go in a direction where one exception = one type. The ArgUtils was introduced
> due to some Serialization issues. None of this is I18N.
>
> Yes, our exception is crap. No, I18N is not the only thing responsible.
>
>> Grab a coffee and spend a few hours, unless you are
>> obviously fairly new to Java like some ofthe people posting for help.
>> In this case when the exception occurs, there is going to be a lot of
>> tutoring going on on the users list.
>>
>> Number of Exceptions
>> ===================
>> Before you do actually design a new exception, you should probably see
>> if there is an exception that already fits the category of what you
>
> I don't agree. The large list in the enumerate is only a list of
> messages for user display. There would really be nothing wrong to add
> even more messages, even if they are close to existing ones. In fact, I
> even think we should avoid trying to reuse messages by merging them all
> in something not meaningful to users (remember, messages are for display
> only). Currently we too often get a message like:
>
>   Number is too large, 3 > 2
>
> Nobody understands this at user level. The reused message has lost its
> signification. I would really prefer different messages for different cases
> where the fixed part of the format would at least provide some hint to the
> user.
>
>> are doing.  So you start reading. Exception1...nop
>> Exception2...nop...Exception3...Exception999..But I think I'm getting
>> warmer.  OK - Did not find it ... but I'm fairly certain that there is
>> a elegant place for it somewhere in the exception hierarchy...
>
> I agree. The exception hierarchy is a mess.
>
>>
>>
>> Handling of Exceptions
>> ===================
>> If our app uses several of the commons math classes (That throw
>> exceptions of the same type), and one of those classes throws an
>> exception,what is the app supposed to do?
>
> It depends on the application. Apache Commons Math is a low level
> library it is used in many different contexts and there is no single
> answer.
>
>>
>> I think most developers would find that question somewhat challenging.
>>  There are numerous strategies.  Catch all exceptions and log what
>> happened, etc.  But what if the requirement is that if an exception is
>> thrown, the organization that receives it has 0 seconds to get to the
>> root cause of it and understand the dynamics. Is this doable? (Yes
>> obviously, but how hard is it...?).
>
> It is not Apache Commons Math level of decision. We provide the exception
> and users can catch it fast. What users do with it is up to them.
>
>>
>>
>>>>> It seems that the reporting interfaces could quickly overwhelm
>>>>> the "actual" code (one type of context per algorithm).
>>>> There would one type of Observer interface per algorithm. It would
>>>> act on the solution and what are currently exceptions, although these
>>>> would be translated into enums.
>>>
>>> Unless I'm mistaken, the most common use-case for codes implemented
>>> in a library such as CM is to provide a correct answer or bail out
>>> in a non-equivocal way.
>> Most java developers are used to synchronous coding...call the method
>> get the response...catch the exception if needed.  This is changing
>> with JDK8, and as we evolve and start using lambdas, we become more
>> accustomed to the functional callback style of programming.
>> Personally I want to be able to use an API that gives me what I need
>> when everything works as expected, allows me to resolve unexpected
>> issues with minimal effort, and is as simple, fluid, and lightweight
>> as possible.
>>
>>>
>>> It would make the code more involved to handle a minority of
>>> (undefined) cases. [Actual examples would be welcome in order to
>>> focus the discussion.]
>>
>> Rough Outline (I've evolved the concept and moved away from the
>> OptimizationContext in the process of writing):
>>
>> interface LevenbergMarquardtObserver {
>>
>>     void hola(Solution s);
>>     void sugarHoneyIceTea(ResultType rt, Diagnostic d);
>> }
>>
>> public class LMObserver implements LevenbergMarquardtObserver {
>>
>>    private final Application application;
>>
>>    public LMObserver(Application application) {
>>        this.application = application;
>>    }
>>
>>    @Override
>>    public void hola(Solution s) {
>>        application.next(s);
>>    }
>>
>>    @Override
>>    public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
>>        if (rt == ResultType.I_GOT_THIS_ONE) {
>>             //I looked at the commons unit tests for this algorithm evaluating
>>             //the diagnostics that shows how this failure can occur
>>             //I'm totally fixing this!  Steps aside!
>>        }
>>        else if (rt == ResultType.REALLY_COMPLICATED_STUFF)
>>        {
>>            //We need our best engineers...call India.
>>        }
>>    }
>> }
>>
>>
>> public class Application {
>>     //Note nothing is returned.
>>     public void run() {
>>         LevenbergMarquardtOptimizer.setObserver(new LMObserver(this))
>>             .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>>             .start();
>>     }
>>
>>     public void next(Solution solution) {
>>
>>         //Do cool stuff.
>>
>>     }
>> }
>>
>> Or an asynchronous variation:
>>
>> public class Application {
>>     public void run() {
>>         //This call will not block because async is true
>>         LevenbergMarquardtOptimizer.setAsync(true)
>>             .setObserver(new LMObserver(this))
>>             .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>>             .start();
>>
>>         //Do more stuff right away.
>>     }
>>
>>     public void next(Solution solution) {
>>         //When the thread running the optimization is done, this
>>         //method is called back.
>>         //Do whatever comes next
>>     }
>> }
>>
>> The above would start the optimization in a separate thread that does
>> not / SHOULD NOT share data with the main thread.
>>
>>>
>>>>> The current reporting is based on exceptions, and assumes that if no
>>>>> exception was thrown, then the user's request completed successfully.
>>>> Sure - personally I'd much rather deal with something similar to an
>>>> HTTP status code in a callback, than an exception .  I think the code
>>>> is cleaner and the calback makes it more elegant to apply an adaptive
>>>> approach to handling the response, like slightly relaxing constraints,
>>>> convergence parameters, etc.  Also by getting rid of the exceptions,
>>>> we no longer depend on the I18N layer that they are tied to and now
>>>> the messages can be more informative, since they target the root
>>>> cause.  The observer can also run in the 'main' thread' while the
>>>> optimization can run asynchronously.  Also WRT JDK9 and modules,
>>>> loosing the exceptions would mean one less dependency when the library
>>>> is up into JDK9 modules...which would be more in line with this
>>>> philosophy:
>>>> https://github.com/substack/browserify-handbook#module-philosophy
>>>
>>> I'm not sure I fully understood the philosophy from the text in this
>>> short paragraph.
>>> But I do not agree with the idea that the possibility to quickly find
>>> some code is more important than standards and best practices.
>>
>> If you go to npmjs.org and type in Neural Network you will get 56
>> results all linked to github repositories.
>>
>> In addition there's meta data indicating number of downloads in the
>> last day, last month, etc.  Try typing in cosine.  Odds are you will
>> find a package that does just want you want and nothing else. This is
>> very underwhelming and refreshing in terms of cloning off of github
>> and getting familar with tests etc.  Also eye opening.  How many of us
>> knew that we could do that much stuff with cosine! :).
>>
>>>
>>>>> I totally agree that in some circumstances, more information on the
>>>>> inner working of an algorithm would be quite useful.
>>>> ... Algorithm iterations become unit testable.
>>>>>
>>>>> But I don't see the point in devoting resources to reinvent the wheel:
>>>> You mean pimping the wheel?  Big pimpin.
>>>
>>> I think that logging statements are easy to add, not disruptive at all,
>>> and come in handy to understand a code's unexpected behaviour.
>>> Assuming that a "logging" feature is useful, it can be added *now* using
>>> a dependency towards a weight-less (!) framework such as "slf4j".
>>> IMO, it would be a waste of time to implement a new communication layer
>>> that can do that, and more, if it would be used for logging only in 99%
>>> of the cases.
>> SLF4J is used by almost every other framework, so why not use it?
>> Logging and the diagnostic could be used together.  The primary
>> purpose of the diagnostic though is to collect data that will be
>> useful in `sugarHoneyIceTea`.
>>
>>>
>>>>>
>>>>> I longed several times for the use of a logging library.
>>>>> The only show-stopper has been the informal "no-dependency" policy...
>>>> JDK9 Jigsaw should solve dependency hell, so the less coupling
>>>> between commons math classes the better.
>>>
>>> I wouldn't call "coupling" the dependency towards exception classes:
>>> they are little utilities that can make sense in various parts of the
>>> library.
>>
>> If for example the Simplex solver is broken off into it's own module,
>> then it has to be coupled to the exceptions, unless it is exception
>> free.
>>
>>>
>>> [Unless one wants to embark on yet another discussion about exceptions;
>>> whether there should be one class for each of the "messages" that exist
>>> in "LocalizedFormats"; whether localization should be done in CM;
>>> etc.]
>>
>> I think it would be best to just eliminate the exceptions.
>
> NO! A big no!
>
> Apache Commons Math is a very low level library. There are use cases
> where you have huge code that relies on math almost everywhere. Look at
> the Orekit library for example. We use Vector3D everywhere, we use Rotation
> everywhere, we use ode in many places, we use some linear algebra, we use
> optimizers, we use a few statistics, we use root solvers, we use derivatives,
> we use BSP ... Some of these uses looks as large scale call/return pattern
> where you could ask users to check the result afterwards, but many, many of
> the calls are much lower levels and you have literally several thousands of
> calls to math. Just think about forcing user to check a vector can be normalized
> (i.e. has not 0 norm) everywhere a vector is used. We would end up with
> something like:
>
>   Rotation r = new Rotation(a, alpha);
>   if (!r.isValid()) {
>     // a vector was null
>     return error;
>  }
>
> Repeat the above 4325 times in your code ... welcome back to the 80's.
>
> Exceptions are meant for, well, exceptional situations. They keep people
> from crippling the calling code with if(error) statements. They ensure (and this
> is the most important part) that as soon as the error is detected at very
> low level it will be identified and reported back to the upper level where
> it is caught and displayed, without forcing *all* the intermediate levels
> to handle it. When you have complex code with complex algorithms and several
> nested levels of calls in different libraries, maintained by different teams
> and using tens of thousands of calls to math, you don't want a call/return/check error
> type of programming like we used to do in the 80's.
>
> best regards,
> Luc
>
>>
>>>
>>>> Anyways I'm obviously
>>>> interested in playing with this stuff, so when I get something up into
>>>> a repository I'll to do a callback :).
>>>
>>> If you are interested in big overhauls, there is one that gathered
>>> relative consensus: rewrite the algorithms in a "multithread-friendly"
>>> way.
>> I think that's a tall order that will take us into JDK88 :). But
>> using callbacks and making potentially long running computations
>> asynchronous could be a middle ground that would allow simple multi
>> threaded use without fiddling around under the hood...
>>
>>>
>>> Some ideas were floated (cf. ML archive) but no implementation or
>>> experiment...  Perhaps with a well-defined goal such as performance
>>> improvement, your design suggestions will become clearer to more people.
>>>
>>> AFAIK, only the classes in the "o.a.c.m.neuralnet" package are currently
>>> ready to be used with the "java.util.concurrent" framework.
>> FWIU Neural Nets are a great fit for concurrency.  I think for the
>> others we will end up having discussions around how users would
>> control the number of threads, etc. again that makes some of us
>> nervous.  An asynchronous operation that runs in one separate thread
>> is easier to reason about.  If we want to test 10 neural net
>> configurations, and we have 10 cores, then we can start each by itself
>> by doing something like:
>>
>> Network.setAsync(true).addNeurons().connectNeurons().addObserver(observer).start();
>> //Now do 10 more
>> //If the observer is shared then notifications should be thread safe.
>>
>> Cheers,
>> - Ole
>>
>> P.S. Dang that was a long email.  If I write one more of these, ban me :)
>>
>>>
>>>
>>> Best regards,
>>> Gilles
>>>
>>>>
>>>> Cheers,
>>>> Ole
>>>>

Re: [Math] LeastSquaresOptimizer Design

Luc Maisonobe-2
In reply to this post by ole ersoy
Le 23/09/2015 19:20, Ole Ersoy a écrit :
> HI Luc,

Hi Ole,

>
> On 09/23/2015 03:02 AM, luc wrote:
>> Hi,
>>
>> Le 2015-09-22 02:55, Ole Ersoy a écrit :
>>> Hola,
>>>
>>> On 09/21/2015 04:15 PM, Gilles wrote:
>>>> Hi.
>>>>
>>>> On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:
>>>>> On 09/20/2015 05:51 AM, Gilles wrote:
>>>>>> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>>>>>>> Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
>>>>>>> General Optimizer) design.  For example with the
>>>>>>> LevenbergMarquardtOptimizer we would do:
>>>>>>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>>>>>>
>>>>>>> Rough optimize() outline:
>>>>>>> public static void optimise() {
>>>>>>> //perform the optimization
>>>>>>> //If successful
>>>>>>>     c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
>>>>>>> //If not successful
>>>>>>>
>>>>>>>
>>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
>>>>>>>
>>>>>>> diagnostic);
>>>>>>> //or
>>>>>>>
>>>>>>>
>>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
>>>>>>>
>>>>>>> diagnostic)
>>>>>>> //etc
>>>>>>> }
>>>>>>>
>>>>>>> The diagnostic, when turned on, will contain a trace of the last N
>>>>>>> iterations leading up to the failure.  When turned off, the
>>>>>>> Diagnostic
>>>>>>> instance only contains the parameters used to detect failure. The
>>>>>>> diagnostic could be viewed as an indirect way to log optimizer
>>>>>>> iterations.
>>>>>>>
>>>>>>> WDYT?
>>>>>>
>>>>>> I'm wary of having several different ways to convey information to
>>>>>> the
>>>>>> caller.
>>>>> It would just be one way.
>>>>
>>>> One way for optimizer, one way for solvers, one way for ...
>>>
>>> Yes I see what you mean, but I think on a whole it will be worth it to
>>> add additional sugar code that removes the need for exceptions.
>>>
>>>>
>>>>> But the caller may not be the receiver
>>>>> (It could be).  The receiver would be an observer attached to the
>>>>> OptimizationContext that implements an interface allowing it to
>>>>> observe
>>>>> the optimization.
>>>>
>>>> I'm afraid that it will add to the questions of what to put in the
>>>> code and how.  [We already had sometimes heated discussions just for
>>>> the IMHO obvious (e.g. code formatting, documentation, exception...).]
>>>
>>> Hehe.  Yes I remember some of these discussions.  I wonder how much
>>> time was spent debating the exceptions alone?  Surely everyone must
>>> have had this feeling in pit of their stomach that there's got to be a
>>> better way.  On the exception topic, these are some of the issues:
>>>
>>> I18N
>>> ===================
>>> If you are new to commons math and thinking about designing a commons
>>> math compatible exception you should probably understand the I18N
>>> stuff that's bound to the exception (and wonder why it's bound to the
>>> exception).
>>
>> Not really true.
> Well a lot of things are gray.  Personally if I'm dealing with an API, I
> like to understand it, so that there are no surprises.  And I understand
> that the I18N coupling might not force me to use it, but if I want to be
> smart about my architecture, and simplify my design, then I should look
> at it.  Maybe it is a good idea.  Maybe I should just gloss over it?  Am
> I being sloppy if I just gloss over it?
>
> Or is there an alternative that provides the same functionality, or
> maybe something better, that does not come with any of these side effects?
>
>> The I18N was really simple at start.
>
> Yup I reviewed it and thought - it's probably no big deal - but as I
> started looking into reusing the CM exceptions, I decided that it was
> not worth the complexity.  I think that if I throw an exception it is
> my responsibility to make sure that whomever receives it has a very
> simple 0 minute or minimal resolution time for dealing with it, so that
> when the code is handed over the client feels confident.  It should work
> like a fridge.  We leave the door open too long, a bell goes off, we
> close the door.
>
>> See the one from
>> the Orekit project <https://www.orekit.org/forge/projects/orekit/>
>> which is
>> still in this state, and you will see adding a new message can be done
>> really
>> simply without a lot of stuff.
>
> Simple is a relative term.  But really simple is different from simple.
> Really simple would be the LevenbergMarquardtOptimizer in its own
> module, separated from everything else, with a minimal set of
> dependencies.  If it's not this simple, then it quickly grows more
> complex as we scale.
>
>> I18N here is basically one method (getLocalizedString)
>> that is never changed in one enumerate class (OrekitMessages), and
>> adding a message
>> is only adding one entry to the enumerate, that's all.
> Sure and when a developer sees that one method, and they know how sharp
> the Apache people are (Sincerely), they start looking into reusing this
> design.  So if we are going to send them down that path, then that
> should be the best path.

CM is not intended to be a design pattern people should mimic. We are so
bad at this that it would be a shame. No one in their right mind would copy
or reuse this stuff. It is for internal use only, and we don't even have
the resources to manage it ourselves, so we can't present it as a
path people should follow with us leading them. We would be
leading them straight into the wall.

>
>>
>> The huge pile we have now is only partly related to I18N.
> That's what I mean by scaling.  It's like the princess with the pea
> below the bottom mattress.  She keeps throwing mattresses on top, and
> the pea is still bothering her.  So after the 10th mattress, she decides
> they all have to come down until she can find out what's going on.  All
> 10 mattresses have to come back off.
>
>> The context stuff
>> was introduced as an attempt to solve some programmatic retrieval of
>> information
>> at catch time, not related to I18N.
> A callback would:
> - Provide the same context through a diagnostic
> - Decouple from I18N, but still provide the means to retrieve a message.
> - Be more tightly coupled to the root cause of the exception / error

Yes, but it would require the user to take care of the returned context,
and either act on it or pass it up, for every single call. Exceptions
remove intermediate code; callbacks force you to handle the error
yourself at every call level, even where you don't want to handle it
but simply pass it up to where the error management code lies.

And this is not only about bubbling the exception up, as in providing
the data at an upper level, which I agree could be done by some global
handler. It is *interrupting* the current course of operation that is
the tricky part. If you need to interrupt the run as fast as possible
when an error is detected, either you let the JVM do it for you with an
exception or you put if statements at each call site (either before the
call if you consider users should validate everything, or after the
call if you don't). By the way, validating input is also not always
possible. Of course for the simple zero vector problem I used as an
example it is easy, but for slightly different cases it is not.
Predicting beforehand that an input will generate an error is, in the
general case, the same as the halting problem, which is undecidable (at
least for Turing machines). If you want, replace "create a rotation" by
"solve a quadratic equation". There, checking that there is no solution is
done by computing the discriminant, which is really what the solver
would do (if we had a quadratic solver in CM). Things get worse with
more complex algorithms, and in the end it appears that a blanket
statement saying "users should pre-validate input, thus ensuring
no error can occur" is not sustainable.
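
To make the quadratic example concrete, here is a minimal sketch. It is not CM code (CM has no quadratic solver, as said above); QuadraticSketch, NoRealRootException, hasRealRoots and solve are hypothetical names used only for illustration. The point is that the caller-side pre-check has to compute the very discriminant the solver computes anyway:

```java
/** Hypothetical exception type, for illustration only (not a CM class). */
class NoRealRootException extends RuntimeException {
    NoRealRootException(double discriminant) {
        super("no real root, discriminant = " + discriminant);
    }
}

/** Hypothetical sketch of a quadratic solver; all names are assumptions. */
class QuadraticSketch {

    /** Pre-validation a caller would have to run: it duplicates the solver's first step. */
    static boolean hasRealRoots(double a, double b, double c) {
        return b * b - 4 * a * c >= 0;
    }

    /** Returns the real roots of a x^2 + b x + c = 0 (a != 0), smallest first. */
    static double[] solve(double a, double b, double c) {
        double discriminant = b * b - 4 * a * c;
        if (discriminant < 0) {
            // The solver detects the problem at the same place it does its real work.
            throw new NoRealRootException(discriminant);
        }
        double s = Math.sqrt(discriminant);
        double r1 = (-b - s) / (2 * a);
        double r2 = (-b + s) / (2 * a);
        return new double[] { Math.min(r1, r2), Math.max(r1, r2) };
    }
}
```

With the exception, a caller that trusts its data just calls solve(); with mandatory pre-validation, the discriminant is computed twice for every valid input.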

>
>> The huge hierarchy was introduced to
>> go in a direction where one exception = one type.
> That's what I was hoping would lead to `one type` equalling the precise
> root cause of the exception, but this is not true.  An exception can be
> reused in multiple places with multiple different root causes.
>
>> The ArgUtils was introduced
>> due to some Serialization issues. None of this is I18N.
> So if we look at this through the lens of:
> - Core developer productivity
> - API user productivity
>
> Are these helping?

No, they are not. Or rather they are not for me, but maybe other
people rely on them, I don't know. This is why I consider we are
wrong with our current design.

>
>>
>> Yes, our exception handling is crap. No, I18N is not solely responsible.
> Honestly until I started reading NodeJS code (About six months ago) I
> was super happy with it, and thought that it was this way, because it's
> the only way, so that's what we get.  But now I think we can do much
> better.
>
>>
>>> Grab a coffee and spend a few hours, unless you are
>>> obviously fairly new to Java like some of the people posting for help.
>>> In this case when the exception occurs, there is going to be a lot of
>>> tutoring going on on the users list.
>>>
>>> Number of Exceptions
>>> ===================
>>> Before you do actually design a new exception, you should probably see
>>> if there is an exception that already fits the category of what you
>>
>> I don't agree.
> So you don't think someone should look to reuse an existing
> exception...because that would lead to a lot more new exceptions?

What I mean is that adding messages is not wrong in itself. They can
help users. Ensuring we never duplicate a message takes a lot of
developers' time, which we don't have, and it does not help users.

>
>> The large list in the enumerate is only a list of
>> messages for user display.
> The exceptions are one thing and user display is another, and should we
> be mixing these?  On the one hand we are saying the CM is a low level
> library and that the developer should catch the exceptions, and instruct
> the client on how to handle them, and now we are saying that there are
> messages for user display...

No, you are missing the consequences of the "low level" argument. Large
applications can use CM at a low level, with some less low-level code
called by intermediate code called by slightly higher-level code, itself
used by high-level code and so on. I am not saying developers should
catch exceptions directly above CM, as exceptions should be really rare
and correspond to ... exceptional situations. They are often handled quite
high in the hierarchy in large applications that process large data and
should run really fast (I'll put a real-life example below). So yes,
exceptions are caught, but not immediately by the caller; it may be far
above, and could even be at only one place in some cases, not at
thousands of call sites. At this place, yes, sometimes we simply display
the error because we don't know what to do. In some other cases, we
try to circumvent the problem, not by dissecting the exception (so we
often don't even look at the embedded ExceptionContext) but by switching
to something different. With exceptions, the intermediate level, which
is very important, is much simpler. If you don't know how to react to
a zero vector at one place because you don't know why this vector is
zero, you just don't care: the exception will be automatically forwarded
above by the JVM and you will not even see it. You don't have to check
for the error just to interrupt the program flow yourself and
short-circuit to return immediately. The JVM does it for you. You care
about the exception at only two places: where it is thrown (in CM) and
where the users know what to do (displaying, stopping, or choosing an
alternative in the rare cases where it is possible). There is no dedicated
code in between.
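
A minimal sketch of this layering, under stated assumptions: ZeroNormException and LayeredSketch are hypothetical names, not CM or Orekit API, and the three levels are reduced to one-liners. Only the lowest and the highest levels contain any error-related code:

```java
/** Hypothetical exception type, for illustration only. */
class ZeroNormException extends RuntimeException {
    ZeroNormException() {
        super("cannot normalize a zero-norm vector");
    }
}

/** Hypothetical three-level call chain; all names are assumptions. */
class LayeredSketch {

    // Low level (the role CM plays): detects the problem and throws.
    static double[] normalize(double[] v) {
        double norm = Math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
        if (norm == 0) {
            throw new ZeroNormException();
        }
        return new double[] { v[0] / norm, v[1] / norm, v[2] / norm };
    }

    // Intermediate levels: no error-handling code at all.
    static double[] pointingDirection(double[] v) {
        return normalize(v);
    }

    static double firstComponent(double[] v) {
        return pointingDirection(v)[0];
    }

    // High level: the only place that knows what to do with the error.
    static String process(double[] v) {
        try {
            return "x = " + firstComponent(v);
        } catch (ZeroNormException e) {
            return "error: " + e.getMessage();
        }
    }
}
```

Adding ten more intermediate methods between normalize() and process() would not add a single line of error handling.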


>
> And this is sort of a gray area, so I don't mean to nit pick it so much,
> but there is a cleaner way to do this, that does not mix the user
> display concept with the exception / reporting of something different
> than a solution.

A cleaner way without forcing intermediate code to care about an
error it doesn't know what to do with? Even if the intermediate
code can only do "if (context.hasError()) return;" it is not clean
when multiplied by thousands of occurrences.
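
For illustration, here is a minimal sketch of what the context style would impose on intermediate code. ErrorContext and ContextSketch are hypothetical names invented for this example; nothing like them exists in CM:

```java
/** Hypothetical error context passed down the call chain; not a CM class. */
class ErrorContext {
    private String error;

    void fail(String message) {
        error = message;
    }

    boolean hasError() {
        return error != null;
    }

    String getError() {
        return error;
    }
}

/** Hypothetical call chain where every level must check the context itself. */
class ContextSketch {

    // Low level: reports through the context instead of throwing.
    static double[] normalize(double[] v, ErrorContext context) {
        double norm = Math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
        if (norm == 0) {
            context.fail("zero norm");
            return null;
        }
        return new double[] { v[0] / norm, v[1] / norm, v[2] / norm };
    }

    // Intermediate level 1: needs its own check-and-return.
    static double[] pointingDirection(double[] v, ErrorContext context) {
        double[] u = normalize(v, context);
        if (context.hasError()) {
            return null;
        }
        return u;
    }

    // Intermediate level 2: the same boilerplate again.
    static double firstComponent(double[] v, ErrorContext context) {
        double[] u = pointingDirection(v, context);
        if (context.hasError()) {
            return Double.NaN;
        }
        return u[0];
    }
}
```

Each intermediate method carries an extra parameter and an extra if, and forgetting one of the checks ends in a NullPointerException far from the real cause.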

>
>> There would really be nothing wrong to add
>> even more messages, even if they are close to existing ones.
> What if they are exactly the same as the existing ones...because we were
> too lazy to scan?  And there's a simple programmatic solution to that,
> but do we really want this workflow?

Then we will have duplicates. It will cost a few bytes in memory. It
will save hours of scarce CM developers' time.

>
>> In fact, I
>> even think we should avoid trying to reuse messages by merging them all
>> in something not meaningful to users (remember, messages are for display
>> only). Currently we too often get a message like:
>>
>>   Number is too large, 3 > 2
>>
>> Nobody understands this at user level.
> If instead there's an Enum, tied to the corresponding class, that
> represents this condition at the line where it happens, then we can
> leave it up to the developer to craft a message that will serve the
> client best.

Sure. By the way, this is what we have: an enumerate called
LocalizedFormats. Call it MessageIdentifier if you prefer and remove
the formatting, it would be OK too. The only thing is that in addition
to the enumerate we also need some variable parts (like the numbers
involved, the dimensions and so on). So this is why we have the Object[]
parts too. At the beginning this was all we had and it was
simple. Then we lost our minds and created a monstrous beast no one
can tame.
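
A minimal sketch of that "enumerate plus parts" idea, assuming hypothetical names (MessageId, SketchException) and made-up message patterns. It is not the CM or Orekit implementation; it only uses java.text.MessageFormat from the JDK to merge the fixed pattern with the variable parts:

```java
import java.text.MessageFormat;
import java.util.Locale;

/** Hypothetical message identifiers; the patterns are invented for this example. */
enum MessageId {
    NUMBER_TOO_LARGE("{0} is larger than the maximum ({1})"),
    DIMENSION_MISMATCH("dimensions mismatch: {0} != {1}");

    private final String pattern;

    MessageId(String pattern) {
        this.pattern = pattern;
    }

    String getPattern() {
        return pattern;
    }
}

/** Hypothetical exception carrying exactly two things: the identifier and the parts. */
class SketchException extends RuntimeException {
    private final MessageId id;
    private final Object[] parts;

    SketchException(MessageId id, Object... parts) {
        // Formatting lives here only for convenience; it could be moved elsewhere.
        super(new MessageFormat(id.getPattern(), Locale.ROOT).format(parts));
        this.id = id;
        this.parts = parts.clone();
    }

    /** Programmatic key, usable without parsing the message text. */
    MessageId getId() {
        return id;
    }

    /** Variable parts (numbers, dimensions, ...) in pattern order. */
    Object[] getParts() {
        return parts.clone();
    }
}
```

Code that wants to react programmatically looks at getId() and getParts(); code that only displays the problem calls getMessage(). Localization, if wanted, would replace the pattern lookup and nothing else.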

>
> And I'm not saying that there should not be messages.  Just that they
> should be isolated to a specific context, maintained within the
> parameters of that context, and looked up by a single unique key only.


Look at OrekitException: there is one OrekitMessages enumerate and one
Object[] parts. The ExceptionContext is there too, but only for
compatibility with CM when some OrekitExceptions link back to CM
exceptions. I would be happy to drop it.

And if it is not clear enough by now: yes, you could put the
I18N elsewhere if you want, as long as you have these two pieces of
information, the enumerate and the parts, available. I think it could
remain there because it is simple, stable and does not need attention
from developers, but if you insist I18N is evil, just drop it. Don't
throw the baby out with the bathwater.

>
>
> The reused message has lost its
>> signification. I would really prefer different messages for different
>> cases
>> where the fixed part of the format would at least provide some hint to
>> the
>> user.
>
> Yes I think we are saying the same thing here.

Fine.

>
>>
>>
>>> are doing.  So you start reading. Exception1...nop
>>> Exception2...nop...Exception3...Exception999..But I think I'm getting
>>> warmer.  OK - Did not find it ... but I'm fairly certain that there is
>>> an elegant place for it somewhere in the exception hierarchy...
>>
>> I agree. The exception hierarchy is a mess.
>
> And in all fairness it seemed really elegant at first, and we all went
> with it.
>>
>>>
>>>
>>> Handling of Exceptions
>>> ===================
>>> If our app uses several of the commons math classes (That throw
>>> exceptions of the same type), and one of those classes throws an
>>> exception, what is the app supposed to do?
>>
>> It depends on the application.
>
> But it's possible to architect CM so that the case is that the developer
> is led down one road only.  And it's like a Hyperloop. They get in, and
> 7 minutes later, they are sipping a Mojito in Cancun :).
>
>> Apache Commons Math is a low level
>> library it is used in many different contexts and there is no single
>> answer.
>
> Unless there is.

An answer that requires every intermediate level to pass the error
up is not an answer to me.

>>
>>
>>>
>>> I think most developers would find that question somewhat challenging.
>>>  There are numerous strategies.  Catch all exceptions and log what
>>> happened, etc.  But what if the requirement is that if an exception is
>>> thrown, the organization that receives it has 0 seconds to get to the
>>> root cause of it and understand the dynamics. Is this doable? (Yes
>>> obviously, but how hard is it...?).
>>
>> It is not Apache Commons Math level of decision.
> Not right now, but it could be.

We are not even capable of doing our own regular math job. Let's do
that before attempting to develop the design pattern of the next century.

>
>> We provide the exception
>> and users can catch it fast. What users do with it is up to them.
>
> Sure.  That's how it works right now, but this is causing both more CM
> core developer overhead and API user overhead than there would be if
> exceptions are replaced with Enums and throwing exceptions is replaced
> with a callback.

A method is developed once in CM and called thousands of times in users'
code (by several different users, by the way). So we would trade the
development time for three lines of code on our side against the development
time of a few thousand lines of code on our users' side. It's not nice.

>
>>
>>
>>>
>>>
>>>>>> It seems that the reporting interfaces could quickly overwhelm
>>>>>> the "actual" code (one type of context per algorithm).
>>>>> There would be one type of Observer interface per algorithm. It would
>>>>> act on the solution and what are currently exceptions, although these
>>>>> would be translated into enums.
>>>>
>>>> Unless I'm mistaken, the most common use-case for codes implemented
>>>> in a library such as CM is to provide a correct answer or bail out
>>>> in a non-equivocal way.
>>> Most java developers are used to synchronous coding...call the method
>>> get the response...catch the exception if needed.  This is changing
>>> with JDK8, and as we evolve and start using lambdas, we become more
>>> accustomed to the functional callback style of programming.
>>> Personally I want to be able to use an API that gives me what I need
>>> when everything works as expected, allows me to resolve unexpected
>>> issues with minimal effort, and is as simple, fluid, and lightweight
>>> as possible.
>>>
>>>>
>>>> It would make the code more involved to handle a minority of
>>>> (undefined) cases. [Actual examples would be welcome in order to
>>>> focus the discussion.]
>>>
>>> Rough Outline (I've evolved the concept and moved away from the
>>> OptimizationContext in the process of writing):
>>>
>>> interface LevenbergMarquardtObserver {
>>>
>>>     void hola(Solution s);
>>>     void sugarHoneyIceTea(ResultType rt, Diagnostic d);
>>> }
>>>
>>> public class LMObserver implements LevenbergMarquardtObserver {
>>>
>>>    private Application application;
>>>
>>>    public LMObserver(Application application) {
>>>        this.application = application;
>>>    }
>>>
>>>    public void hola(Solution s) {
>>>        application.next(s);
>>>    }
>>>
>>>    public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
>>>        if (rt == ResultType.I_GOT_THIS_ONE) {
>>>             //I looked at the commons unit tests for this algorithm, evaluating
>>>             //the diagnostics that show how this failure can occur
>>>             //I'm totally fixing this!  Steps aside!
>>>        }
>>>        else if (rt == ResultType.REALLY_COMPLICATED_STUFF)
>>>        {
>>>            //We need our best engineers...call India.
>>>        }
>>>    }
>>> }
>>>
>>>
>>> public class Application {
>>>     //Note nothing is returned.
>>>     LevenbergMarquardtOptimizer.setObserver(new
>>> LMObserver(this)).setLeastSquaresProblem(new
>>> ClassThatImplementsTheProblem()).start();
>>>
>>>     public void next(Solution solution) {
>>>
>>>         //Do cool stuff.
>>>
>>>     }
>>> }
>>>
>>> Or an asynchronous variation:
>>>
>>> public class Application {
>>> //This call will not block because async is true
>>>     LevenbergMarquardtOptimizer.setAsync(true).setObserver(new
>>> LMObserver()).setLeastSquaresProblem(new
>>> ClassThatImplementsTheProblem()).start();
>>>
>>>     //Do more stuff right away.
>>>
>>>     public void next(Solution solution) {
>>>         //When the thread running the optimization is done, this
>>> method is called back.
>>>         //Do whatever comes next
>>>     }
>>> }
>>>
>>> The above would start the optimization in a separate thread that does
>>> not / SHOULD NOT share data with the main thread.
>>>
>>>>
>>>>>> The current reporting is based on exceptions, and assumes that if no
>>>>>> exception was thrown, then the user's request completed successfully.
>>>>> Sure - personally I'd much rather deal with something similar to an
>>>>> HTTP status code in a callback, than an exception .  I think the code
>>>>> is cleaner and the callback makes it more elegant to apply an adaptive
>>>>> approach to handling the response, like slightly relaxing constraints,
>>>>> convergence parameters, etc.  Also by getting rid of the exceptions,
>>>>> we no longer depend on the I18N layer that they are tied to and now
>>>>> the messages can be more informative, since they target the root
>>>>> cause.  The observer can also run in the 'main' thread while the
>>>>> optimization can run asynchronously.  Also WRT JDK9 and modules,
>>>>> losing the exceptions would mean one less dependency when the library
>>>>> is split up into JDK9 modules...which would be more in line with this
>>>>> philosophy:
>>>>> https://github.com/substack/browserify-handbook#module-philosophy
>>>>
>>>> I'm not sure I fully understood the philosophy from the text in this
>>>> short paragraph.
>>>> But I do not agree with the idea that the possibility to quickly find
>>>> some code is more important than standards and best practices.
>>>
>>> If you go to npmjs.org and type in Neural Network you will get 56
>>> results all linked to github repositories.
>>>
>>> In addition there's meta data indicating number of downloads in the
>>> last day, last month, etc.  Try typing in cosine.  Odds are you will
>>> find a package that does just what you want and nothing else. This is
>>> very underwhelming and refreshing in terms of cloning off of github
>>> and getting familiar with tests etc.  Also eye opening.  How many of us
>>> knew that we could do that much stuff with cosine! :).
>>>
>>>>
>>>>>> I totally agree that in some circumstances, more information on the
>>>>>> inner working of an algorithm would be quite useful.
>>>>> ... Algorithm iterations become unit testable.
>>>>>>
>>>>>> But I don't see the point in devoting resources to reinvent the
>>>>>> wheel:
>>>>> You mean pimping the wheel?  Big pimpin.
>>>>
>>>> I think that logging statements are easy to add, not disruptive at all,
>>>> and come in handy to understand a code's unexpected behaviour.
>>>> Assuming that a "logging" feature is useful, it can be added *now*
>>>> using
>>>> a dependency towards a weight-less (!) framework such as "slf4j".
>>>> IMO, it would be a waste of time to implement a new communication layer
>>>> that can do that, and more, if it would be used for logging only in 99%
>>>> of the cases.
>>> SLF4J is used by almost every other framework, so why not use it?
>>> Logging and the diagnostic could be used together.  The primary
>>> purpose of the diagnostic though is to collect data that will be
>>> useful in `sugarHoneyIceTea`.
>>>
>>>>
>>>>>>
>>>>>> I longed several times for the use of a logging library.
>>>>>> The only show-stopper has been the informal "no-dependency" policy...
>>>>> JDK9 Jigsaw should solve dependency hell, so the less coupling
>>>>> between commons math classes the better.
>>>>
>>>> I wouldn't call "coupling" the dependency towards exception classes:
>>>> they are little utilities that can make sense in various parts of the
>>>> library.
>>>
>>> If for example the Simplex solver is broken off into its own module,
>>> then it has to be coupled to the exceptions, unless it is exception
>>> free.
>>>
>>>>
>>>> [Unless one wants to embark on yet another discussion about exceptions;
>>>> whether there should be one class for each of the "messages" that exist
>>>> in "LocalizedFormats"; whether localization should be done in CM;
>>>> etc.]
>>>
>>> I think it would be best to just eliminate the exceptions.
>>
>> NO! A big no!
> Before the NO is that big, how about examining just a few simple cases
> where it's done?  You might come around to liking it.  Also it does not
> need to be a all or nothing.  Some modules could be exception free,
> whereas others, could throw exceptions because it might be the simplest
> thing to do and what people are used to.
>>
>> Apache Commons Math is a very low level library. There are use cases
>> where you have huge code that relies on math almost everywhere.
> True dat. But it's quite possible that some of the math that is being
> used does not actually need to throw the exception.  There are multiple
> exception categories:
>
> - Exceptions thrown by invalid input parameters
> - Exceptions that signal that either something completely unexpected
> happened (Even we are like WTF?)
> - Exceptions that could happen in certain edge conditions that we
> understand and know how to deal with
>
> The first one we can eliminate by supplying a self validating object to
> the routine.  The next two are the ones where it might really be helpful
> if they are communicated as Enums via a callback.
>
> It could be reviewed on a case by case basis, in a process that
> considers moving to JDK9 modules.
>
>> Look at
>> the Orekit library for example. We use Vector3D everywhere, we use
>> Rotation
>> everywhere, we use ode in many places, we use some linear algebra, we use
>> optimizers, we use a few statistics, we use root solvers, we use
>> derivatives,
>> we use BSP
>
> No problem just get rid of the 3D rotation, kill the Vector3D, forget
> statistics (Way too complicated!), and it's all good :). Kidding
> obviously, but this does highlight what I'm talking about. If there are
> tons of CM classes being utilized within a single method, and all of
> them can throw generic exceptions that cut across multiple classes, the
> developer has to decode the exception, causing her to waste time.
>
>
>> ... Some of these uses look like a large scale call/return pattern
>> where you could ask users to check the result afterwards, but many,
>> many of
>> the calls are much lower levels and you have literally several
>> thousands of
>> calls to math.
> That's all fine.  CM will still have the exact same workings - it will
> just be more efficient.
>
>> Just think about forcing user to check a vector can be normalized
>> (i.e. has not 0 norm) everywhere a vector is used. We would end up with
>> something like:
>
>>
>>   Rotation r = new Rotation(a, alpha);
>>   if (!r.isValid()) {
>>     // a vector was null
>>     return error;
>>  }
>
> In this case the client developer could have made sure the vector was
> not null and valid prior to passing it to a routine.  It's possible that
> CM will be more lightweight, efficient, and friendly if API users are
> told that it's their responsibility to validate parameters.  So:
>
> RotationContext r = new RotationContext(a, alpha);
> if (!r.isValid()) {
>     // Dag Nab It!!!
> }
> else {
>    //Rotate away
>    Rotation rotation = new Rotation(rotationContext);
> }
>
> Or if we know that the parameters are valid (Mass validation):
> Rotation rotation = new Rotation(a, alpha);
>>
>> Repeat the above 4325 times in your code ... welcome back to the 80's.
>
> We only have to repeat the validation check 4325 times if there is
> actually a possibility that the vector is invalid.

You didn't get what I meant. Exceptions are for, well, exceptional cases.
These cases are often not predictable and are buried in highly
complex algorithms. So you cannot know for sure you will not have a
zero vector there. This vector itself is computed from another complex
algorithm, itself fed by other complex algorithms and so on. There are
no clear lines saying: here vectors are known to be valid, and here they
are unknown and should be checked by users beforehand because CM will never
tell the user something bad occurs when it occurs. Shit happens and
we need to help users, not tell them "we have warned you, you are
a bad guy because you did not check your data, but we are good guys
even though we did not tell you when we saw the bad data, your fault".

You cannot ask users to cripple their code with either a posteriori
error checking or a priori validation at *all* call sites for
exceptional conditions that occur once every million calls. And yes, one
problem every million calls is a big problem when you handle huge data.
A recent project I worked on was linked to the Sentinel 2 Earth
observation satellite. It generates terabytes of data and every single
pixel computation needs a lot of geometric transforms (and rotations
and others). For now we have:

 Rotation r = new Rotation(a, alpha)

or similar things for dot products, interpolators, solvers, optimizers
and so on in highly critical code that needs to run very very fast.
Somewhere above (really lots of layers above), we catch exceptions when
they occur, very rarely (say once every few tens of millions of pixels,
i.e. a few times each second or minute) and simply try another much slower
algorithm for this single pixel, then come back to the regular fast
algorithm for the next few hundred million pixels that are waiting to
be processed. Basically exceptions allow us to break out of the deep
layers of algorithms immediately. Continuous pre- or post-checking at
every call site is impractical and would be much slower. This is akin
to malloc/free versus garbage collection.
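
The per-pixel fallback can be sketched as follows. This is only an illustration of the control flow under stated assumptions: FallbackSketch, FastPathException, fast(), transform() and slow() are hypothetical stand-ins, not the actual processing code, and the "algorithms" are trivial arithmetic:

```java
/** Hypothetical sketch: fast() and slow() are stand-ins, not real image-processing code. */
class FallbackSketch {

    /** Signals that the fast algorithm cannot handle this input. */
    static class FastPathException extends RuntimeException {
        FastPathException(String message) {
            super(message);
        }
    }

    // Fast algorithm; fails on a rare input (here: exactly zero).
    static double fast(double pixel) {
        if (pixel == 0) {
            throw new FastPathException("degenerate pixel");
        }
        return 1.0 / pixel;
    }

    // Intermediate layer: no error handling, the exception just passes through.
    static double transform(double pixel) {
        return 2.0 * fast(pixel);
    }

    // Slower alternative that copes with the degenerate case.
    static double slow(double pixel) {
        return pixel == 0 ? 0.0 : 2.0 / pixel;
    }

    // Top level: the single place where the exception is caught.
    static double[] processAll(double[] pixels) {
        double[] out = new double[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            try {
                out[i] = transform(pixels[i]);
            } catch (FastPathException e) {
                out[i] = slow(pixels[i]); // fall back for this pixel only
            }
        }
        return out;
    }
}
```

The loop resumes the fast path on the very next pixel, and nothing between processAll() and fast() had to know that a fallback exists.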


So after all, this long mail is centered on one idea:

 Exceptions allow automatic code rerouting without user handling
 at lots and lots of intermediate levels. Callbacks (or pre-validation,
 which is not always possible) force user handling at all intermediate
 levels.

best regards,
Luc

>
>>
>> Exceptions are meant for, well, exceptional situations.
> These are exceptional times :)
>> They avoid people
>> to cripple the calling code with if(error) statements.
> We need if(error) at some point.  I agree that if(error) should be
> minimized.
>
>> They ensure (and this
>> is the most important part) that as soon as the error is detected at very
>> low level it will be identified and reported back to the upper level
>> where
>> it is caught and displayed, without forcing *all* the intermediate levels
>> to handle it.
>
> Yes but this causes the "Root cause analysis deciphering effect" I've
> been talking about.
>
>> When you have complex code with complex algorithms and several
>> nested levels of calls in different libraries, maintained by different
>> teams
>> and using tens of thousands of calls to math, you don't want a
>> call/return/check error
>> type of programming like we used to do in the 80's.
> Ahh the Commodore 64 days...I miss those days...Anyways - no one wants
> to be hung up on some clown from the 80s.  Right now we are hung up on a
> clown from the 90s though!  We need a Kim Kardashian API. :)
>
> Cheers,
> - Ole
>
>
>>
>> best regards,
>> Luc
>>
>>>
>>>>
>>>>> Anyways I'm obviously
>>>>> interested in playing with this stuff, so when I get something up into
>>>>> a repository I'll do a callback :).
>>>>
>>>> If you are interested in big overhauls, there is one that gathered
>>>> relative consensus: rewrite the algorithms in a "multithread-friendly"
>>>> way.
>>> I think that's a tall order that will take us into JDK88 :). But
>>> using callbacks and making potentially long running computations
>>> asynchronous could be a middle ground that would allow simple multi
>>> threaded use without fiddling around under the hood...
>>>
>>>>
>>>> Some ideas were floated (cf. ML archive) but no implementation or
>>>> experiment...  Perhaps with a well-defined goal such as performance
>>>> improvement, your design suggestions will become clearer to more
>>>> people.
>>>>
>>>> AFAIK, only the classes in the "o.a.c.m.neuralnet" package are
>>>> currently
>>>> ready to be used with the "java.util.concurrent" framework.
>>> FWIU Neural Nets are a great fit for concurrency.  I think for the
>>> others we will end up having discussions around how users would
>>> control the number of threads, etc. again that makes some of us
>>> nervous.  An asynchronous operation that runs in one separate thread
>>> is easier to reason about.  If we want to test 10 neural net
>>> configurations, and we have 10 cores, then we can start each by itself
>>> by doing something like:
>>>
>>> Network.setAsync(true).addNeurons().connectNeurons().addObserver(observer).start();
>>>
>>> //Now do 10 more
>>> //If the observer is shared then notifications should be thread safe.
>>>
>>> Cheers,
>>> - Ole
>>>
>>> P.S. Dang that was a long email.  If I write one more of these, ban
>>> me :)
>>>
>>>>
>>>>
>>>> Best regards,
>>>> Gilles
>>>>
>>>>>
>>>>> Cheers,
>>>>> Ole
>>>>>
>>>>
>>>>

Re: [Math] LeastSquaresOptimizer Design

Gilles Sadowski
> [...]
>
> CM is not intended to be a design pattern people should mimic.
> We are so bad at this

The crux is that the project's team is in effect not _interested_
in this.  [And I admit that I had not understood it for a long
time (hence the temptation to convince that it was important for
*some* people).]

> it would be a shame. No one in its right mind would copy
> or reuse this stuff. It is for internal use only

Then why is it so difficult to change (cf. all the nit-picking
about backward-compatibility)?
As was (relatively) recently discussed, we could "mark" some code
"for internal use" and be free to break compatibility at any time,
for the sake of (an attempt at) a better design.

> and we don't even have
> the resources to manage it by ourselves

There are (maybe) other people (like Ole?) who would like to
experiment with new design ideas (not new math algorithms!)
but are repelled by the (overly) conservative development process
which is mainly feature-driven (like in a commercial project,
shall I dare to say).

> so we can't consider it as a
> path people should follow as we are leading them. Here we would be
> leading them directly against the wall.

True, unfortunately.
There is really no long-term design. Even short term (quasi-)decisions
when they concern the library as a whole, are not followed by action
(cf. "fluent API")...

> [...]

Best,
Gilles



Re: [Math] LeastSquaresOptimizer Design

ole ersoy
In reply to this post by Luc Maisonobe-2
On 09/23/2015 03:09 PM, Luc Maisonobe wrote:
> CM is not intended to be a design pattern people should mimic. We are so bad at this it would be a shame. No one in its right mind would copy or reuse this stuff. It is for internal use only and we don't even have the resources to manage it by ourselves so we can't consider it as a path people should follow as we are leading them. Here we would be leading them directly against the wall.

Hehe - I think that's like Michael Jordan saying - "Guys, don't try to be like me.  I just play a little ball.  Dunk from the free throw line.  Six world championships, but THATs it!".  In any case, I really appreciate you and Gilles taking the time to talk.  Luc (And possibly Gilles) - I can actually see why you are getting a bit annoyed, because I'm ignoring something important.

I've been doing 90% NodeJS stuff lately (which is event loop based and relies on callbacks) so I forgot one very important thing that I think you have both tried to tell me.  The exception undoes the current callstack / breaks the current program flow, bubbling up to the handler.  Thaaaats a good point.

OK - So scratch the callback thinking for synchronous code.  The Lombok stuff should still be good though and hopefully some of the callback discussion around an asynchronous option - I hope!  Geez.

What do you think about having one exception per class with an Enum that encodes the various types of exceptional conditions that the class can find itself in?  So in the case of LevenbergMarquardtOptimizer there would be a:
- LevenbergMarquardtOptimizerException:
- LevenbergMarquardtOptimizerExceptionEnum

When the exception is thrown it sets the Enum indicating the root cause.  The enum can then be used as a key to look up the corresponding message.
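
A minimal sketch of that idea, assuming hypothetical names throughout (none of these types exist in Commons Math; the two enum constants are borrowed from the earlier optimize() outline, and the message texts are made up for illustration):

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical enum: one constant per exceptional condition of the class.
enum LevenbergMarquardtOptimizerExceptionEnum {
    TOO_SMALL_COST_RELATIVE_TOLERANCE,
    TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE
}

// Hypothetical exception: exactly one per class, carrying the enum as root cause.
class LevenbergMarquardtOptimizerException extends RuntimeException {
    private final LevenbergMarquardtOptimizerExceptionEnum type;

    LevenbergMarquardtOptimizerException(LevenbergMarquardtOptimizerExceptionEnum type) {
        super(type.name());
        this.type = type;
    }

    LevenbergMarquardtOptimizerExceptionEnum getType() {
        return type;
    }
}

class EnumExceptionSketch {
    // Messages live outside the exception and are looked up by enum key.
    static final Map<LevenbergMarquardtOptimizerExceptionEnum, String> MESSAGES =
        new EnumMap<>(LevenbergMarquardtOptimizerExceptionEnum.class);

    static {
        MESSAGES.put(LevenbergMarquardtOptimizerExceptionEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
                     "cost relative tolerance is too small");
        MESSAGES.put(LevenbergMarquardtOptimizerExceptionEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
                     "parameters relative tolerance is too small");
    }

    static String describe(LevenbergMarquardtOptimizerException e) {
        return MESSAGES.get(e.getType());
    }

    public static void main(String[] args) {
        try {
            throw new LevenbergMarquardtOptimizerException(
                LevenbergMarquardtOptimizerExceptionEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE);
        } catch (LevenbergMarquardtOptimizerException e) {
            // The handler decides what to show; the exception only identifies the cause.
            System.out.println(e.getType() + ": " + describe(e));
        }
    }
}
```

The message table could just as well be a resource bundle keyed by the enum name; the only point of the sketch is that the exception carries the key, not the text.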

Any better?

Cheers,
- Ole



Re: [Math] LeastSquaresOptimizer Design

Luc Maisonobe-2
In reply to this post by Gilles Sadowski
Hi Gilles,

Le 2015-09-23 23:00, Gilles a écrit :

>> [...]
>>
>> CM is not intended to be a design pattern people should mimic.
>> We are so bad at this
>
> The crux is that the project's team is in effect not _interested_
> in this.  [And I admit that I had not understood it for a long
> time (hence the temptation to convince that it was important for
> *some* people).]
>
>> it would be a shame. No one in its right mind would copy
>> or reuse this stuff. It is for internal use only
>
> Then why is it so difficult to change (cf. all the nit-picking
> about backward-compatibility)?
> As was (relatively) recently discussed, we could "mark" some code
> "for internal use" and be free to break compatibility at any time,
> for the sake of (an attempt at) a better design.

I think it would be nice. IMHO, we are overzealous on compatibility.
Sometimes, I try to introduce some changes that may break compatibility
early for some non user-implementable interfaces but have to withdraw
my proposal (tried it for ode, tried it for BSP trees, think I tried it
for optimizers).

>
>> and we don't even have
>> the resources to manage it by ourselves
>
> There are (maybe) other people (like Ole?) who would like to
> experiment with new design ideas (not new math algorithms!)
> but are repelled by the (overly) conservative development process
> which is mainly feature-driven (like in a commercial project,
> shall I dare to say).

Surely. But it is not specifically commercial vs open source I think.
I know commercial projects that break compatibility all the time
(perhaps to force users to buy a new licence) and some that are stable.
I know open-source projects that break compatibility all the time
(perhaps because the team is highly dynamic or doesn't care about users)
and some that are stable.

>
>> so we can't consider it as a
>> path people should follow as we are leading them. Here we would be
>> leading them directly against the wall.
>
> True, unfortunately.
> There is really no long-term design. Even short term (quasi-)decisions
> when they concern the library as a whole, are not followed by action
> (cf. "fluent API")...

I still have fluent API in my TODO list and still really want to make
it appear. The major blocking factor is available time (and sometimes
also despair about probable endless forthcoming discussions but it is
only second in the list).

best regards,
Luc

>
>> [...]
>
> Best,
> Gilles
>
>

Re: [Math] LeastSquaresOptimizer Design

Luc Maisonobe-2
In reply to this post by ole ersoy
Le 2015-09-24 04:16, Ole Ersoy a écrit :

> On 09/23/2015 03:09 PM, Luc Maisonobe wrote:
>> CM is not intended to be a design pattern people should mimic. We are
>> so bad at this it would be a shame. No one in its right mind would
>> copy or reuse this stuff. It is for internal use only and we don't
>> even have the resources to manage it by ourselves so we can't consider
>> it as a path people should follow as we are leading them. Here we
>> would be leading them directly against the wall.
>
> Hehe - I think that's like Michael Jordan saying - "Guys, don't try to
> be like me.  I just play a little ball.  Dunk from the free throw
> line.  Six world championships, but THATs it!".  In any case, I really
> appreciate you and Gilles taking the time to talk.  Luc (And possibly
> Gilles) - I can actually see why you are getting a bit annoyed,
> because I'm ignoring something important.
>
> I've been doing 90% NodeJS stuff lately (Which is event loop based and
> relies on callbacks) so I forgot one very important thing that I think
> you have both tried to tell me.  The exception undoes the current
> callstack / breaks the current program flow, bubbling up to the
> handler.  Thaaaats a good point.
>
> OK - So scratch the callback thinking for synchronous code.  The
> Lombok stuff should still be good though and hopefully some of the
> callback discussion around an asynchronous option - I hope!  Geez.
>
> What do you think about having one exception per class with an Enum
> that encodes the various types of exceptional conditions that the
> class can find itself in?  So in the case of
> LevenbergMarquardtOptimizer there would be a:
> - LevenbergMarquardtOptimizerException:
> - LevenbergMarquardtOptimizerExceptionEnum
>
> When the exception is thrown it sets the Enum indicating the root
> cause.  The enum can then be used as a key to lookup the corresponding
> message.
>
> Any better?

Sure. I would suggest adding some parameters to help the upper level
format a meaningful message (say, the number of iterations performed if
you hit a max iteration, so users become aware they should have set the
limit higher). Nothing over-engineered: a simple Object[] that can be
used as the last argument to something like String.format() would be
enough.
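
To make the suggestion concrete, here is a rough sketch under stated assumptions: the names `OptimizerFailure` and `OptimizerException` and the message patterns are invented for illustration and are not Commons Math API. The enum identifies the condition, and the `Object[]` holds the variable parts handed to `String.format()` at the upper level.

```java
import java.util.Locale;

// Hypothetical enum: each constant identifies a failure and owns a format pattern.
enum OptimizerFailure {
    MAX_ITERATIONS_EXCEEDED("maximal number of iterations (%d) exceeded"),
    TOO_SMALL_COST_RELATIVE_TOLERANCE("cost relative tolerance %s is too small");

    private final String pattern;

    OptimizerFailure(String pattern) {
        this.pattern = pattern;
    }

    String pattern() {
        return pattern;
    }
}

// Hypothetical exception: the enum key plus the variable parts, nothing more.
class OptimizerException extends RuntimeException {
    private final OptimizerFailure failure;
    private final Object[] parts;

    OptimizerException(OptimizerFailure failure, Object... parts) {
        super(failure.name());
        this.failure = failure;
        this.parts = parts.clone();
    }

    OptimizerFailure getFailure() {
        return failure;
    }

    Object[] getParts() {
        return parts.clone();
    }

    // Formatting happens only when somebody at the upper level asks for it.
    String format(Locale locale) {
        return String.format(locale, failure.pattern(), parts);
    }
}

class PartsSketch {
    public static void main(String[] args) {
        try {
            throw new OptimizerException(OptimizerFailure.MAX_ITERATIONS_EXCEEDED, 100);
        } catch (OptimizerException e) {
            // The caller sees both the machine-readable key and a readable message.
            System.out.println(e.getFailure() + " -> " + e.format(Locale.ROOT));
        }
    }
}
```

Keeping the parts as raw objects means a caller that wants to react programmatically (for example by raising the iteration limit) can read them without parsing the message.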

best regards,
Luc

>
> Cheers,
> - Ole
>
>

Re: [Math] LeastSquaresOptimizer Design

ole ersoy


On 09/24/2015 06:31 AM, luc wrote:

> Le 2015-09-24 04:16, Ole Ersoy a écrit :
>> On 09/23/2015 03:09 PM, Luc Maisonobe wrote:
>>> CM is not intended to be a design pattern people should mimic. We are so bad at this it would be a shame. No one in its right mind would copy or reuse this stuff. It is for internal use only and we don't even have the resources to manage it by ourselves so we can't consider it as a path people should follow as we are leading them. Here we would be leading them directly against the wall.
>>
>> Hehe - I think that's like Michael Jordan saying - "Guys, don't try to
>> be like me.  I just play a little ball.  Dunk from the free throw
>> line.  Six world championships, but THATs it!".  In any case, I really
>> appreciate you and Gilles taking the time to talk.  Luc (And possibly
>> Gilles) - I can actually see why you are getting a bit annoyed,
>> because I'm ignoring something important.
>>
>> I've been doing 90% NodeJS stuff lately (Which is event loop based and
>> relies on callbacks) so I forgot one very important thing that I think
>> you have both tried to tell me.  The exception undoes the current
>> callstack / breaks the current program flow, bubbling up to the
>> handler.  Thaaaats a good point.
>>
>> OK - So scratch the callback thinking for synchronous code.  The
>> Lombok stuff should still be good though and hopefully some of the
>> callback discussion around an asynchronous option - I hope! Geez.
>>
>> What do you think about having one exception per class with an Enum
>> that encodes the various types of exceptional conditions that the
>> class can find itself in?  So in the case of
>> LevenbergMarquardtOptimizer there would be a:
>> - LevenbergMarquardtOptimizerException:
>> - LevenbergMarquardtOptimizerExceptionEnum
>>
>> When the exception is thrown it sets the Enum indicating the root
>> cause.  The enum can then be used as a key to lookup the corresponding
>> message.
>>
>> Any better?
>
> Sure. I would suggest adding some parameters to help the upper level formatting
> a meaningful message (say the number of iterations performed if you hit a max
> iteration, so users become aware they should have set the limit higher). Nothing
> over-engineered, a simple Object[] that can be used as last argument to something
> like String.format() would be enough.
Brilliant - I'll setup a repository and start experimenting.  Thanks again,
- Ole



Re: [Math] LeastSquaresOptimizer Design

ole ersoy
In reply to this post by Luc Maisonobe-2
Hi Luc,

I gave this some more thought, and I think I may have tapped out too soon, even though you are absolutely right about what an exception does in terms of bubbling execution to a point where it stops or we handle it.

Suppose we have an Optimizer and an Optimizer observer.  The optimizer will emit one of the following events in the process of stepping through to the max number of iterations it is allotted:
- SOLUTION_FOUND
- COULD_NOT_CONVERGE_FOR_REASON_1
- COULD_NOT_CONVERGE_FOR_REASON_2
- END (Max iterations reached)

So we have the observer interface:

interface OptimizerObserver {

     void success(Solution solution);
     void update(Enum<?> reason, Optimizer optimizer);
     void end(Optimizer optimizer);
}

So if the Optimizer notifies the observer of `success`, then the observer does what it needs to with the results and moves on.  If the observer gets an `update` notification, that means that given the current [constraints, numbers of iterations, data] the optimizer cannot finish.  But the update method receives the optimizer, so it can adapt it, and tell it to continue or just trash it and try something completely different.  If the `END` event is reached then the Optimizer could not finish given the number of allotted iterations.  The Optimizer is passed back via the callback interface so the observer could allow more iterations if it wants to...perhaps based on some metric indicating how close the optimizer is to finding a solution.

What this could do is allow the implementation of the observer to throw the exception if 'All is lost!', in which case the Optimizer does not need an exception.  Totally understand that this may not work everywhere, but it seems like it could work in this case.
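
Roughly, the flow above could look like the following sketch. Everything here is hypothetical: `StopReason`, `Solution`, `Optimizer` and `allowMoreIterations` are made-up names, and a Newton iteration for sqrt(2) stands in for a real optimizer just to have something that needs several iterations. The toy never calls `update`; it is kept only to show where the observer could throw.

```java
// Hypothetical types sketching the observer idea; none of them exist in Commons Math.
enum StopReason { COULD_NOT_CONVERGE_FOR_REASON_1, COULD_NOT_CONVERGE_FOR_REASON_2 }

final class Solution {
    final double value;
    Solution(double value) { this.value = value; }
}

interface OptimizerObserver {
    void success(Solution solution);
    void update(StopReason reason, Optimizer optimizer);
    void end(Optimizer optimizer);
}

class Optimizer {
    private int maxIterations;
    private int iterations;
    private double estimate = 1.0; // starting guess for the toy problem

    Optimizer(int maxIterations) { this.maxIterations = maxIterations; }

    // Lets the observer extend the iteration budget from inside end().
    void allowMoreIterations(int extra) { maxIterations += extra; }

    int getIterations() { return iterations; }

    // Toy problem: Newton iteration for sqrt(2), standing in for a real optimizer.
    void run(OptimizerObserver observer) {
        while (true) {
            if (Math.abs(estimate * estimate - 2.0) < 1e-12) {
                observer.success(new Solution(estimate));
                return;
            }
            if (iterations >= maxIterations) {
                int budgetBefore = maxIterations;
                observer.end(this);
                if (maxIterations == budgetBefore) {
                    return; // the observer did not grant more iterations
                }
                continue;
            }
            estimate = 0.5 * (estimate + 2.0 / estimate);
            iterations++;
        }
    }
}

class ObserverSketch {
    public static void main(String[] args) {
        new Optimizer(2).run(new OptimizerObserver() {
            @Override public void success(Solution s) {
                System.out.println("solution: " + s.value);
            }
            @Override public void update(StopReason reason, Optimizer o) {
                // 'All is lost!': the observer, not the optimizer, decides to throw.
                throw new IllegalStateException("All is lost: " + reason);
            }
            @Override public void end(Optimizer o) {
                o.allowMoreIterations(10); // grant a larger budget and let it continue
            }
        });
    }
}
```

Note that this only works because the optimizer loop itself checks what the observer did after each notification, which is the kind of bookkeeping an exception would otherwise do implicitly.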

WDYT?

Cheers,
- Ole



On 09/23/2015 03:09 PM, Luc Maisonobe wrote:

> Le 23/09/2015 19:20, Ole Ersoy a écrit :
>> HI Luc,
> Hi Ole,
>
>> On 09/23/2015 03:02 AM, luc wrote:
>>> Hi,
>>>
>>> Le 2015-09-22 02:55, Ole Ersoy a écrit :
>>>> Hola,
>>>>
>>>> On 09/21/2015 04:15 PM, Gilles wrote:
>>>>> Hi.
>>>>>
>>>>> On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:
>>>>>> On 09/20/2015 05:51 AM, Gilles wrote:
>>>>>>> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>>>>>>>> Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
>>>>>>>> General Optimizer) design.  For example with the
>>>>>>>> LevenbergMarquardtOptimizer we would do:
>>>>>>>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>>>>>>>
>>>>>>>> Rough optimize() outline:
>>>>>>>> public static void optimise() {
>>>>>>>> //perform the optimization
>>>>>>>> //If successful
>>>>>>>>      c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
>>>>>>>> //If not successful
>>>>>>>>
>>>>>>>>
>>>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
>>>>>>>>
>>>>>>>> diagnostic);
>>>>>>>> //or
>>>>>>>>
>>>>>>>>
>>>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
>>>>>>>>
>>>>>>>> diagnostic)
>>>>>>>> //etc
>>>>>>>> }
>>>>>>>>
>>>>>>>> The diagnostic, when turned on, will contain a trace of the last N
>>>>>>>> iterations leading up to the failure.  When turned off, the
>>>>>>>> Diagnostic
>>>>>>>> instance only contains the parameters used to detect failure. The
>>>>>>>> diagnostic could be viewed as an indirect way to log optimizer
>>>>>>>> iterations.
>>>>>>>>
>>>>>>>> WDYT?
>>>>>>> I'm wary of having several different ways to convey information to
>>>>>>> the
>>>>>>> caller.
>>>>>> It would just be one way.
>>>>> One way for optimizer, one way for solvers, one way for ...
>>>> Yes I see what you mean, but I think on a whole it will be worth it to
>>>> add additional sugar code that removes the need for exceptions.
>>>>
>>>>>> But the caller may not be the receiver
>>>>>> (It could be).  The receiver would be an observer attached to the
>>>>>> OptimizationContext that implements an interface allowing it to
>>>>>> observe
>>>>>> the optimization.
>>>>> I'm afraid that it will add to the questions of what to put in the
>>>>> code and how.  [We already had sometimes heated discussions just for
>>>>> the IMHO obvious (e.g. code formatting, documentation, exception...).]
>>>> Hehe.  Yes I remember some of these discussions.  I wonder how much
>>>> time was spent debating the exceptions alone?  Surely everyone must
>>>> have had this feeling in pit of their stomach that there's got to be a
>>>> better way.  On the exception topic, these are some of the issues:
>>>>
>>>> I18N
>>>> ===================
>>>> If you are new to commons math and thinking about designing a commons
>>>> math compatible exception you should probably understand the I18N
>>>> stuff that's bound to exception (and wonder why it's bound to the
>>>> exception).
>>> Not really true.
>> Well a lot of things are gray.  Personally if I'm dealing with an API, I
>> like to understand it, so that there are no surprises.  And I understand
>> that the I18N coupling might not force me to use it, but if I want to be
>> smart about my architecture, and simplify my design, then I should look
>> at it.  Maybe it is a good idea.  Maybe I should just gloss over it?  Am
>> I being sloppy if I just gloss over it?
>>
>> Or is there an alternative that provides the same functionality, or
>> maybe something better, that does not come with any of these side effects?
>>
>>> The I18N was really simple at start.
>> Yup I reviewed it and thought - it's probably no big deal - but as I
>> started looking into reusing the CM exceptions, I decided that it was
>> not worth the complexity.  I think that if I throw an exception it is
>> my responsibility to make sure that whomever receives it has a very
>> simple 0 minute or minimal resolution time for dealing with it, so that
>> when the code is handed over the client feels confident.  It should work
>> like a fridge.  We leave the door open too long, a bell goes off, we
>> close the door.
>>
>>> See the one from
>>> the Orekit project <https://www.orekit.org/forge/projects/orekit/>
>>> which is
>>> still in this state, and you will see adding a new message can be done
>>> really
>>> simply without a lot of stuff.
>> Simple is a relative term.  But really simple is different from simple.
>> Really simple would be the LevenbergMarquardtOptimizer in it's own
>> module, separated from everything else, with a minimal set of
>> dependencies.  If it's not this simple, then it quickly grows more
>> complex as we scale.
>>
>>> I18N here is basically one method (getLocalizedString)
>>> that is never changed in one enumerate class (OrekitMessages), and
>>> adding a message
>>> is only adding one entry to the enumerate, that's all.
>> Sure and when a developer sees that one method, and they know how sharp
>> the Apache people are (Sincerely), they start looking into reusing this
>> design.  So if we are going to send them down that path, then that
>> should be the best path.
> CM is not intended to be a design pattern people should mimic. We are so
> bad at this it would be a shame. No one in its right mind would copy
> or reuse this stuff. It is for internal use only and we don't even have
> the resources to manage it by ourselves so we can't consider it as a
> path people should follow as we are leading them. Here we would be
> leading them directly against the wall.
>
>>> The huge pile we have now is only partly related to I18N.
>> That's what I mean by scaling.  It's like the princess with the pea
>> below the bottom mattress.  She keeps throwing mattresses on top, and
>> the pea is still bothering her.  So after the 10th mattress, she decides
>> they all have to come down until she can find out what's going on.  All
>> 10 mattresses have to come back off.
>>
>>> The context stuff
>>> was introduced as an attempt to solve some programmatic retrieval of
>>> information
>>> at catch time, not related to I18N.
>> A callback would:
>> - Provide the same context through a diagnostic
>> - Decouple from I18N, but still provide the means to retrieve a message.
>> - Be more tightly coupled to the root cause of the exception / error
> Yes, but it would require user to take care of the return context and
> to call it or to pass it above, for every single call. Exception
> removes intermediate code, callback forces you to handle it by
> yourself at all call levels, even where you don't want to handle it
> but simply pass it above as the error management code lies.
>
> And this is not only about bubbling the exception out as in providing
> the data at upper level, which I agree could be done by some global
> handler. It is *interrupting* the current course of operation that is
> the tricky part. If you need to interrupt the run as fast as possible
> when error is detected, either you let the JVM do it for you with
> exception or you put if statements at each call site (either before the
> call if you consider users should validate everything or after the
> call if you don't). By the way, validating input is also not always
> possible. Of course for the simple zero vector problem I used as an
> example it is easy, but for slightly different cases it is not.
> Predicting beforehand that input will generate an error, is the same
> as the halting problem, which is undecidable (at least for Turing
> machines). If you want, replace "create a rotation" by "solve a
> quadratic equation". There, checking that there are no solution is
> done by computing the discriminant, which is really what the solver
> would do (if we had a quadratic solver in CM). Things get worse with
> more complex algorithms, and at the end it appears that a blanket
> statement saying "users should pre-validate input, thus ensuring
> no error can occur" is not sustainable.
>
>>> The huge hierarchy was introduced to
>>> go in a direction where one exception = one type.
>> That's what I was hoping would lead to `one type` equalling the precise
>> root cause of the exception, but this is not true.  An exception can be
>> reused in multiple places with multiple different root causes.
>>
>>> The ArgUtils was introduced
>>> due to some Serialization issues. None of this is I18N.
>> So if we look at this through the lens of:
>> - Core developer productivity
>> - API user productivity
>>
>> Are these helping?
> No, they are not. Or rather they are not for me, but maybe other
> people rely on them, I don't know. This is why I consider we are
> wrong with our current design.
>
>>> Yes our exception is crap. No I18N is not the only responsible.
>> Honestly until I started reading NodeJS code (About six months ago) I
>> was super happy with it, and thought that it was this way, because it's
>> the only way, so that's what we get.  But now I think we can do much
>> better.
>>
>>>> Grab a coffee and spend a few hours, unless you are
>>>> obviously fairly new to Java like some of the people posting for help.
>>>> In this case when the exception occurs, there is going to be a lot of
>>>> tutoring going on on the users list.
>>>>
>>>> Number of Exceptions
>>>> ===================
>>>> Before you do actually design a new exception, you should probably see
>>>> if there is an exception that already fits the category of what you
>>> I don't agree.
>> So you don't think someone should look to reuse an existing
>> exception...because that would lead to a lot more new exceptions?
> What I mean is that adding messages is not wrong by itself. They can
> help users. Ensuring we never duplicate one message takes a lot of
> developers time, which we don't have, and it does not help users.
>
>>> The large list in the enumerate is only a list of
>>> messages for user display.
>> The exceptions are one thing and user display is another, and should we
>> be mixing these?  On the one hand we are saying the CM is a low level
>> library and that the developer should catch the exceptions, and instruct
>> the client on how to handle them, and now we are saying that there are
>> message for user display...
> No, you are missing the consequences of the "low level" argument. Large
> applications can use CM at low level, with some less low level code
> called by intermediate code called by slightly higher level code, itself
> used by high level code and so on. I don't say direct developers should
> catch exception directly above CM, as exceptions should be really rare
> and correspond to ... exceptional situations. This is often handled quite
> high in the hierarchy for large applications that process large data and
> should run really fast (I'll put a real life example below). So yes
> exceptions are caught, but not immediately by the caller, it may be far
> above and could even be at only one place in some cases, not at
> thousands of call sites. At this place, yes, sometimes we simply display
> the error because we don't know what to do. In some other cases, we
> try to circumvent the problem not by dissecting the exception (so we
> often don't even look at the embedded ExceptionContext) but by switching
> to something different. With exceptions, the intermediate level, which
> is very important is much simpler. If you don't know how to react to
> a zero vector at one place because you don't know why this vector is
> zero, you just don't care, the exception will be automatically forwarded
> above by the JVM and you will not even see it. You don't have to check
> for the error just to interrupt yourself the program flow and
> short-circuit to return immediately. The JVM does it for you. You care about
> the exception only at two places : where it is thrown (in CM) and where
> the users know what to do (displaying, stopping or choosing an
> alternative in the rare cases it is possible). There is no dedicated
> code in between.
>
>
>> And this is sort of a gray area, so I don't mean to nit pick it so much,
>> but there is a cleaner way to do this, that does not mix the user
>> display concept with the exception / reporting of something different
>> than a solution.
> A cleaner way without forcing intermediate code to care about an
> error it doesn't know what to do with? Even if the intermediate
> code can only do "if (context.hasError()) return;" it is not clean
> when multiplied by thousands of occurrences.
>
>>> There would really be nothing wrong to add
>>> even more messages,even if they are close to existing ones.
>> What if they are exactly the same as the existing ones...because we were
>> too lazy to scan?  And there's a simple programmatic solution to that,
>> but do we really want this workflow?
> Then we will have duplicates. It will cost a few bytes in memory. It
> will save hours of scarce CM developers time.
>
>>> In fact, I
>>> even think we should avoid trying to reuse messages by merging them all
>>> in something not meaningful to users (remember, messages are for display
>>> only). Currently we too often get a message like:
>>>
>>>    Number is too large, 3 > 2
>>>
>>> Nobody understands this at user level.
>> If instead there's an Enum, tied to the corresponding class, that
>> represents this condition at the line where it happens, then we can
>> leave it up to the developer to craft a message that will serve the
>> client best.
> Sure. By the way, this is what we have: an enumerate called
> LocalizedMessages. Call it MessageIdentifier if you prefer and remove
> the formatting, it would be OK too. The only thing is that in addition
> to the enumerate we also need some variable parts (like the numbers
> involved, the dimensions and so on). So this is why we have the object[]
> parts too. At the beginning this was all we had and it was
> simple. Then we lost our minds and created a monstrous beast no one
> can tame.
>
>> And I'm not saying that there should not be messages.  Just that they
>> should be isolated to a specific context, maintained within the
>> parameters of that context, and looked up by a single unique key only.
>
> Look at OrekitException, there is one OrekitMessages enumerate and one
> Object[] parts. The ExceptionContext is there too but only for
> compatibility with CM when some OrekitExceptions link back to CM
> exceptions. I would be happy to drop it.
>
> And if it is not clear enough by now, yes you could put the
> I18N elsewhere if you want, as long as you have these two pieces of
> information, the enumerate and the parts available. I think it could
> remain there because it is simple, stable and does not need attention
> from developers, but if you insist I18N is evil, just drop it. Don't
> throw the baby out with the bathwater.
>
>>
>> The reused message has lost its
>>> signification. I would really prefer different messages for different
>>> cases
>>> where the fixed part of the format would at least provide some hint to
>>> the
>>> user.
>> Yes I think we are saying the same thing here.
> Fine.
>
>>>
>>>> are doing.  So you start reading. Exception1...nop
>>>> Exception2...nop...Exception3...Exception999..But I think I'm getting
>>>> warmer.  OK - Did not find it ... but I'm fairly certain that there is
>>>> an elegant place for it somewhere in the exception hierarchy...
>>> I agree. The exception hierarchy is a mess.
>> And in all fairness it seemed really elegant at first, and we all went
>> with it.
>>>>
>>>> Handling of Exceptions
>>>> ===================
>>>> If our app uses several of the commons math classes (That throw
>>>> exceptions of the same type), and one of those classes throws an
>>>> exception,what is the app supposed to do?
>>> It depends on the application.
>> But it's possible to architect CM so that the case is that the developer
>> is led down one road only.  And it's like a Hyperloop. They get in, and
>> 7 minutes later, they are sipping a Mojito in Cancun :).
>>
>>> Apache Commons Math is a low level
>>> library it is used in many different contexts and there is no single
>>> answer.
>> Unless there is.
> An answer that requires all intermediate level to pass the error
> above is not an answer to me.
>
>>>
>>>> I think most developers would find that question somewhat challenging.
>>>>   There are numerous strategies.  Catch all exceptions and log what
>>>> happened, etc.  But what if the requirement is that if an exception is
>>>> thrown, the organization that receives it has 0 seconds to get to the
>>>> root cause of it and understand the dynamics. Is this doable? (Yes
>>>> obviously, but how hard is it...?).
>>> It is not Apache Commons Math level of decision.
>> Not right now, but it could be.
> We are not even capable of doing our own regular math job. Let's do
> it before attempting to develop the design pattern of the next century.
>
>>> We provide the exception
>>> and users can catch it fast. What users do with it is up to them.
>> Sure.  That's how it works right now, but this is causing both more CM
>> core developer overhead and API user overhead than there would be if
>> exceptions are replaced with Enums and throwing exceptions is replaced
>> with a callback.
> A method is developed once in CM and called thousands of times in users'
> code (by several different users, by the way). So we would trade the
> development time for three lines of code on our side for the development
> time of a few thousand lines of code on our users' side. It's not nice.
>
>>>
>>>>
>>>>>>> It seems that the reporting interfaces could quickly overwhelm
>>>>>>> the "actual" code (one type of context per algorithm).
>>>>>> There would be one type of Observer interface per algorithm. It would
>>>>>> act on the solution and what are currently exceptions, although these
>>>>>> would be translated into enums.
>>>>> Unless I'm mistaken, the most common use-case for codes implemented
>>>>> in a library such as CM is to provide a correct answer or bail out
>>>>> in a non-equivocal way.
>>>> Most java developers are used to synchronous coding...call the method
>>>> get the response...catch the exception if needed.  This is changing
>>>> with JDK8, and as we evolve and start using lambdas, we become more
>>>> accustomed to the functional callback style of programming.
>>>> Personally I want to be able to use an API that gives me what I need
>>>> when everything works as expected, allows me to resolve unexpected
>>>> issues with minimal effort, and is as simple, fluid, and lightweight
>>>> as possible.
>>>>
>>>>> It would make the code more involved to handle a minority of
>>>>> (undefined) cases. [Actual examples would be welcome in order to
>>>>> focus the discussion.]
>>>> Rough Outline (I've evolved the concept and moved away from the
>>>> OptimizationContext in the process of writing):
>>>>
>>>> interface LevenbergMarquardtObserver {
>>>>
>>>>      public void hola(Solution s);
>>>>      public void sugarHoneyIceTea(ResultType rt, Diagnostic d);
>>>> }
>>>>
>>>> public class LMObserver implements LevenbergMarquardtObserver {
>>>>
>>>>     private Application application;
>>>>
>>>>     public LMObserver(Application application) {
>>>>         this.application = application;
>>>>     }
>>>>
>>>>     public void hola(Solution s) {
>>>>         application.next(s);
>>>>     }
>>>>
>>>>     public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
>>>>         if (rt == ResultType.I_GOT_THIS_ONE) {
>>>>              //I looked at the commons unit tests for this algorithm,
>>>>              //evaluating the diagnostics that show how this failure can occur
>>>>              //I'm totally fixing this!  Steps aside!
>>>>         }
>>>>         else if (rt == ResultType.REALLY_COMPLICATED_STUFF)
>>>>         {
>>>>             //We need our best engineers...call India.
>>>>         }
>>>>     }
>>>> }
>>>>
>>>>
>>>> public class Application {
>>>>      //Note nothing is returned.
>>>>      LevenbergMarquardtOptimizer.setObserver(new LMObserver(this))
>>>>          .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>>>>          .start();
>>>>
>>>>      public void next(Solution solution) {
>>>>
>>>>          //Do cool stuff.
>>>>
>>>>      }
>>>> }
>>>>
>>>> Or an asynchronous variation:
>>>>
>>>> public class Application {
>>>> //This call will not block because async is true
>>>>      LevenbergMarquardtOptimizer.setAsync(true).setObserver(new LMObserver(this))
>>>>          .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>>>>          .start();
>>>>
>>>>      //Do more stuff right away.
>>>>
>>>>      public void next(Solution solution) {
>>>>          //When the thread running the optimization is done, this
>>>>          //method is called back.
>>>>          //Do whatever comes next
>>>>      }
>>>> }
>>>>
>>>> The above would start the optimization in a separate thread that does
>>>> not / SHOULD NOT share data with the main thread.
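A minimal sketch of the asynchronous variation described above, using only java.util.concurrent from the JDK. The class AsyncStarter and its start method are hypothetical names for illustration; nothing like them exists in Commons Math. The sketch assumes the optimization can be expressed as a Supplier and that the two callbacks stand in for the observer's success and failure methods.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper, not part of Commons Math: runs a computation on
// another thread and reports the outcome through callbacks.
final class AsyncStarter {

    private AsyncStarter() { }

    // Returns immediately; one of the callbacks fires once the computation ends.
    static <S> CompletableFuture<S> start(Supplier<S> computation,
                                          Consumer<S> onSolution,
                                          Consumer<Throwable> onFailure) {
        return CompletableFuture.supplyAsync(computation)
            .whenComplete((solution, error) -> {
                if (error == null) {
                    onSolution.accept(solution);
                } else {
                    onFailure.accept(error);
                }
            });
    }
}

final class AsyncStarterDemo {
    public static void main(String[] args) {
        CompletableFuture<Double> pending = AsyncStarter.start(
            () -> Math.sqrt(2.0),                       // stand-in for the optimization
            s -> System.out.println("solution: " + s),  // plays the role of next(...)
            e -> System.out.println("failed: " + e));
        // Other work could happen here; the demo waits only so that it
        // does not exit before the callback has run.
        pending.join();
    }
}
```

Note that the callbacks run on whichever thread completes the future, so an observer shared between several runs would still need to be thread safe.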
>>>>
>>>>>>> The current reporting is based on exceptions, and assumes that if no
>>>>>>> exception was thrown, then the user's request completed successfully.
>>>>>> Sure - personally I'd much rather deal with something similar to an
>>>>>> HTTP status code in a callback than an exception.  I think the code
>>>>>> is cleaner and the callback makes it more elegant to apply an adaptive
>>>>>> approach to handling the response, like slightly relaxing constraints,
>>>>>> convergence parameters, etc.  Also by getting rid of the exceptions,
>>>>>> we no longer depend on the I18N layer that they are tied to and now
>>>>>> the messages can be more informative, since they target the root
>>>>>> cause.  The observer can also run in the 'main' thread while the
>>>>>> optimization runs asynchronously.  Also WRT JDK9 and modules,
>>>>>> losing the exceptions would mean one less dependency when the library
>>>>>> is split up into JDK9 modules...which would be more in line with this
>>>>>> philosophy:
>>>>>> https://github.com/substack/browserify-handbook#module-philosophy
>>>>> I'm not sure I fully understood the philosophy from the text in this
>>>>> short paragraph.
>>>>> But I do not agree with the idea that the possibility to quickly find
>>>>> some code is more important than standards and best practices.
>>>> If you go to npmjs.org and type in Neural Network you will get 56
>>>> results all linked to github repositories.
>>>>
>>>> In addition there's meta data indicating number of downloads in the
>>>> last day, last month, etc.  Try typing in cosine.  Odds are you will
>>>> find a package that does just what you want and nothing else. This is
>>>> very un-overwhelming and refreshing in terms of cloning off of github
>>>> and getting familiar with tests etc.  Also eye opening.  How many of us
>>>> knew that we could do that much stuff with cosine! :).
>>>>
>>>>>>> I totally agree that in some circumstances, more information on the
>>>>>>> inner working of an algorithm would be quite useful.
>>>>>> ... Algorithm iterations become unit testable.
>>>>>>> But I don't see the point in devoting resources to reinvent the
>>>>>>> wheel:
>>>>>> You mean pimping the wheel?  Big pimpin.
>>>>> I think that logging statements are easy to add, not disruptive at all,
>>>>> and come in handy to understand a code's unexpected behaviour.
>>>>> Assuming that a "logging" feature is useful, it can be added *now*
>>>>> using a dependency towards a weight-less (!) framework such as "slf4j".
>>>>> IMO, it would be a waste of time to implement a new communication layer
>>>>> that can do that, and more, if it would be used for logging only in 99%
>>>>> of the cases.
>>>> SLF4J is used by almost every other framework, so why not use it?
>>>> Logging and the diagnostic could be used together.  The primary
>>>> purpose of the diagnostic though is to collect data that will be
>>>> useful in `sugarHoneyIceTea`.
>>>>
>>>>>>> I longed several times for the use of a logging library.
>>>>>>> The only show-stopper has been the informal "no-dependency" policy...
>>>>>> JDK9 Jigsaw should solve dependency hell, so the less coupling
>>>>>> between commons math classes the better.
>>>>> I wouldn't call "coupling" the dependency towards exception classes:
>>>>> they are little utilities that can make sense in various parts of the
>>>>> library.
>>>> If for example the Simplex solver is broken off into its own module,
>>>> then it has to be coupled to the exceptions, unless it is exception
>>>> free.
>>>>
>>>>> [Unless one wants to embark on yet another discussion about exceptions;
>>>>> whether there should be one class for each of the "messages" that exist
>>>>> in "LocalizedFormats"; whether localization should be done in CM;
>>>>> etc.]
>>>> I think it would be best to just eliminate the exceptions.
>>> NO! A big no!
>> Before the NO is that big, how about examining just a few simple cases
>> where it's done?  You might come around to liking it.  Also it does not
>> need to be all or nothing.  Some modules could be exception free,
>> whereas others could throw exceptions because it might be the simplest
>> thing to do and what people are used to.
>>> Apache Commons Math is a very low level library. There are use cases
>>> where you have huge code that relies on math almost everywhere.
>> True dat. But it's quite possible that some of the math that is being
>> used does not actually need to throw the exception.  There are multiple
>> exception categories:
>>
>> - Exceptions thrown by invalid input parameters
>> - Exceptions that signal that something completely unexpected
>> happened (Even we are like WTF?)
>> - Exceptions that could happen in certain edge conditions that we
>> understand and know how to deal with
>>
>> The first one we can eliminate by supplying a self validating object to
>> the routine.  The next two are the ones where it might really be helpful
>> if they were communicated as Enums via a callback.
>>
>> It could be reviewed on a case by case basis, in a process that
>> considers moving to JDK9 modules.
>>
>>> Look at
>>> the Orekit library for example. We use Vector3D everywhere, we use
>>> Rotation everywhere, we use ode in many places, we use some linear
>>> algebra, we use optimizers, we use a few statistics, we use root solvers,
>>> we use derivatives, we use BSP
>> No problem just get rid of the 3D rotation, kill the Vector3D, forget
>> statistics (Way too complicated!), and it's all good :). Kidding
>> obviously, but this does highlight what I'm talking about. If there are
>> tons of CM classes being utilized within a single method, and all of
>> them can throw generic exceptions that cut across multiple classes, the
>> developer has to decode the exception, causing her to waste time.
>>
>>
>>> ... Some of these uses look like a large scale call/return pattern
>>> where you could ask users to check the result afterwards, but many,
>>> many of the calls are at much lower levels and you have literally
>>> several thousands of calls to math.
>> That's all fine.  CM will still have the exact same workings - it will
>> just be more efficient.
>>
>>> Just think about forcing user to check a vector can be normalized
>>> (i.e. has not 0 norm) everywhere a vector is used. We would end up with
>>> something like:
>>>    Rotation r = new Rotation(a, alpha);
>>>    if (!r.isValid()) {
>>>      // a vector was null
>>>      return error;
>>>   }
>> In this case the client developer could have made sure the vector was
>> not null and valid prior to passing it to a routine.  It's possible that
>> CM will be more lightweight, efficient, and friendly if API users are
>> told that it's their responsibility to validate parameters.  So:
>>
>> RotationContext r = new RotationContext(a, alpha);
>> if (!r.isValid()) {
>>      // Dag Nab It!!!
>> }
>> else {
>>     //Rotate away
>>     Rotation rotation = new Rotation(r);
>> }
>>
>> Or if we know that the parameters are valid (Mass validation):
>> Rotation rotation = new Rotation(a, alpha);
>>> Repeat the above 4325 times in your code ... welcome back to the 80's.
>> We only have to repeat the validation check 4325 times if there is
>> actually a possibility that the vector is invalid.
> You didn't get what I meant. Exceptions are for, well, exceptional cases.
> These cases are often not predictable and are buried in highly
> complex algorithms. So you cannot know for sure you will not have a
> zero vector there. This vector itself is computed from another complex
> algorithm, itself fed by other complex algorithms and so on. There are
> no clear lines: here vectors are known to be valid, and there they are
> unknown and should be checked by users beforehand because CM will never
> tell the user something bad occurs when it occurs. Shit happens and
> we need to help users, not tell them "we have warned you, you are
> a bad guy because you did not check your data, but we are good guys
> even though we did not tell you when we saw the bad data, your fault".
>
> You cannot ask users to cripple their code with either a posteriori
> error checking or a priori validation at *all* call sites for
> exceptional conditions that occur once every million calls. And yes, one
> problem every million calls is a big problem when you handle huge data.
> A recent project I worked on was linked to the Sentinel 2 Earth
> observation satellite. It generates terabytes of data and every single
> pixel computation needs a lot of geometric transforms (and rotations
> and others). For now we have:
>
>   Rotation r = new Rotation(a, alpha);
>
> or similar things for dot products, interpolators, solvers, optimizers
> and so on in highly critical code that needs to run very very fast.
> Somewhere above (really lots of layers above), we catch exceptions when
> they occur, very rarely (say once every few tens of millions of pixels,
> i.e. a few times each second or minute) and simply try another much slower
> algorithm for this single pixel, then come back to the regular fast
> algorithm for the next few hundred million pixels that are waiting to
> be processed. Basically exceptions allow us to break out of the deep
> layers of algorithms immediately. Continuous pre- or post-checking at
> every call site is impractical and would be much slower. This is akin
> to malloc/free versus garbage collecting.
>
>
> So after all, this long mail is centered on one idea:
>
>   Exceptions allow automatic code rerouting without user handling
>   at lots and lots of intermediate levels. Callbacks (or pre-validation
>   which is not always possible) force user handling at all intermediate
>   levels.
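A minimal sketch of the rerouting described above, under stated assumptions: every class and method name below (ZeroNormException, PixelProcessor, lineOfSight, fastAlgorithm, slowAlgorithm) is hypothetical and is not Commons Math or Orekit API. The error is detected at the lowest level and caught only at the top; the intermediate methods contain no error handling at all.

```java
// All names here are hypothetical; this is not the Commons Math or Orekit API.
final class ZeroNormException extends RuntimeException {
    ZeroNormException(String message) {
        super(message);
    }
}

final class PixelProcessor {
    private int fallbackCount = 0;

    // Lowest level: the only place where the error is detected.
    static double[] normalize(double[] v) {
        double norm = Math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
        if (norm == 0.0) {
            throw new ZeroNormException("cannot normalize a zero vector");
        }
        return new double[] {v[0] / norm, v[1] / norm, v[2] / norm};
    }

    // Intermediate levels: no error handling code at all.
    static double[] lineOfSight(double[] v) {
        return normalize(v);
    }

    static double fastAlgorithm(double[] v) {
        return lineOfSight(v)[2];
    }

    // Stand-in for the much slower algorithm used for the rare bad pixel.
    static double slowAlgorithm(double[] v) {
        return 0.0;
    }

    // Top level: the only place where the exception is caught.
    double[] processAll(double[][] pixels) {
        double[] results = new double[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            try {
                results[i] = fastAlgorithm(pixels[i]);
            } catch (ZeroNormException e) {
                fallbackCount++;
                results[i] = slowAlgorithm(pixels[i]);
            }
        }
        return results;
    }

    int getFallbackCount() {
        return fallbackCount;
    }
}
```

With a callback or a status return instead, lineOfSight and fastAlgorithm would each need a check to stop early, which is the per-level cost the argument is about.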
>
> best regards,
> Luc
>
>>> Exceptions are meant for, well, exceptional situations.
>> These are exceptional times :)
>>> They keep people
>>> from crippling the calling code with if(error) statements.
>> We need if(error) at some point.  I agree that if(error) should be
>> minimized.
>>
>>> They ensure (and this
>>> is the most important part) that as soon as the error is detected at very
>>> low level it will be identified and reported back to the upper level
>>> where it is caught and displayed, without forcing *all* the intermediate
>>> levels to handle it.
>> Yes but this causes the "Root cause analysis deciphering effect" I've
>> been talking about.
>>
>>> When you have complex code with complex algorithms and several
>>> nested levels of calls in different libraries, maintained by different
>>> teams and using tens of thousands of calls to math, you don't want a
>>> call/return/check error type of programming like we used to do in the
>>> 80's.
>> Ahh the Commodore 64 days...I miss those days...Anyways - no one wants
>> to be hung up on some clown from the 80s.  Right now we are hung up on a
>> clown from the 90s though!  We need a Kim Kardashian API. :)
>>
>> Cheers,
>> - Ole
>>
>>
>>> best regards,
>>> Luc
>>>
>>>>>> Anyways I'm obviously
>>>>>> interested in playing with this stuff, so when I get something up into
>>>>>> a repository I'll do a callback :).
>>>>> If you are interested in big overhauls, there is one that gathered
>>>>> relative consensus: rewrite the algorithms in a "multithread-friendly"
>>>>> way.
>>>> I think that's a tall order that will take us into JDK88 :). But
>>>> using callbacks and making potentially long running computations
>>>> asynchronous could be a middle ground that would allow simple multi
>>>> threaded use without fiddling around under the hood...
>>>>
>>>>> Some ideas were floated (cf. ML archive) but no implementation or
>>>>> experiment...  Perhaps with a well-defined goal such as performance
>>>>> improvement, your design suggestions will become clearer to more
>>>>> people.
>>>>>
>>>>> AFAIK, only the classes in the "o.a.c.m.neuralnet" package are
>>>>> currently ready to be used with the "java.util.concurrent" framework.
>>>> FWIU Neural Nets are a great fit for concurrency.  I think for the
>>>> others we will end up having discussions around how users would
>>>> control the number of threads, etc. again that makes some of us
>>>> nervous.  An asynchronous operation that runs in one separate thread
>>>> is easier to reason about.  If we want to test 10 neural net
>>>> configurations, and we have 10 cores, then we can start each by itself
>>>> by doing something like:
>>>>
>>>> Network.setAsync(true).addNeurons().connectNeurons().addObserver(observer).start();
>>>>
>>>> //Now do 10 more
>>>> //If the observer is shared then notifications should be thread safe.
>>>>
>>>> Cheers,
>>>> - Ole
>>>>
>>>> P.S. Dang that was a long email.  If I write one more of these, ban
>>>> me :)
>>>>
>>>>>
>>>>> Best regards,
>>>>> Gilles
>>>>>
>>>>>> Cheers,
>>>>>> Ole
>>>>>>
>>>>>

Re: [Math] LeastSquaresOptimizer Design

Luc Maisonobe-2
Le 24/09/2015 21:40, Ole Ersoy a écrit :

> Hi Luc,
>
> I gave this some more thought, and I think I may have tapped out too
> soon, even though you are absolutely right about what an exception does
> in terms of bubbling execution to a point where it stops or we handle it.
>
> Suppose we have an Optimizer and an Optimizer observer.  The optimizer
> will emit one of four different events in the process of stepping
> through to the max number of iterations it is allotted:
> - SOLUTION_FOUND
> - COULD_NOT_CONVERGE_FOR_REASON_1
> - COULD_NOT_CONVERGE_FOR_REASON_2
> - END (Max iterations reached)
>
> So we have the observer interface:
>
> interface OptimizerObserver {
>
>     void success(Solution solution);
>     void update(Enum<?> event, Optimizer optimizer);
>     void end(Optimizer optimizer);
> }
>
> So if the Optimizer notifies the observer of `success`, then the
> observer does what it needs to with the results and moves on.  If the
> observer gets an `update` notification, that means that given the
> current [constraints, numbers of iterations, data] the optimizer cannot
> finish.  But the update method receives the optimizer, so it can adapt
> it, and tell it to continue or just trash it and try something
> completely different.  If the `END` event is reached then the Optimizer
> could not finish given the number of allotted iterations.  The Optimizer
> is passed back via the callback interface so the observer could allow
> more iterations if it wants to...perhaps based on some metric indicating
> how close the optimizer is to finding a solution.
>
> What this could do is allow the implementation of the observer to throw
> the exception if 'All is lost!', in which case the Optimizer does not
> need an exception.  Totally understand that this may not work
> everywhere, but it seems like it could work in this case.
>
> WDYT?

With this version, you should also pass the optimizer in case of
success. In most cases, the observer will just ignore it, but in some
cases it may try to solve another problem, or to solve again with
stricter constraints, using the previous solution as the start point
for the more stringent problem. Another case would be to go from a
simple problem to a more difficult problem using some kind of
homotopy.
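A minimal sketch of that suggestion, assuming a toy one-dimensional problem: the names ToyOptimizer, ToyObserver and RefiningObserver are hypothetical and only illustrate the shape of the callback. The optimizer is handed back on success, so the observer can tighten the tolerance and restart from the previous solution.

```java
// Hypothetical observer interface: the optimizer is passed on success too.
interface ToyObserver {
    void success(double solution, ToyOptimizer optimizer);
    void end(ToyOptimizer optimizer);
}

// Toy optimizer minimizing (x - target)^2 by halving the error at each step.
final class ToyOptimizer {
    final double target;
    double tolerance;
    int maxIterations;

    ToyOptimizer(double target, double tolerance, int maxIterations) {
        this.target = target;
        this.tolerance = tolerance;
        this.maxIterations = maxIterations;
    }

    void start(double x, ToyObserver observer) {
        int iterations = 0;
        while (Math.abs(x - target) > tolerance) {
            if (iterations++ == maxIterations) {
                observer.end(this); // allotted iterations used up
                return;
            }
            x += 0.5 * (target - x);
        }
        observer.success(x, this);
    }
}

// Solves a loose problem first, then reuses its solution as the start
// point for the more stringent one.
final class RefiningObserver implements ToyObserver {
    final double finalTolerance;
    double result = Double.NaN;
    int successCalls = 0;
    boolean gaveUp = false;

    RefiningObserver(double finalTolerance) {
        this.finalTolerance = finalTolerance;
    }

    @Override
    public void success(double solution, ToyOptimizer optimizer) {
        successCalls++;
        if (optimizer.tolerance > finalTolerance) {
            optimizer.tolerance = finalTolerance;
            optimizer.start(solution, this); // warm start from previous solution
        } else {
            result = solution;
        }
    }

    @Override
    public void end(ToyOptimizer optimizer) {
        gaveUp = true;
    }
}
```

A call such as new ToyOptimizer(3.0, 1e-2, 100).start(0.0, new RefiningObserver(1e-9)) triggers success twice: once for the loose problem and once for the refined one.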

best regards,
Luc

>
> Cheers,
> - Ole
>
>
>
> On 09/23/2015 03:09 PM, Luc Maisonobe wrote:
>> Le 23/09/2015 19:20, Ole Ersoy a écrit :
>>> Hi Luc,
>> Hi Ole,
>>
>>> On 09/23/2015 03:02 AM, luc wrote:
>>>> Hi,
>>>>
>>>> Le 2015-09-22 02:55, Ole Ersoy a écrit :
>>>>> Hola,
>>>>>
>>>>> On 09/21/2015 04:15 PM, Gilles wrote:
>>>>>> Hi.
>>>>>>
>>>>>> On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:
>>>>>>> On 09/20/2015 05:51 AM, Gilles wrote:
>>>>>>>> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>>>>>>>>> Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
>>>>>>>>> General Optimizer) design.  For example with the
>>>>>>>>> LevenbergMarquardtOptimizer we would do:
>>>>>>>>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>>>>>>>>
>>>>>>>>> Rough optimize() outline:
>>>>>>>>> public static void optimise() {
>>>>>>>>> //perform the optimization
>>>>>>>>> //If successful
>>>>>>>>>      c.notify(LevenberMarquardtResultsEnum.SUCCESS, solution);
>>>>>>>>> //If not successful
>>>>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE, diagnostic);
>>>>>>>>> //or
>>>>>>>>> c.notify(LevenberMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE, diagnostic)
>>>>>>>>> //etc
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> The diagnostic, when turned on, will contain a trace of the last N
>>>>>>>>> iterations leading up to the failure.  When turned off, the
>>>>>>>>> Diagnostic instance only contains the parameters used to detect
>>>>>>>>> failure. The diagnostic could be viewed as an indirect way to log
>>>>>>>>> optimizer iterations.
>>>>>>>>>
>>>>>>>>> WDYT?
>>>>>>>> I'm wary of having several different ways to convey information to
>>>>>>>> the caller.
>>>>>>> It would just be one way.
>>>>>> One way for optimizer, one way for solvers, one way for ...
>>>>> Yes I see what you mean, but I think on a whole it will be worth it to
>>>>> add additional sugar code that removes the need for exceptions.
>>>>>
>>>>>>> But the caller may not be the receiver
>>>>>>> (It could be).  The receiver would be an observer attached to the
>>>>>>> OptimizationContext that implements an interface allowing it to
>>>>>>> observe the optimization.
>>>>>> I'm afraid that it will add to the questions of what to put in the
>>>>>> code and how.  [We already had sometimes heated discussions just for
>>>>>> the IMHO obvious (e.g. code formatting, documentation,
>>>>>> exception...).]
>>>>> Hehe.  Yes I remember some of these discussions.  I wonder how much
>>>>> time was spent debating the exceptions alone?  Surely everyone must
>>>>> have had this feeling in the pit of their stomach that there's got to be a
>>>>> better way.  On the exception topic, these are some of the issues:
>>>>>
>>>>> I18N
>>>>> ===================
>>>>> If you are new to commons math and thinking about designing a commons
>>>>> math compatible exception you should probably understand the I18N
>>>>> stuff that's bound to the exception (and wonder why it's bound to the
>>>>> exception).
>>>> Not really true.
>>> Well a lot of things are gray.  Personally if I'm dealing with an API, I
>>> like to understand it, so that there are no surprises.  And I understand
>>> that the I18N coupling might not force me to use it, but if I want to be
>>> smart about my architecture, and simplify my design, then I should look
>>> at it.  Maybe it is a good idea.  Maybe I should just gloss over it?  Am
>>> I being sloppy if I just gloss over it?
>>>
>>> Or is there an alternative that provides the same functionality, or
>>> maybe something better, that does not come with any of these side
>>> effects?
>>>
>>>> The I18N was really simple at start.
>>> Yup I reviewed it and thought - it's probably no big deal - but as I
>>> started looking into reusing the CM exceptions, I decided that it was
>>> not worth the complexity.  I think that if I throw an exception it is
>>> my responsibility to make sure that whoever receives it has a very
>>> simple 0 minute or minimal resolution time for dealing with it, so that
>>> when the code is handed over the client feels confident.  It should work
>>> like a fridge.  We leave the door open too long, a bell goes off, we
>>> close the door.
>>>
>>>> See the one from
>>>> the Orekit project <https://www.orekit.org/forge/projects/orekit/>
>>>> which is still in this state, and you will see adding a new message can
>>>> be done really simply without a lot of stuff.
>>> Simple is a relative term.  But really simple is different from simple.
>>> Really simple would be the LevenbergMarquardtOptimizer in its own
>>> module, separated from everything else, with a minimal set of
>>> dependencies.  If it's not this simple, then it quickly grows more
>>> complex as we scale.
>>>
>>>> I18N here is basically one method (getLocalizedString)
>>>> that is never changed in one enum class (OrekitMessages), and
>>>> adding a message is only adding one entry to the enum, that's all.
>>> Sure and when a developer sees that one method, and they know how sharp
>>> the Apache people are (Sincerely), they start looking into reusing this
>>> design.  So if we are going to send them down that path, then that
>>> should be the best path.
>> CM is not intended to be a design pattern people should mimic. We are so
>> bad at this it would be a shame. No one in their right mind would copy
>> or reuse this stuff. It is for internal use only and we don't even have
>> the resources to manage it by ourselves so we can't consider it as a
>> path people should follow as we are leading them. Here we would be
>> leading them directly against the wall.
>>
>>>> The huge pile we have now is only partly related to I18N.
>>> That's what I mean by scaling.  It's like the princess with the pea
>>> below the bottom mattress.  She keeps throwing mattresses on top, and
>>> the pea is still bothering her.  So after the 10th mattress, she decides
>>> they all have to come down until she can find out what's going on.  All
>>> 10 mattresses have to come back off.
>>>
>>>> The context stuff
>>>> was introduced as an attempt to solve some programmatic retrieval of
>>>> information at catch time, not related to I18N.
>>> A callback would:
>>> - Provide the same context through a diagnostic
>>> - Decouple from I18N, but still provide the means to retrieve a message.
>>> - Be more tightly coupled to the root cause of the exception / error
>> Yes, but it would require the user to take care of the return context and
>> to call it or to pass it above, for every single call. An exception
>> removes intermediate code; a callback forces you to handle it by
>> yourself at all call levels, even where you don't want to handle it
>> but simply pass it up to where the error management code lies.
>>
>> And this is not only about bubbling the exception out as in providing
>> the data at upper level, which I agree could be done by some global
>> handler. It is *interrupting* the current course of operation that is
>> the tricky part. If you need to interrupt the run as fast as possible
>> when error is detected, either you let the JVM do it for you with
>> exception or you put if statements at each call site (either before the
>> call if you consider users should validate everything or after the
>> call if you don't). By the way, validating input is also not always
>> possible. Of course for the simple zero vector problem I used as an
>> example it is easy, but for slightly different cases it is not.
>> Predicting beforehand that input will generate an error is, in general,
>> the same as the halting problem, which is undecidable (at least for Turing
>> machines). If you want, replace "create a rotation" by "solve a
>> quadratic equation". There, checking that there is no solution is
>> done by computing the discriminant, which is really what the solver
>> would do (if we had a quadratic solver in CM). Things get worse with
>> more complex algorithms, and in the end it appears that a blanket
>> statement saying "users should pre-validate input, thus ensuring
>> no error can occur" is not sustainable.
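A minimal sketch of the quadratic example above, assuming a hypothetical QuadraticSketch class (as noted, Commons Math has no dedicated quadratic solver) and a non-zero leading coefficient. The pre-validation and the solver both compute the same discriminant, so validating first simply does the work twice.

```java
// Hypothetical illustration; not a Commons Math class.
final class QuadraticSketch {

    private QuadraticSketch() { }

    // A priori validation: already needs the discriminant.
    static boolean hasRealRoots(double a, double b, double c) {
        return b * b - 4 * a * c >= 0;
    }

    // The solver computes the very same discriminant and can report the
    // problem itself. Assumes a != 0.
    static double[] solve(double a, double b, double c) {
        double discriminant = b * b - 4 * a * c;
        if (discriminant < 0) {
            throw new ArithmeticException("no real roots");
        }
        double s = Math.sqrt(discriminant);
        return new double[] {(-b - s) / (2 * a), (-b + s) / (2 * a)};
    }
}
```

For x^2 - 3x + 2 the solver returns the roots 1 and 2; for x^2 + 1 it throws, and the caller never had to compute the discriminant separately.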
>>
>>>> The huge hierarchy was introduced to
>>>> go in a direction where one exception = one type.
>>> That's what I was hoping would lead to `one type` equalling the precise
>>> root cause of the exception, but this is not true.  An exception can be
>>> reused in multiple places with multiple different root causes.
>>>
>>>> The ArgUtils was introduced
>>>> due to some Serialization issues. None of this is I18N.
>>> So if we look at this through the lens of:
>>> - Core developer productivity
>>> - API user productivity
>>>
>>> Are these helping?
>> No, they are not. Or rather they are not for me, but maybe other
>> people rely on them, I don't know. This is why I consider we are
>> wrong with our current design.
>>
>>>> Yes, our exception design is crap. No, I18N is not the only thing responsible.
>>> Honestly until I started reading NodeJS code (About six months ago) I
>>> was super happy with it, and thought that it was this way, because it's
>>> the only way, so that's what we get.  But now I think we can do much
>>> better.
>>>
>>>>> Grab a coffee and spend a few hours, unless you are
>>>>> obviously fairly new to Java like some of the people posting for help.
>>>>> In this case when the exception occurs, there is going to be a lot of
>>>>> tutoring going on on the users list.
>>>>>
>>>>> Number of Exceptions
>>>>> ===================
>>>>> Before you do actually design a new exception, you should probably see
>>>>> if there is an exception that already fits the category of what you
>>>> I don't agree.
>>> So you don't think someone should look to reuse an existing
>>> exception...because that would lead to a lot more new exceptions?
>> What I mean is that adding messages is not wrong by itself. They can
>> help users. Ensuring we never duplicate one message takes a lot of
>> developers time, which we don't have, and it does not help users.
>>
>>>> The large list in the enum is only a list of
>>>> messages for user display.
>>> The exceptions are one thing and user display is another, and should we
>>> be mixing these?  On the one hand we are saying that CM is a low level
>>> library and that the developer should catch the exceptions, and instruct
>>> the client on how to handle them, and now we are saying that there are
>>> messages for user display...
>> No, you are missing the consequences of the "low level" argument. Large
>> applications can use CM at low level, with some less low level code
>> called by intermediate code called by slightly higher level code, itself
>> used by high level code and so on. I don't say direct developers should
>> catch exceptions directly above CM, as exceptions should be really rare
>> and correspond to ... exceptional situations. This is often handled quite
>> high in the hierarchy for large applications that process large data and
>> should run really fast (I'll put a real life example below). So yes
>> exceptions are caught, but not immediately by the caller, it may be far
>> above and could even be at only one place in some cases, not at
>> thousands of call sites. At this place, yes, sometimes we simply display
>> the error because we don't know what to do. In some other cases, we
>> try to circumvent the problem not by dissecting the exception (so we
>> often don't even look at the embedded ExceptionContext) but by switching
>> to something different. With exceptions, the intermediate level, which
>> is very important, is much simpler. If you don't know how to react to
>> a zero vector at one place because you don't know why this vector is
>> zero, you just don't care, the exception will be automatically forwarded
>> above by the JVM and you will not even see it. You don't have to check
>> for the error just to interrupt the program flow yourself and
>> short-circuit to return immediately. The JVM does it for you. You care
>> about the exception only at two places: where it is thrown (in CM) and
>> where the users know what to do (displaying, stopping or choosing an
>> alternative in the rare cases it is possible). There is no dedicated
>> code in between.
>>
>>
>>> And this is sort of a gray area, so I don't mean to nit pick it so much,
>>> but there is a cleaner way to do this, one that does not mix the user
>>> display concept with the exception / reporting of something different
>>> than a solution.
>> A cleaner way without forcing intermediate code to care about an
>> error it doesn't know what to do with? Even if the intermediate
>> code can only do "if (context.hasError()) return;" it is not clean
>> when multiplied by thousands of occurrences.
>>
>>>> There would really be nothing wrong to add
>>>> even more messages, even if they are close to existing ones.
>>> What if they are exactly the same as the existing ones...because we were
>>> too lazy to scan?  And there's a simple programmatic solution to that,
>>> but do we really want this workflow?
>> Then we will have duplicates. It will cost a few bytes in memory. It
>> will save hours of scarce CM developers time.
>>
>>>> In fact, I
>>>> even think we should avoid trying to reuse messages by merging them all
>>>> in something not meaningful to users (remember, messages are for
>>>> display
>>>> only). Currently we too often get a message like:
>>>>
>>>>    Number is too large, 3 > 2
>>>>
>>>> Nobody understands this at user level.
>>> If instead there's an Enum, tied to the corresponding class, that
>>> represents this condition at the line where it happens, then we can
>>> leave it up to the developer to craft a message that will serve the
>>> client best.
>> Sure. By the way, this is what we have: an enumeration called
>> LocalizedFormats. Call it MessageIdentifier if you prefer and remove
>> the formatting, it would be OK too. The only thing is that in addition
>> to the enumeration we also need some variable parts (like the numbers
>> involved, the dimensions and so on). So this is why we have the Object[]
>> parts too. At the beginning this was all we had and it was
>> simple. Then we lost our minds and created a monstrous beast no one
>> can tame.
>>
>>> And I'm not saying that there should not be messages.  Just that they
>>> should be isolated to a specific context, maintained within the
>>> parameters of that context, and looked up by a single unique key only.
>>
>> Look at OrekitException, there is one OrekitMessages enumeration and one
>> Object[] parts. The ExceptionContext is there too but only for
>> compatibility with CM when some OrekitExceptions link back to CM
>> exceptions. I would be happy to drop it.
>>
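A minimal sketch of the "one enumeration plus Object[] parts" idea described above. All names here (`DemoMessage`, `DemoMathException`) are hypothetical and are not the CM or Orekit classes; the formatting step uses the JDK's `java.text.MessageFormat` only as one possible way to render the two pieces of information.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical message identifiers; each constant owns its own format string.
enum DemoMessage {
    ZERO_NORM_VECTOR("cannot normalize a vector with zero norm (dimension {0})"),
    NOT_CONVERGED("no convergence after {0} iterations");

    private final String pattern;

    DemoMessage(String pattern) {
        this.pattern = pattern;
    }

    String pattern() {
        return pattern;
    }
}

// Hypothetical unchecked exception carrying only the identifier and the variable parts.
class DemoMathException extends RuntimeException {
    private final DemoMessage specifier;
    private final Object[] parts;

    DemoMathException(DemoMessage specifier, Object... parts) {
        this.specifier = specifier;
        this.parts = parts.clone();
    }

    // Callers that want to react programmatically look at the enum, not the text.
    DemoMessage getSpecifier() {
        return specifier;
    }

    Object[] getParts() {
        return parts.clone();
    }

    // Display text is derived on demand; localization could live elsewhere.
    @Override
    public String getMessage() {
        return new MessageFormat(specifier.pattern(), Locale.ROOT).format(parts);
    }
}
```

With this shape, code that wants to branch compares `getSpecifier()` against an enum constant, while code that only displays the error calls `getMessage()`; whether translation belongs in the exception at all is exactly the open question in this thread.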
>> And if it is not clear enough by now, yes you could put the
>> I18N elsewhere if you want, as long as you have these two pieces of
>> information, the enumeration and the parts, available. I think it could
>> remain there because it is simple, stable and does not need attention
>> from developers, but if you insist I18N is evil, just drop it. Don't
>> throw the baby out with the bathwater.
>>
>>>
>>>> The reused message has lost its
>>>> meaning. I would really prefer different messages for different
>>>> cases where the fixed part of the format would at least provide some
>>>> hint to the user.
>>> Yes I think we are saying the same thing here.
>> Fine.
>>
>>>>
>>>>> are doing.  So you start reading. Exception1...nop
>>>>> Exception2...nop...Exception3...Exception999..But I think I'm getting
>>>>> warmer.  OK - Did not find it ... but I'm fairly certain that there is
>>>>> an elegant place for it somewhere in the exception hierarchy...
>>>> I agree. The exception hierarchy is a mess.
>>> And in all fairness it seemed really elegant at first, and we all went
>>> with it.
>>>>>
>>>>> Handling of Exceptions
>>>>> ===================
>>>>> If our app uses several of the commons math classes (That throw
>>>>> exceptions of the same type), and one of those classes throws an
>>>>> exception, what is the app supposed to do?
>>>> It depends on the application.
>>> But it's possible to architect CM so that the case is that the developer
>>> is led down one road only.  And it's like a Hyperloop. They get in, and
>>> 7 minutes later, they are sipping a Mojito in Cancun :).
>>>
>>>> Apache Commons Math is a low level
>>>> library it is used in many different contexts and there is no single
>>>> answer.
>>> Unless there is.
>> An answer that requires all intermediate level to pass the error
>> above is not an answer to me.
>>
>>>>
>>>>> I think most developers would find that question somewhat challenging.
>>>>>   There are numerous strategies.  Catch all exceptions and log what
>>>>> happened, etc.  But what if the requirement is that if an exception is
>>>>> thrown, the organization that receives it has 0 seconds to get to the
>>>>> root cause of it and understand the dynamics. Is this doable? (Yes
>>>>> obviously, but how hard is it...?).
>>>> It is not Apache Commons Math level of decision.
>>> Not right now, but it could be.
>> We are not even capable of doing our own regular math job. Let's do
>> it before attempting to develop the design pattern of the next century.
>>
>>>> We provide the exception
>>>> and users can catch it fast. What users do with it is up to them.
>>> Sure.  That's how it works right now, but this is causing both more CM
>>> core developer overhead and API user overhead than there would be if
>>> exceptions are replaced with Enums and throwing exceptions is replaced
>>> with a callback.
>> A method is developed once in CM and called thousands of times in user
>> code (by several different users, by the way). So we trade the development
>> time for three lines of code on our side against the development time
>> of a few thousand lines of code on our users' side. It's not nice.
>>
>>>>
>>>>>
>>>>>>>> It seems that the reporting interfaces could quickly overwhelm
>>>>>>>> the "actual" code (one type of context per algorithm).
>>>>>>> There would be one type of Observer interface per algorithm. It would
>>>>>>> act on the solution and what are currently exceptions, although
>>>>>>> these
>>>>>>> would be translated into enums.
>>>>>> Unless I'm mistaken, the most common use-case for codes implemented
>>>>>> in a library such as CM is to provide a correct answer or bail out
>>>>>> in a non-equivocal way.
>>>>> Most java developers are used to synchronous coding...call the method
>>>>> get the response...catch the exception if needed.  This is changing
>>>>> with JDK8, and as we evolve and start using lambdas, we become more
>>>>> accustomed to the functional callback style of programming.
>>>>> Personally I want to be able to use an API that gives me what I need
>>>>> when everything works as expected, allows me to resolve unexpected
>>>>> issues with minimal effort, and is as simple, fluid, and lightweight
>>>>> as possible.
>>>>>
>>>>>> It would make the code more involved to handle a minority of
>>>>>> (undefined) cases. [Actual examples would be welcome in order to
>>>>>> focus the discussion.]
>>>>> Rough Outline (I've evolved the concept and moved away from the
>>>>> OptimizationContext in the process of writing):
>>>>>
>>>>> interface LevenbergMarquardtObserver {
>>>>>
>>>>>      public void hola(Solution s);
>>>>>      public void sugarHoneyIceTea(ResultType rt, Diagnostic d);
>>>>> }
>>>>>
>>>>> public class LMObserver implements LevenbergMarquardtObserver {
>>>>>
>>>>>     private Application application;
>>>>>
>>>>>     public LMObserver(Application application) {
>>>>>         this.application = application;
>>>>>     }
>>>>>
>>>>>     public void hola(Solution s) {
>>>>>         application.next(s);
>>>>>     }
>>>>>
>>>>>     public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
>>>>>         if (rt == ResultType.I_GOT_THIS_ONE) {
>>>>>              //I looked at the commons unit tests for this algorithm,
>>>>>              //evaluating the diagnostics that show how this failure
>>>>>              //can occur.  I'm totally fixing this!  Steps aside!
>>>>>         }
>>>>>         else if (rt == ResultType.REALLY_COMPLICATED_STUFF)
>>>>>         {
>>>>>             //We need our best engineers...call India.
>>>>>         }
>>>>>     }
>>>>> }
>>>>>
>>>>>
>>>>> public class Application {
>>>>>      public void run() {
>>>>>          //Note nothing is returned.
>>>>>          LevenbergMarquardtOptimizer.setObserver(new LMObserver(this))
>>>>>              .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>>>>>              .start();
>>>>>      }
>>>>>
>>>>>      public void next(Solution solution) {
>>>>>
>>>>>          //Do cool stuff.
>>>>>
>>>>>      }
>>>>> }
>>>>>
>>>>> Or an asynchronous variation:
>>>>>
>>>>> public class Application {
>>>>>      public void run() {
>>>>>          //This call will not block because async is true
>>>>>          LevenbergMarquardtOptimizer.setAsync(true)
>>>>>              .setObserver(new LMObserver(this))
>>>>>              .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
>>>>>              .start();
>>>>>
>>>>>          //Do more stuff right away.
>>>>>      }
>>>>>
>>>>>      public void next(Solution solution) {
>>>>>          //When the thread running the optimization is done, this
>>>>>          //method is called back.
>>>>>          //Do whatever comes next
>>>>>      }
>>>>> }
>>>>>
>>>>> The above would start the optimization in a separate thread that does
>>>>> not / SHOULD NOT share data with the main thread.
>>>>>
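A rough, self-contained sketch of the observer idea outlined above, with an optional asynchronous mode. Everything here is hypothetical (`DemoResult`, `DemoObserver`, `DemoOptimizer`, and the toy "problem" of taking a square root); it is not an existing CM API. The asynchronous path uses the JDK's `CompletableFuture.runAsync`, and `start` returns the future only so a caller can wait for completion if it wants to.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical outcome codes, standing in for the proposed result enum.
enum DemoResult { SUCCESS, NO_REAL_SOLUTION }

// Hypothetical observer: one callback for a solution, one for a failure.
interface DemoObserver {
    void onSolution(double[] solution);
    void onFailure(DemoResult result, String diagnostic);
}

// Toy stand-in for an optimizer; results are reported through the observer only.
final class DemoOptimizer {
    private final DemoObserver observer;
    private boolean async;

    DemoOptimizer(DemoObserver observer) {
        this.observer = observer;
    }

    DemoOptimizer setAsync(boolean async) {
        this.async = async;
        return this;
    }

    // Runs the toy computation, either on the calling thread or on a pool thread.
    CompletableFuture<Void> start(double target) {
        Runnable work = () -> solve(target);
        if (async) {
            return CompletableFuture.runAsync(work);
        }
        work.run();
        return CompletableFuture.completedFuture(null);
    }

    // Toy problem: find x with x * x == target.
    private void solve(double target) {
        if (target < 0) {
            observer.onFailure(DemoResult.NO_REAL_SOLUTION, "target=" + target);
            return;
        }
        observer.onSolution(new double[] { Math.sqrt(target) });
    }
}
```

Note that in the asynchronous case the observer is invoked on the worker thread, so an observer shared between several runs would need to be thread safe.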
>>>>>>>> The current reporting is based on exceptions, and assumes that
>>>>>>>> if no
>>>>>>>> exception was thrown, then the user's request completed
>>>>>>>> successfully.
>>>>>>> Sure - personally I'd much rather deal with something similar to an
>>>>>>> HTTP status code in a callback, than an exception .  I think the
>>>>>>> code
>>>>>>> is cleaner and the callback makes it more elegant to apply an
>>>>>>> adaptive
>>>>>>> approach to handling the response, like slightly relaxing
>>>>>>> constraints,
>>>>>>> convergence parameters, etc.  Also by getting rid of the exceptions,
>>>>>>> we no longer depend on the I18N layer that they are tied to and now
>>>>>>> the messages can be more informative, since they target the root
>>>>>>> cause.  The observer can also run in the 'main' thread while the
>>>>>>> optimization can run asynchronously.  Also WRT JDK9 and modules,
>>>>>>> losing the exceptions would mean one less dependency when the
>>>>>>> library
>>>>>>> is split up into JDK9 modules...which would be more in line with this
>>>>>>> philosophy:
>>>>>>> https://github.com/substack/browserify-handbook#module-philosophy
>>>>>> I'm not sure I fully understood the philosophy from the text in this
>>>>>> short paragraph.
>>>>>> But I do not agree with the idea that the possibility to quickly find
>>>>>> some code is more important than standards and best practices.
>>>>> If you go to npmjs.org and type in Neural Network you will get 56
>>>>> results all linked to github repositories.
>>>>>
>>>>> In addition there's meta data indicating number of downloads in the
>>>>> last day, last month, etc.  Try typing in cosine.  Odds are you will
>>>>> find a package that does just what you want and nothing else. This is
>>>>> very underwhelming and refreshing in terms of cloning off of github
>>>>> and getting familiar with tests etc.  Also eye opening.  How many of us
>>>>> knew that we could do that much stuff with cosine! :).
>>>>>
>>>>>>>> I totally agree that in some circumstances, more information on the
>>>>>>>> inner working of an algorithm would be quite useful.
>>>>>>> ... Algorithm iterations become unit testable.
>>>>>>>> But I don't see the point in devoting resources to reinvent the
>>>>>>>> wheel:
>>>>>>> You mean pimping the wheel?  Big pimpin.
>>>>>> I think that logging statements are easy to add, not disruptive at
>>>>>> all,
>>>>>> and come in handy to understand a code's unexpected behaviour.
>>>>>> Assuming that a "logging" feature is useful, it can be added *now*
>>>>>> using
>>>>>> a dependency towards a weight-less (!) framework such as "slf4j".
>>>>>> IMO, it would be a waste of time to implement a new communication
>>>>>> layer
>>>>>> that can do that, and more, if it would be used for logging only
>>>>>> in 99%
>>>>>> of the cases.
>>>>> SLF4J is used by almost every other framework, so why not use it?
>>>>> Logging and the diagnostic could be used together.  The primary
>>>>> purpose of the diagnostic though is to collect data that will be
>>>>> useful in `sugarHoneyIceTea`.
>>>>>
>>>>>>>> I longed several times for the use of a logging library.
>>>>>>>> The only show-stopper has been the informal "no-dependency"
>>>>>>>> policy...
>>>>>>> JDK9 Jigsaw should solve dependency hell, so the less coupling
>>>>>>> between commons math classes the better.
>>>>>> I wouldn't call "coupling" the dependency towards exception classes:
>>>>>> they are little utilities that can make sense in various parts of the
>>>>>> library.
>>>>> If for example the Simplex solver is broken off into its own module,
>>>>> then it has to be coupled to the exceptions, unless it is exception
>>>>> free.
>>>>>
>>>>>> [Unless one wants to embark on yet another discussion about
>>>>>> exceptions;
>>>>>> whether there should be one class for each of the "messages" that
>>>>>> exist
>>>>>> in "LocalizedFormats"; whether localization should be done in CM;
>>>>>> etc.]
>>>>> I think it would be best to just eliminate the exceptions.
>>>> NO! A big no!
>>> Before the NO is that big, how about examining just a few simple cases
>>> where it's done?  You might come around to liking it.  Also it does not
>>> need to be all or nothing.  Some modules could be exception free,
>>> whereas others, could throw exceptions because it might be the simplest
>>> thing to do and what people are used to.
>>>> Apache Commons Math is a very low level library. There are use cases
>>>> where you have huge code that relies on math almost everywhere.
>>> True dat. But it's quite possible that some of the math that is being
>>> used does not actually need to throw the exception.  There are multiple
>>> exception categories:
>>>
>>> - Exceptions thrown by invalid input parameters
>>> - Exceptions that signal that something completely unexpected
>>> happened (Even we are like WTF?)
>>> - Exceptions that could happen in certain edge conditions that we
>>> understand and know how to deal with
>>>
>>> The first one we can eliminate by supplying a self validating object to
>>> the routine.  The next two are the ones where it might really be helpful
>>> if they are communicated as Enums via a callback.
>>>
>>> It could be reviewed on a case by case basis, in a process that
>>> considers moving to JDK9 modules.
>>>
>>>> Look at
>>>> the Orekit library for example. We use Vector3D everywhere, we use
>>>> Rotation
>>>> everywhere, we use ode in many places, we use some linear algebra,
>>>> we use
>>>> optimizers, we use a few statistics, we use root solvers, we use
>>>> derivatives,
>>>> we use BSP
>>> No problem just get rid of the 3D rotation, kill the Vector3D, forget
>>> statistics (Way too complicated!), and it's all good :). Kidding
>>> obviously, but this does highlight what I'm talking about. If there are
>>> tons of CM classes being utilized within a single method, and all of
>>> them can throw generic exceptions that cut across multiple classes, the
>>> developer has to decode the exception, causing her to waste time.
>>>
>>>
>>>> ... Some of these uses look like a large scale call/return pattern
>>>> where you could ask users to check the result afterwards, but many,
>>>> many of
>>>> the calls are much lower levels and you have literally several
>>>> thousands of
>>>> calls to math.
>>> That's all fine.  CM will still have the exact same workings - it will
>>> just be more efficient.
>>>
>>>> Just think about forcing user to check a vector can be normalized
>>>> (i.e. has not 0 norm) everywhere a vector is used. We would end up with
>>>> something like:
>>>>    Rotation r = new Rotation(a, alpha);
>>>>    if (!r.isValid()) {
>>>>      // a vector was null
>>>>      return error;
>>>>   }
>>> In this case the client developer could have made sure the vector was
>>> not null and valid prior to passing it to a routine.  It's possible that
>>> CM will be more lightweight, efficient, and friendly if API users are
>>> told that it's their responsibility to validate parameters.  So:
>>>
>>> RotationContext r = new RotationContext(a, alpha);
>>> if (!r.isValid()) {
>>>      // Dag Nab It!!!
>>> }
>>> else {
>>>     //Rotate away
>>>     Rotation rotation = new Rotation(r);
>>> }
>>>
>>> Or if we know that the parameters are valid (Mass validation):
>>> Rotation rotation = new Rotation(a, alpha);
>>>> Repeat the above 4325 times in your code ... welcome back to the 80's.
>>> We only have to repeat the validation check 4325 times if there is
>>> actually a possibility that the vector is invalid.
>> You didn't get what I meant. Exceptions are for, well, exceptional cases.
>> These cases are often not predictable and are buried in highly
>> complex algorithms. So you cannot know for sure you will not have a
>> zero vector there. This vector itself is computed from another complex
>> algorithm, itself fed by other complex algorithms and so on. There are
>> no clear lines: here vectors are known to be valid and here they are
>> unknown and should be checked by users beforehand because CM will never
>> tell the user something bad occurs when it occurs. Shit happens and
>> we need to help users, not tell them "we have warned you, you are
>> a bad guy because you did not check your data, but we are good guys
>> even though we did not tell you when we saw the bad data, your fault".
>>
>> You cannot ask users to cripple their code with either a posteriori
>> error checking or a priori validation at *all* call sites for
>> exceptional conditions that occur once every million calls. And yes, one
>> problem every million calls is a big problem when you handle huge data.
>> A recent project I worked on was linked to the Sentinel 2 Earth
>> observation satellite. It generates terabytes of data and every single
>> pixel computation needs a lot of geometric transforms (and rotations
>> and others). For now we have:
>>
>>   Rotation r = new Rotation(a, alpha)
>>
>> or similar things for dot products, interpolators, solvers, optimizers
>> and so on in highly critical code that needs to run very very fast.
>> Somewhere above (really lots of layers above), we catch exceptions when
>> they occur, very rarely (say once every few tens of millions of pixels,
>> i.e. a few times each second or minute) and simply try another much
>> slower algorithm for this single pixel, then come back to the regular
>> fast algorithm for the next few hundreds of millions of pixels that are
>> waiting to be processed. Basically exceptions allow us to break out of
>> the deep layers of algorithms immediately. Continuous pre or post checks
>> at every call site are impractical and would be much slower. This is
>> akin to malloc/free versus garbage collecting.
>>
>>
>> So after all, this long mail is centered on one idea:
>>
>>   Exceptions allow automatic code rerouting without user handling
>>   at lots and lots of intermediate levels. Callbacks (or pre-validation
>>   which is not always possible) force user handling at all intermediate
>>   levels.
>>
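The rerouting argument above can be illustrated with a small sketch. The classes and the "pixel" computation are hypothetical stand-ins (`DemoZeroNormException`, `DemoPipeline`), not code from CM, Orekit, or the Sentinel 2 project; the point is only that the intermediate methods contain no error handling, and the single `catch` sits at the top where a fallback is known.

```java
// Hypothetical stand-in for a low-level library failure (unchecked exception).
class DemoZeroNormException extends RuntimeException {
    DemoZeroNormException() {
        super("zero norm");
    }
}

final class DemoPipeline {
    // Lowest level: throws only in the rare degenerate case.
    static double[] normalize(double x, double y) {
        double norm = Math.hypot(x, y);
        if (norm == 0.0) {
            throw new DemoZeroNormException();
        }
        return new double[] { x / norm, y / norm };
    }

    // Intermediate level: no error handling; the exception simply passes through.
    static double heading(double x, double y) {
        double[] u = normalize(x, y);
        return Math.atan2(u[1], u[0]);
    }

    // Another intermediate level, equally unaware of the failure mode.
    static double fastPixel(double x, double y) {
        return Math.toDegrees(heading(x, y));
    }

    // Hypothetical slower fallback that tolerates the degenerate input.
    static double slowPixel(double x, double y) {
        return 0.0;
    }

    // Top level: the single place where the exception is caught.
    static double processPixel(double x, double y) {
        try {
            return fastPixel(x, y);
        } catch (DemoZeroNormException e) {
            return slowPixel(x, y);
        }
    }
}
```

A callback or status-code design would instead need `heading` and `fastPixel` to check for and forward the failure themselves, which is the per-call-site cost being objected to.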
>> best regards,
>> Luc
>>
>>>> Exceptions are meant for, well, exceptional situations.
>>> These are exceptional times :)
>>>> They keep people
>>>> from crippling the calling code with if(error) statements.
>>> We need if(error) at some point.  I agree that if(error) should be
>>> minimized.
>>>
>>>> They ensure (and this
>>>> is the most important part) that as soon as the error is detected at
>>>> very
>>>> low level it will be identified and reported back to the upper level
>>>> where
>>>> it is caught and displayed, without forcing *all* the intermediate
>>>> levels
>>>> to handle it.
>>> Yes but this causes the "Root cause analysis deciphering effect" I've
>>> been talking about.
>>>
>>>> When you have complex code with complex algorithms and several
>>>> nested levels of calls in different libraries, maintained by different
>>>> teams
>>>> and using tens of thousands of calls to math, you don't want a
>>>> call/return/check error
>>>> type of programming like we used to do in the 80's.
>>> Ahh the Commodore 64 days...I miss those days...Anyways - no one wants
>>> to be hung up on some clown from the 80s.  Right now we are hung up on a
>>> clown from the 90s though!  We need a Kim Kardashian API. :)
>>>
>>> Cheers,
>>> - Ole
>>>
>>>
>>>> best regards,
>>>> Luc
>>>>
>>>>>>> Anyways I'm obviously
>>>>>>> interested in playing with this stuff, so when I get something up
>>>>>>> into
>>>>>>> a repository I'll do a callback :).
>>>>>> If you are interested in big overhauls, there is one that gathered
>>>>>> relative consensus: rewrite the algorithms in a
>>>>>> "multithread-friendly"
>>>>>> way.
>>>>> I think that's a tall order that will take us into JDK88 :). But
>>>>> using callbacks and making potentially long running computations
>>>>> asynchronous could be a middle ground that would allow simple multi
>>>>> threaded use without fiddling around under the hood...
>>>>>
>>>>>> Some ideas were floated (cf. ML archive) but no implementation or
>>>>>> experiment...  Perhaps with a well-defined goal such as performance
>>>>>> improvement, your design suggestions will become clearer to more
>>>>>> people.
>>>>>>
>>>>>> AFAIK, only the classes in the "o.a.c.m.neuralnet" package are
>>>>>> currently
>>>>>> ready to be used with the "java.util.concurrent" framework.
>>>>> FWIU Neural Nets are a great fit for concurrency.  I think for the
>>>>> others we will end up having discussions around how users would
>>>>> control the number of threads, etc. again that makes some of us
>>>>> nervous.  An asynchronous operation that runs in one separate thread
>>>>> is easier to reason about.  If we want to test 10 neural net
>>>>> configurations, and we have 10 cores, then we can start each by itself
>>>>> by doing something like:
>>>>>
>>>>> Network.setAsync(true).addNeurons().connectNeurons().addObserver(observer).start();
>>>>>
>>>>>
>>>>> //Now do 10 more
>>>>> //If the observer is shared then notifications should be thread safe.
>>>>>
>>>>> Cheers,
>>>>> - Ole
>>>>>
>>>>> P.S. Dang that was a long email.  If I write one more of these, ban
>>>>> me :)
>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>> Gilles
>>>>>>
>>>>>>> Cheers,
>>>>>>> Ole
>>>>>>>
>>>>>>