[jira] Created: (MATH-350) Regression in package "regression"

[jira] Created: (MATH-350) Regression in package "regression"

Gary D. Gregory (Jira)
Regression in package "regression"
----------------------------------

                 Key: MATH-350
                 URL: https://issues.apache.org/jira/browse/MATH-350
             Project: Commons Math
          Issue Type: Bug
    Affects Versions: 2.0
            Reporter: Gilles


There is a regression in class "OLSMultipleLinearRegression".

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (MATH-350) Regression in package "regression"

Gary D. Gregory (Jira)

     [ https://issues.apache.org/jira/browse/MATH-350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gilles updated MATH-350:
------------------------

    Attachment: OLSRegressionCompare20To21Test.java

Unit test from the user who discovered the issue. Copying it into the
  src/test/java/org/apache/commons/math/stat/regression
directory of the CM source tree and running the tests produces the following output:

{noformat}
Running org.apache.commons.math.stat.regression.OLSRegressionCompare20To21Test
Test parameters
Test variance
Expected 2.7104787386850834E-26 Actual 1.091646231028372E-25
Constructed a polynomial of degree 2
System will model various degrees of polynomial and use an Ftest to find the best model


Testing trend of degree 1
OLSRegressionCompare20To21Test: model statistic: 1545.9235701200896 threshold 3.9381110982233225
Model with degree 1 is better - keep testing higher order models

Testing trend of degree 2
OLSRegressionCompare20To21Test: model statistic: 8.9522618877706E31 threshold 3.939126144339758
Model with degree 2 is better - keep testing higher order models

Testing trend of degree 3
OLSRegressionCompare20To21Test: model statistic: 109.12572361790322 threshold 3.940162735266305
Model with degree 3 is better - keep testing higher order models

Testing trend of degree 4
OLSRegressionCompare20To21Test: model statistic: -40.823878611285174 threshold 3.941221564253479
Model with degree 4 is rejected - keeping simple model and exiting
Best model found with degree = 3
Coeff 0 = 3.000000000000148
Coeff 1 = 1.2000000000000006
Coeff 2 = 0.3399999999999999
Coeff 3 = 1.0632735348660702E-18
Test residuals
Test standard errors
Test hat matrix
Test parameter variance
{noformat}

So, although the last coefficient is nearly zero, it is conceptually wrong to return a degree-3 polynomial fit when the input data were generated from a polynomial of degree 2.
[CM 2.0 behaved properly in this respect.]
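
For readers following the "model statistic" lines in the output: the source of the attached test is not reproduced in this thread, so the following is only a sketch under the assumption that it uses the usual partial F statistic for nested least-squares models. The class name, method name, and the sample size of 100 are hypothetical. In exact arithmetic the residual sum of squares (RSS) of the larger nested model can never exceed that of the smaller one, so a negative statistic (as for degree 4 above) can only come from rounding.

```java
/** Sketch of a partial F-test between nested least-squares models (hypothetical names). */
final class NestedModelFTest {

    /**
     * F statistic comparing a simpler model against a more complex nested one.
     * rssSimple, rssComplex: residual sums of squares of the two fits;
     * pSimple, pComplex: number of fitted parameters; n: number of observations.
     */
    static double fStatistic(double rssSimple, int pSimple,
                             double rssComplex, int pComplex, int n) {
        double numerator = (rssSimple - rssComplex) / (pComplex - pSimple);
        double denominator = rssComplex / (n - pComplex);
        return numerator / denominator;
    }

    public static void main(String[] args) {
        int n = 100; // assumed sample size, not taken from the attached test

        // The extra term removes real structure: a large, meaningful statistic.
        System.out.println(fStatistic(50.0, 2, 10.0, 3, n));

        // Both fits are exact up to rounding: RSS values near 1e-26 differ only
        // by noise, yet the ratio can still exceed a threshold of about 3.94 ...
        System.out.println(fStatistic(2.7e-26, 3, 1.2e-26, 4, n));

        // ... or turn negative when rounding makes the larger model's RSS bigger.
        System.out.println(fStatistic(1.2e-26, 4, 1.7e-26, 5, n));
    }
}
```

If this reading is right, then with a residual variance near 1e-26, as reported above, the choice between degree 2 and degree 3 may rest on rounding error rather than on the data.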



[jira] Issue Comment Edited: (MATH-350) Regression in package "regression"

Gary D. Gregory (Jira)

    [ https://issues.apache.org/jira/browse/MATH-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844022#action_12844022 ]

Gilles edited comment on MATH-350 at 3/11/10 11:18 AM:
-------------------------------------------------------

Unit test from the user who discovered the issue. Copying it into the
  src/test/java/org/apache/commons/math/stat/regression
directory of the CM source tree and running the tests produces the following output:

{noformat}
Running org.apache.commons.math.stat.regression.OLSRegressionCompare20To21Test
Test parameters
Test variance
Expected 2.7104787386850834E-26 Actual 1.091646231028372E-25
Constructed a polynomial of degree 2
System will model various degrees of polynomial and use an Ftest to find the best model


Testing trend of degree 1
OLSRegressionCompare20To21Test: model statistic: 1545.9235701200896 threshold 3.9381110982233225
Model with degree 1 is better - keep testing higher order models

Testing trend of degree 2
OLSRegressionCompare20To21Test: model statistic: 8.9522618877706E31 threshold 3.939126144339758
Model with degree 2 is better - keep testing higher order models

Testing trend of degree 3
OLSRegressionCompare20To21Test: model statistic: 109.12572361790322 threshold 3.940162735266305
Model with degree 3 is better - keep testing higher order models

Testing trend of degree 4
OLSRegressionCompare20To21Test: model statistic: -40.823878611285174 threshold 3.941221564253479
Model with degree 4 is rejected - keeping simple model and exiting
Best model found with degree = 3
Coeff 0 = 3.000000000000148
Coeff 1 = 1.2000000000000006
Coeff 2 = 0.3399999999999999
Coeff 3 = 1.0632735348660702E-18
Test residuals
Test standard errors
Test hat matrix
Test parameter variance
{noformat}

So, although the last coefficient is nearly zero, it is conceptually wrong to return a degree-3 polynomial fit when the input data were generated from a polynomial of degree 2.
[CM 2.0 behaved properly in this respect.]




[jira] Commented: (MATH-350) Regression in package "regression"

Gary D. Gregory (Jira)

    [ https://issues.apache.org/jira/browse/MATH-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844027#action_12844027 ]

Gilles commented on MATH-350:
-----------------------------

Examining the history:
{noformat}
r825925 | luc | 2009-10-16 17:11:47 +0200 (Fri, 16 Oct 2009) | 1 line

replaced custom linear solve computation by use of the linear package features
{noformat}

The diff shows:
{noformat}
-        return solveUpperTriangular(qr.getR(), qr.getQ().transpose().operate(Y));
+        return qr.getSolver().solve(Y);
{noformat}
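
For context, both lines of the diff compute the same least-squares solution in exact arithmetic: with the design matrix factored as X = QR, the coefficients b solve the triangular system R b = Q^T y. The sketch below is not the Commons Math code. It swaps in modified Gram-Schmidt for the Householder-based decomposition of the linear package to stay short, the class name and the fit method are hypothetical, and the body of solveUpperTriangular is an assumption (only its name appears in the diff). The test polynomial 3 + 1.2 x + 0.34 x^2 is taken from the coefficients printed by the attached test.

```java
/** Sketch: least squares via thin QR, then back substitution on R b = Q^T y. */
final class QrLeastSquaresSketch {

    /** Solves the upper-triangular system r * x = c by back substitution. */
    static double[] solveUpperTriangular(double[][] r, double[] c) {
        int n = c.length;
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double sum = c[i];
            for (int j = i + 1; j < n; j++) {
                sum -= r[i][j] * x[j];
            }
            x[i] = sum / r[i][i];
        }
        return x;
    }

    /** Returns b minimizing ||a b - y||, using modified Gram-Schmidt QR. */
    static double[] fit(double[][] a, double[] y) {
        int m = a.length;
        int n = a[0].length;
        double[][] q = new double[m][];
        double[][] r = new double[n][n];
        for (int i = 0; i < m; i++) {
            q[i] = a[i].clone(); // q starts as a copy of a
        }
        for (int k = 0; k < n; k++) {
            // Normalize column k.
            double norm = 0.0;
            for (int i = 0; i < m; i++) {
                norm += q[i][k] * q[i][k];
            }
            r[k][k] = Math.sqrt(norm);
            for (int i = 0; i < m; i++) {
                q[i][k] /= r[k][k];
            }
            // Remove the component along column k from the remaining columns.
            for (int j = k + 1; j < n; j++) {
                double dot = 0.0;
                for (int i = 0; i < m; i++) {
                    dot += q[i][k] * q[i][j];
                }
                r[k][j] = dot;
                for (int i = 0; i < m; i++) {
                    q[i][j] -= dot * q[i][k];
                }
            }
        }
        // Form Q^T y, then solve the triangular system.
        double[] qty = new double[n];
        for (int k = 0; k < n; k++) {
            for (int i = 0; i < m; i++) {
                qty[k] += q[i][k] * y[i];
            }
        }
        return solveUpperTriangular(r, qty);
    }

    public static void main(String[] args) {
        int m = 20;
        double[][] a = new double[m][3];
        double[] y = new double[m];
        for (int i = 0; i < m; i++) {
            double x = i * 0.5;
            a[i][0] = 1.0;
            a[i][1] = x;
            a[i][2] = x * x;
            y[i] = 3.0 + 1.2 * x + 0.34 * x * x;
        }
        double[] b = fit(a, y);
        for (int k = 0; k < b.length; k++) {
            System.out.println("Coeff " + k + " = " + b[k]);
        }
    }
}
```

Two such implementations can agree mathematically and still differ in the last few digits, because the order of floating-point operations differs.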



[jira] Closed: (MATH-350) Regression in package "regression"

Gary D. Gregory (Jira)

     [ https://issues.apache.org/jira/browse/MATH-350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gilles closed MATH-350.
-----------------------

    Resolution: Incomplete

The initial reporter identified differences (of the order of 1e-13) between the values computed by the current code and those computed by 2.0. However, we cannot assert which ones are "better" at this point.
The provided test does not clearly point to any deficiency in the CM source code.


