[jira] [Commented] (MATH-891) SpearmansCorrelation fails when using NaturalRanking together with NaNStrategy.REMOVED

Gary D. Gregory (Jira)

    [ https://issues.apache.org/jira/browse/MATH-891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13615500#comment-13615500 ]

Phil Steitz commented on MATH-891:
----------------------------------

I am OK with committing this patch, but let's keep the issue open, or open another one for handling missing data in multivariate stats.  I think it is OK to leave the RankingAlgorithm interface and implementations as they are; they are doing what they should be doing by contract.  I think multivariate stats should simply not allow the REMOVED NaN strategy (i.e. throw in this case).  Once we agree on how to implement and represent missing data strategies, at least for this class, the Spearman's constructor should then be modified to accept a missing data strategy.

I think it is better to commit the workaround now, since behavior is currently broken, but note in the javadoc and release notes that as of 4.0 the constructor will throw on the REMOVED NaNStrategy and that NaNs should not be used to represent missing data.  Practical advice to users is to preprocess data to remove, replace, or impute missing values in preparation for this.
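
For users who want to follow the preprocessing advice, here is a minimal sketch of pairwise deletion in plain Java. The class name PairwiseNaNFilter and the method removeNaNPairs are hypothetical helpers written for illustration; they are not part of Commons Math, and this is not the attached patch.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Commons Math): pairwise deletion of NaN entries. */
public final class PairwiseNaNFilter {

    private PairwiseNaNFilter() {}

    /**
     * Returns copies of x and y with every index dropped at which
     * either array holds NaN, so the two results stay aligned.
     */
    public static double[][] removeNaNPairs(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("Arrays must have equal length");
        }
        double[] fx = new double[x.length];
        double[] fy = new double[y.length];
        int n = 0;
        for (int i = 0; i < x.length; i++) {
            // Keep the pair only when both observations are present.
            if (!Double.isNaN(x[i]) && !Double.isNaN(y[i])) {
                fx[n] = x[i];
                fy[n] = y[i];
                n++;
            }
        }
        // Trim the buffers to the number of pairs actually kept.
        return new double[][] { Arrays.copyOf(fx, n), Arrays.copyOf(fy, n) };
    }

    public static void main(String[] args) {
        double[] x = { 1.0, Double.NaN, 3.0, 4.0 };
        double[] y = { 2.0, 5.0, Double.NaN, 8.0 };
        double[][] cleaned = removeNaNPairs(x, y);
        System.out.println(Arrays.toString(cleaned[0])); // [1.0, 4.0]
        System.out.println(Arrays.toString(cleaned[1])); // [2.0, 8.0]
    }
}
```

The two cleaned arrays have equal length and could then be passed to SpearmansCorrelation.correlation(double[], double[]) without relying on NaNStrategy.REMOVED. Whether dropping pairs, replacing values, or imputing is appropriate depends on the data.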
               

> SpearmansCorrelation fails when using NaturalRanking together with NaNStrategy.REMOVED
> --------------------------------------------------------------------------------------
>
>                 Key: MATH-891
>                 URL: https://issues.apache.org/jira/browse/MATH-891
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.0
>            Reporter: Thomas Neidhart
>         Attachments: MATH-891.patch
>
>
> As reported by Martin Rosellen on the users mailing list:
> Using a NaturalRanking with a REMOVED NaNStrategy can result in an exception when NaNs are contained in the input arrays.
> The current implementation just removes the NaN values where they occur, without taking care to remove the corresponding values in the other array.
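
The mismatch described in the report can be illustrated with a small plain-Java sketch. It only imitates the described behavior (dropping NaN values from each array independently); the class IndependentNaNRemovalDemo and the method dropNaNs are hypothetical and are not the Commons Math implementation.

```java
import java.util.Arrays;

/** Hypothetical demo: independent NaN removal leaves the two arrays misaligned. */
public final class IndependentNaNRemovalDemo {

    private IndependentNaNRemovalDemo() {}

    /** Drops NaN values from a single array, ignoring any paired array. */
    public static double[] dropNaNs(double[] data) {
        return Arrays.stream(data).filter(v -> !Double.isNaN(v)).toArray();
    }

    public static void main(String[] args) {
        double[] x = { 1.0, Double.NaN, 3.0, 4.0 };
        double[] y = { 2.0, 5.0, 6.0, 8.0 };

        double[] fx = dropNaNs(x); // 3 values remain
        double[] fy = dropNaNs(y); // all 4 values remain

        // The lengths now differ, so a correlation that requires
        // equal-length inputs cannot be computed on these arrays.
        System.out.println(fx.length + " vs " + fy.length); // 3 vs 4
    }
}
```

Any computation that requires paired observations will reject, or silently misalign, arrays filtered this way.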
