Better implementation for the gamma distribution implementation
--------------------------------------------------------------- Key: MATH-753 URL: https://issues.apache.org/jira/browse/MATH-753 Project: Commons Math Issue Type: Improvement Affects Versions: 2.2 Reporter: Francesco Strino Priority: Minor Fix For: 2.2 The way the density of the gamma distribution function is estimated can be improved. It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. It would be sufficient to change line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows (I write it down here because I don't know how to submit a patch with JIRA): @Override public double density(double x) { if (x < 0) return 0; //return Math.pow(x / beta, alpha - 1) / beta * Math.exp(-x / beta) / Math.exp(Gamma.logGamma(alpha)); //unstable for large values return Math.exp(Math.log(x)*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francesco Strino updated MATH-753: ---------------------------------- Description: The way the density of the gamma distribution function is estimated can be improved. It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); was: The way the density of the gamma distribution function is estimated can be improved. It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: return Math.exp(Math.log(x)*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); Summary: Better implementation for the gamma distribution density function (was: Better implementation for the gamma distribution implementation) > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2 > Reporter: Francesco Strino > Priority: Minor > Labels: improvement, stability > Fix For: 2.2 > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francesco Strino updated MATH-753: ---------------------------------- Description: The way the density of the gamma distribution function is estimated can be improved. It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: return Math.exp(Math.log(x)*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); was: The way the density of the gamma distribution function is estimated can be improved. It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. It would be sufficient to change line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows (I write it down here because I don't know how to submit a patch with JIRA): @Override public double density(double x) { if (x < 0) return 0; //return Math.pow(x / beta, alpha - 1) / beta * Math.exp(-x / beta) / Math.exp(Gamma.logGamma(alpha)); //unstable for large values return Math.exp(Math.log(x)*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); } > Better implementation for the gamma distribution implementation > --------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2 > Reporter: Francesco Strino > Priority: Minor > Labels: improvement, stability > Fix For: 2.2 > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log(x)*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francesco Strino updated MATH-753: ---------------------------------- Description: The way the density of the gamma distribution function is estimated can be improved. It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. was: The way the density of the gamma distribution function is estimated can be improved. It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2 > Reporter: Francesco Strino > Priority: Minor > Labels: improvement, stability > Fix For: 2.2 > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216359#comment-13216359 ] Sébastien Brisard commented on MATH-753: ---------------------------------------- Thanks Francesco for reporting this issue. The fix you propose is indeed very simple. However, we should make sure that there is no loss of accuracy for smaller values of alpha and beta. I'd like to know what others think. Also, we are trying to stabilize version 3.0, so we do not have much time to investigate this right now. Two options would be possible # wait until 3.1 to include this fix, # include this fix right now, but keep the ticket open, in order to thoroughly investigate accurracy issues in 3.1. > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2 > Reporter: Francesco Strino > Priority: Minor > Labels: improvement, stability > Fix For: 2.2 > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216391#comment-13216391 ] Gilles commented on MATH-753: ----------------------------- bq. wait until 3.1 to include this fix, +1 bq. include this fix right now, but keep the ticket open, in order to thoroughly investigate accurracy issues in 3.1. -1 Let's freeze the "trunk" as proposed by Luc. This is an enhancement that can wait a few days, the more so that you suspect that there could be side-effects. > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2 > Reporter: Francesco Strino > Priority: Minor > Labels: improvement, stability > Fix For: 2.2 > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216529#comment-13216529 ] Sébastien Brisard commented on MATH-753: ---------------------------------------- This suits me fine. The only concern I have is that using exp(log(x)) in place of x might incur a loss of accuracy. Maybe we should use this substitution only when it is necessary (for large values of alpha and beta). This would require a little bit of investigation to find the appropriate thresholds. > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2 > Reporter: Francesco Strino > Priority: Minor > Labels: improvement, stability > Fix For: 2.2 > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216529#comment-13216529 ] Sébastien Brisard edited comment on MATH-753 at 2/25/12 8:19 PM: ----------------------------------------------------------------- This suits me fine. The only concern I have is that using exp(log( x )) in place of x might incur a loss of accuracy. Maybe we should use this substitution only when it is necessary (for large values of alpha and beta). This would require a little bit of investigation to find the appropriate thresholds. was (Author: celestin): This suits me fine. The only concern I have is that using exp(log(x)) in place of x might incur a loss of accuracy. Maybe we should use this substitution only when it is necessary (for large values of alpha and beta). This would require a little bit of investigation to find the appropriate thresholds. > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2 > Reporter: Francesco Strino > Priority: Minor > Labels: improvement, stability > Fix For: 2.2 > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sébastien Brisard updated MATH-753: ----------------------------------- Affects Version/s: 3.1 3.0 Fix Version/s: (was: 2.2) 3.1 > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2, 3.0, 3.1 > Reporter: Francesco Strino > Priority: Minor > Labels: improvement, stability > Fix For: 3.1 > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216716#comment-13216716 ] Gilles commented on MATH-753: ----------------------------- As per my message on the "dev" ML, if you find out how to correctly fix this, no problem to apply it. > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2, 3.0, 3.1 > Reporter: Francesco Strino > Priority: Minor > Labels: improvement, stability > Fix For: 3.1 > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sébastien Brisard updated MATH-753: ----------------------------------- Attachment: lanczos.patch {{lanczos.patch}} is the patch discussed in the the {{dev}} mailing-list: {quote} As I initially feared, what was proposed in the JIRA ticket leads to high floating point errors. I adapted a method proposed in [BOOST|http://www.boost.org/doc/libs/1_35_0/libs/math/doc/sf_and_dist/html/math_toolkit/special/sf_gamma/igamma.html] (formula (15)) with acceptable errors. Meanwhile, I've also managed to improve the accuracy of the computation of the density for the range of parameters where the previous implementation was already working: in this range, the accuracy _was_ about 300 ulps, and is now 1-2 ulps! {color:red}Note: I might have been too optimistic, here. There is a significant improvement, though{color}. I think this improvement is worth implementing. The downside is that I need to expose the Lanczos implementation internally used by {{o.a.c.m3.special.Gamma}}. This approximation is so standard that I don't see it as a problem. I don't think that it reveals too much of the Gamma internals, since the javadoc of {{Gamma.logGamma}} states that it uses this approximation. So what I propose is the addition of two methods in {{Gamma}}: * {{double getLanczosG()}}: returns the {{g}} constant * {{double lanczos(double)}}: returns the value of the Lanczos sum. {quote} > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2, 3.0, 3.1 > Reporter: Francesco Strino > Assignee: Sébastien Brisard > Priority: Minor > Labels: improvement, stability > Fix For: 3.1 > > Attachments: lanczos.patch > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sébastien Brisard updated MATH-753: ----------------------------------- Attachment: (was: lanczos.patch) > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2, 3.0, 3.1 > Reporter: Francesco Strino > Assignee: Sébastien Brisard > Priority: Minor > Labels: improvement, stability > Fix For: 3.1 > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sébastien Brisard updated MATH-753: ----------------------------------- Attachment: lanczos.patch Patch against {{r1336059}}. > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2, 3.0, 3.1 > Reporter: Francesco Strino > Assignee: Sébastien Brisard > Priority: Minor > Labels: improvement, stability > Fix For: 3.1 > > Attachments: lanczos.patch > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271320#comment-13271320 ] Gilles commented on MATH-753: ----------------------------- I noticed that the code contains this line: {noformat} private static final double DEFAULT_EPSILON = 10e-15; {noformat} This is an unusual way to write 1e-14. Or, more probably, 1e-15 was intended. > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2, 3.0, 3.1 > Reporter: Francesco Strino > Assignee: Sébastien Brisard > Priority: Minor > Labels: improvement, stability > Fix For: 3.1 > > Attachments: lanczos.patch > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271328#comment-13271328 ] Sébastien Brisard commented on MATH-753: ---------------------------------------- Well spotted! I will investigate... > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2, 3.0, 3.1 > Reporter: Francesco Strino > Assignee: Sébastien Brisard > Priority: Minor > Labels: improvement, stability > Fix For: 3.1 > > Attachments: lanczos.patch > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272089#comment-13272089 ] Sébastien Brisard commented on MATH-753: ---------------------------------------- Patch {{lanczos.patch}} (09/May/12 10:38) committed in {{r1336483}}. > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2, 3.0, 3.1 > Reporter: Francesco Strino > Assignee: Sébastien Brisard > Priority: Minor > Labels: improvement, stability > Fix For: 3.1 > > Attachments: lanczos.patch > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sébastien Brisard updated MATH-753: ----------------------------------- Attachment: MATH-753.tar.gz The attached file {{MATH-753.tar.gz}} compares the accuracy of four different calculation methods for the density. * method 1 is the currently implemented method (leading to overflows) * method 2 is the method proposed by Francesco (exp(log(...)) * method 3 is based on BOOST * method 4 is the recommended method, partly based on method 3. The provided files check the accuracy of the methods for several values of the shape parameter (scale is not really relevant to this problem: used 1). Reference values where computed with Maxima (with a precision of 64 digits, for exactly representable double values of the argument. The report shows statistics on the error in ulps. For each method, two statistics are computed * the first one corresponds to the case where the currently implemented method does not lead to overflow. Even in this case, the proposed method (4) achieves better accuracy * the second one corresponds to the case where the currently implemented method does lead to overflow. In this case, method 4 is superior to method 2. As a conclusion, I suggest we implement method 4. Here is the output of the test {noformat} ---------- x not leading to overflow, shape = 1.0 ---------- Initial method SummaryStatistics: n: 3200 min: 2.0 max: 4.0 mean: 2.8724999999999934 geometric mean: 2.784676841906384 variance: 0.4982744607689899 sum of squares: 27998.0 standard deviation: 0.7058855861745513 sum of logs: 3277.2218605584508 Method proposed by Francesco SummaryStatistics: n: 3200 min: 0.0 max: 67.0 mean: 12.791562499999992 geometric mean: 0.0 variance: 148.30446145279777 sum of squares: 998023.0 standard deviation: 12.17803192033909 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 3200 min: 1.0 max: 2.0 mean: 1.4162499999999971 geometric mean: 1.3344543929682238 variance: 0.24306189434198203 sum of squares: 7196.0 standard deviation: 0.49301307725250254 sum of logs: 923.2720445058158 ---------- x not leading to overflow, shape = 10.0 ---------- Initial method SummaryStatistics: n: 400 min: 1.0 max: 6.0 mean: 3.14 geometric mean: 3.0020752311936034 variance: 0.8324812030075192 sum of squares: 4276.0 standard deviation: 0.9124040787981601 sum of logs: 439.72151730195765 Method proposed by Francesco SummaryStatistics: n: 400 min: 0.0 max: 70.0 mean: 14.739999999999998 geometric mean: 0.0 variance: 144.37834586466164 sum of squares: 144514.0 standard deviation: 12.015754069747834 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 400 min: 0.0 max: 3.0 mean: 0.6075000000000004 geometric mean: 0.0 variance: 0.3593421052631577 sum of squares: 291.0 standard deviation: 0.5994515036791197 sum of logs: -Infinity ---------- x not leading to overflow, shape = 100.0 ---------- Initial method SummaryStatistics: n: 393 min: 4.0 max: 12.0 mean: 7.806615776081426 geometric mean: 7.619643567166563 variance: 2.916588772913744 sum of squares: 25094.0 standard deviation: 1.7078023225519234 sum of logs: 798.076729908358 Method proposed by Francesco SummaryStatistics: n: 393 min: 2.0 max: 824.0 mean: 164.59796437659037 geometric mean: 106.88809881158063 variance: 18545.179791764032 sum of squares: 1.7917059E7 standard deviation: 136.18068802794335 sum of logs: 1836.0105153185225 Proposed method, based on BOOST SummaryStatistics: n: 393 min: 0.0 max: 4.0 mean: 1.1679389312977109 geometric mean: 0.0 variance: 0.5176429350366102 sum of squares: 739.0 standard deviation: 0.7194740683559139 sum of logs: -Infinity ---------- x not leading to overflow, shape = 142.0 ---------- Initial method SummaryStatistics: n: 614 min: 0.0 max: 551.0 mean: 400.01140065146564 geometric mean: 0.0 variance: 7323.946036207893 sum of squares: 1.02735179E8 standard deviation: 85.5800562993966 sum of logs: -Infinity Method proposed by Francesco SummaryStatistics: n: 614 min: 0.0 max: 1421.0 mean: 408.09609120521105 geometric mean: 0.0 variance: 64951.369217975334 sum of squares: 1.42072235E8 standard deviation: 254.8555850240982 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 614 min: 0.0 max: 90.0 mean: 0.8387622149837135 geometric mean: 0.0 variance: 21.84182293520944 sum of squares: 13821.0 standard deviation: 4.6735236102120465 sum of logs: -Infinity ---------- x leading to overflow, shape = 142.0 ---------- Method proposed by Francesco SummaryStatistics: n: 986 min: 2.0 max: 1434.0 mean: 427.799188640973 geometric mean: 297.73157979362674 variance: 88404.8205475645 sum of squares: 2.67528724E8 standard deviation: 297.32948146385434 sum of logs: 5616.4456488656715 Proposed method, based on BOOST SummaryStatistics: n: 986 min: 0.0 max: 203.0 mean: 30.59837728194727 geometric mean: 0.0 variance: 962.2852359427928 sum of squares: 1871004.0 standard deviation: 31.020722685695006 sum of logs: -Infinity ---------- x not leading to overflow, shape = 1000.0 ---------- Initial method SummaryStatistics: n: 65 min: 0.0 max: 0.0 mean: 0.0 geometric mean: 0.0 variance: 0.0 sum of squares: 0.0 standard deviation: 0.0 sum of logs: -Infinity Method proposed by Francesco SummaryStatistics: n: 65 min: 0.0 max: 0.0 mean: 0.0 geometric mean: 0.0 variance: 0.0 sum of squares: 0.0 standard deviation: 0.0 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 65 min: 0.0 max: 0.0 mean: 0.0 geometric mean: 0.0 variance: 0.0 sum of squares: 0.0 standard deviation: 0.0 sum of logs: -Infinity ---------- x leading to overflow, shape = 1000.0 ---------- Method proposed by Francesco SummaryStatistics: n: 3252 min: 0.0 max: 11647.0 mean: 2645.6626691266915 geometric mean: 0.0 variance: 6222262.112872347 sum of squares: 4.2991048807E10 standard deviation: 2494.4462537549985 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 3252 min: 0.0 max: 1966.0 mean: 150.69772447724463 geometric mean: 0.0 variance: 47598.90552542646 sum of squares: 2.28596325E8 standard deviation: 218.17173402030443 sum of logs: -Infinity {noformat} > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2, 3.0, 3.1 > Reporter: Francesco Strino > Assignee: Sébastien Brisard > Priority: Minor > Labels: improvement, stability > Fix For: 3.1 > > Attachments: MATH-753.tar.gz, lanczos.patch > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273053#comment-13273053 ] Sébastien Brisard edited comment on MATH-753 at 5/11/12 5:58 AM: ----------------------------------------------------------------- The attached file {{MATH-753.tar.gz}} compares the accuracy of four different calculation methods for the density. * method 1 is the currently implemented method (leading to overflows) * method 2 is the method proposed by Francesco (exp(log(...)) * method 3 is based on BOOST * method 4 is the recommended method, partly based on method 3. The provided files check the accuracy of the methods for several values of the shape parameter (scale is not really relevant to this problem: used 1). Reference values where computed with Maxima (with a precision of 64 digits, for exactly representable double values of the argument. The report shows statistics on the error in ulps. For each method, two statistics are computed * the first one corresponds to the case where the currently implemented method does not lead to overflow. Even in this case, the proposed method (4) achieves better accuracy * the second one corresponds to the case where the currently implemented method does lead to overflow. In this case, method 4 is superior to method 2. Regardless of the method, the accuracy is worse than in the first case; this was expected. As a conclusion, I suggest we implement method 4. Here is the output of the test {noformat} ---------- x not leading to overflow, shape = 1.0 ---------- Initial method SummaryStatistics: n: 3200 min: 2.0 max: 4.0 mean: 2.8724999999999934 geometric mean: 2.784676841906384 variance: 0.4982744607689899 sum of squares: 27998.0 standard deviation: 0.7058855861745513 sum of logs: 3277.2218605584508 Method proposed by Francesco SummaryStatistics: n: 3200 min: 0.0 max: 67.0 mean: 12.791562499999992 geometric mean: 0.0 variance: 148.30446145279777 sum of squares: 998023.0 standard deviation: 12.17803192033909 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 3200 min: 1.0 max: 2.0 mean: 1.4162499999999971 geometric mean: 1.3344543929682238 variance: 0.24306189434198203 sum of squares: 7196.0 standard deviation: 0.49301307725250254 sum of logs: 923.2720445058158 ---------- x not leading to overflow, shape = 10.0 ---------- Initial method SummaryStatistics: n: 400 min: 1.0 max: 6.0 mean: 3.14 geometric mean: 3.0020752311936034 variance: 0.8324812030075192 sum of squares: 4276.0 standard deviation: 0.9124040787981601 sum of logs: 439.72151730195765 Method proposed by Francesco SummaryStatistics: n: 400 min: 0.0 max: 70.0 mean: 14.739999999999998 geometric mean: 0.0 variance: 144.37834586466164 sum of squares: 144514.0 standard deviation: 12.015754069747834 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 400 min: 0.0 max: 3.0 mean: 0.6075000000000004 geometric mean: 0.0 variance: 0.3593421052631577 sum of squares: 291.0 standard deviation: 0.5994515036791197 sum of logs: -Infinity ---------- x not leading to overflow, shape = 100.0 ---------- Initial method SummaryStatistics: n: 393 min: 4.0 max: 12.0 mean: 7.806615776081426 geometric mean: 7.619643567166563 variance: 2.916588772913744 sum of squares: 25094.0 standard deviation: 1.7078023225519234 sum of logs: 798.076729908358 Method proposed by Francesco SummaryStatistics: n: 393 min: 2.0 max: 824.0 mean: 164.59796437659037 geometric mean: 106.88809881158063 variance: 18545.179791764032 sum of squares: 1.7917059E7 standard deviation: 136.18068802794335 sum of logs: 1836.0105153185225 Proposed method, based on BOOST SummaryStatistics: n: 393 min: 0.0 max: 4.0 mean: 1.1679389312977109 geometric mean: 0.0 variance: 0.5176429350366102 sum of squares: 739.0 standard deviation: 0.7194740683559139 sum of logs: -Infinity ---------- x not leading to overflow, shape = 142.0 ---------- Initial method SummaryStatistics: n: 614 min: 0.0 max: 551.0 mean: 400.01140065146564 geometric mean: 0.0 variance: 7323.946036207893 sum of squares: 1.02735179E8 standard deviation: 85.5800562993966 sum of logs: -Infinity Method proposed by Francesco SummaryStatistics: n: 614 min: 0.0 max: 1421.0 mean: 408.09609120521105 geometric mean: 0.0 variance: 64951.369217975334 sum of squares: 1.42072235E8 standard deviation: 254.8555850240982 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 614 min: 0.0 max: 90.0 mean: 0.8387622149837135 geometric mean: 0.0 variance: 21.84182293520944 sum of squares: 13821.0 standard deviation: 4.6735236102120465 sum of logs: -Infinity ---------- x leading to overflow, shape = 142.0 ---------- Method proposed by Francesco SummaryStatistics: n: 986 min: 2.0 max: 1434.0 mean: 427.799188640973 geometric mean: 297.73157979362674 variance: 88404.8205475645 sum of squares: 2.67528724E8 standard deviation: 297.32948146385434 sum of logs: 5616.4456488656715 Proposed method, based on BOOST SummaryStatistics: n: 986 min: 0.0 max: 203.0 mean: 30.59837728194727 geometric mean: 0.0 variance: 962.2852359427928 sum of squares: 1871004.0 standard deviation: 31.020722685695006 sum of logs: -Infinity ---------- x not leading to overflow, shape = 1000.0 ---------- Initial method SummaryStatistics: n: 65 min: 0.0 max: 0.0 mean: 0.0 geometric mean: 0.0 variance: 0.0 sum of squares: 0.0 standard deviation: 0.0 sum of logs: -Infinity Method proposed by Francesco SummaryStatistics: n: 65 min: 0.0 max: 0.0 mean: 0.0 geometric mean: 0.0 variance: 0.0 sum of squares: 0.0 standard deviation: 0.0 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 65 min: 0.0 max: 0.0 mean: 0.0 geometric mean: 0.0 variance: 0.0 sum of squares: 0.0 standard deviation: 0.0 sum of logs: -Infinity ---------- x leading to overflow, shape = 1000.0 ---------- Method proposed by Francesco SummaryStatistics: n: 3252 min: 0.0 max: 11647.0 mean: 2645.6626691266915 geometric mean: 0.0 variance: 6222262.112872347 sum of squares: 4.2991048807E10 standard deviation: 2494.4462537549985 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 3252 min: 0.0 max: 1966.0 mean: 150.69772447724463 geometric mean: 0.0 variance: 47598.90552542646 sum of squares: 2.28596325E8 standard deviation: 218.17173402030443 sum of logs: -Infinity {noformat} was (Author: celestin): The attached file {{MATH-753.tar.gz}} compares the accuracy of four different calculation methods for the density. * method 1 is the currently implemented method (leading to overflows) * method 2 is the method proposed by Francesco (exp(log(...)) * method 3 is based on BOOST * method 4 is the recommended method, partly based on method 3. The provided files check the accuracy of the methods for several values of the shape parameter (scale is not really relevant to this problem: used 1). Reference values where computed with Maxima (with a precision of 64 digits, for exactly representable double values of the argument. The report shows statistics on the error in ulps. For each method, two statistics are computed * the first one corresponds to the case where the currently implemented method does not lead to overflow. Even in this case, the proposed method (4) achieves better accuracy * the second one corresponds to the case where the currently implemented method does lead to overflow. In this case, method 4 is superior to method 2. As a conclusion, I suggest we implement method 4. Here is the output of the test {noformat} ---------- x not leading to overflow, shape = 1.0 ---------- Initial method SummaryStatistics: n: 3200 min: 2.0 max: 4.0 mean: 2.8724999999999934 geometric mean: 2.784676841906384 variance: 0.4982744607689899 sum of squares: 27998.0 standard deviation: 0.7058855861745513 sum of logs: 3277.2218605584508 Method proposed by Francesco SummaryStatistics: n: 3200 min: 0.0 max: 67.0 mean: 12.791562499999992 geometric mean: 0.0 variance: 148.30446145279777 sum of squares: 998023.0 standard deviation: 12.17803192033909 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 3200 min: 1.0 max: 2.0 mean: 1.4162499999999971 geometric mean: 1.3344543929682238 variance: 0.24306189434198203 sum of squares: 7196.0 standard deviation: 0.49301307725250254 sum of logs: 923.2720445058158 ---------- x not leading to overflow, shape = 10.0 ---------- Initial method SummaryStatistics: n: 400 min: 1.0 max: 6.0 mean: 3.14 geometric mean: 3.0020752311936034 variance: 0.8324812030075192 sum of squares: 4276.0 standard deviation: 0.9124040787981601 sum of logs: 439.72151730195765 Method proposed by Francesco SummaryStatistics: n: 400 min: 0.0 max: 70.0 mean: 14.739999999999998 geometric mean: 0.0 variance: 144.37834586466164 sum of squares: 144514.0 standard deviation: 12.015754069747834 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 400 min: 0.0 max: 3.0 mean: 0.6075000000000004 geometric mean: 0.0 variance: 0.3593421052631577 sum of squares: 291.0 standard deviation: 0.5994515036791197 sum of logs: -Infinity ---------- x not leading to overflow, shape = 100.0 ---------- Initial method SummaryStatistics: n: 393 min: 4.0 max: 12.0 mean: 7.806615776081426 geometric mean: 7.619643567166563 variance: 2.916588772913744 sum of squares: 25094.0 standard deviation: 1.7078023225519234 sum of logs: 798.076729908358 Method proposed by Francesco SummaryStatistics: n: 393 min: 2.0 max: 824.0 mean: 164.59796437659037 geometric mean: 106.88809881158063 variance: 18545.179791764032 sum of squares: 1.7917059E7 standard deviation: 136.18068802794335 sum of logs: 1836.0105153185225 Proposed method, based on BOOST SummaryStatistics: n: 393 min: 0.0 max: 4.0 mean: 1.1679389312977109 geometric mean: 0.0 variance: 0.5176429350366102 sum of squares: 739.0 standard deviation: 0.7194740683559139 sum of logs: -Infinity ---------- x not leading to overflow, shape = 142.0 ---------- Initial method SummaryStatistics: n: 614 min: 0.0 max: 551.0 mean: 400.01140065146564 geometric mean: 0.0 variance: 7323.946036207893 sum of squares: 1.02735179E8 standard deviation: 85.5800562993966 sum of logs: -Infinity Method proposed by Francesco SummaryStatistics: n: 614 min: 0.0 max: 1421.0 mean: 408.09609120521105 geometric mean: 0.0 variance: 64951.369217975334 sum of squares: 1.42072235E8 standard deviation: 254.8555850240982 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 614 min: 0.0 max: 90.0 mean: 0.8387622149837135 geometric mean: 0.0 variance: 21.84182293520944 sum of squares: 13821.0 standard deviation: 4.6735236102120465 sum of logs: -Infinity ---------- x leading to overflow, shape = 142.0 ---------- Method proposed by Francesco SummaryStatistics: n: 986 min: 2.0 max: 1434.0 mean: 427.799188640973 geometric mean: 297.73157979362674 variance: 88404.8205475645 sum of squares: 2.67528724E8 standard deviation: 297.32948146385434 sum of logs: 5616.4456488656715 Proposed method, based on BOOST SummaryStatistics: n: 986 min: 0.0 max: 203.0 mean: 30.59837728194727 geometric mean: 0.0 variance: 962.2852359427928 sum of squares: 1871004.0 standard deviation: 31.020722685695006 sum of logs: -Infinity ---------- x not leading to overflow, shape = 1000.0 ---------- Initial method SummaryStatistics: n: 65 min: 0.0 max: 0.0 mean: 0.0 geometric mean: 0.0 variance: 0.0 sum of squares: 0.0 standard deviation: 0.0 sum of logs: -Infinity Method proposed by Francesco SummaryStatistics: n: 65 min: 0.0 max: 0.0 mean: 0.0 geometric mean: 0.0 variance: 0.0 sum of squares: 0.0 standard deviation: 0.0 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 65 min: 0.0 max: 0.0 mean: 0.0 geometric mean: 0.0 variance: 0.0 sum of squares: 0.0 standard deviation: 0.0 sum of logs: -Infinity ---------- x leading to overflow, shape = 1000.0 ---------- Method proposed by Francesco SummaryStatistics: n: 3252 min: 0.0 max: 11647.0 mean: 2645.6626691266915 geometric mean: 0.0 variance: 6222262.112872347 sum of squares: 4.2991048807E10 standard deviation: 2494.4462537549985 sum of logs: -Infinity Proposed method, based on BOOST SummaryStatistics: n: 3252 min: 0.0 max: 1966.0 mean: 150.69772447724463 geometric mean: 0.0 variance: 47598.90552542646 sum of squares: 2.28596325E8 standard deviation: 218.17173402030443 sum of logs: -Infinity {noformat} > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2, 3.0, 3.1 > Reporter: Francesco Strino > Assignee: Sébastien Brisard > Priority: Minor > Labels: improvement, stability > Fix For: 3.1 > > Attachments: MATH-753.tar.gz, lanczos.patch > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276417#comment-13276417 ] Sébastien Brisard commented on MATH-753: ---------------------------------------- Reference data for unit tests in {{r1338986}}. > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2, 3.0, 3.1 > Reporter: Francesco Strino > Assignee: Sébastien Brisard > Priority: Minor > Labels: improvement, stability > Fix For: 3.1 > > Attachments: MATH-753.tar.gz, lanczos.patch > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
In reply to this post by ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/MATH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sébastien Brisard resolved MATH-753. ------------------------------------ Resolution: Fixed > Better implementation for the gamma distribution density function > ----------------------------------------------------------------- > > Key: MATH-753 > URL: https://issues.apache.org/jira/browse/MATH-753 > Project: Commons Math > Issue Type: Improvement > Affects Versions: 2.2, 3.0, 3.1 > Reporter: Francesco Strino > Assignee: Sébastien Brisard > Priority: Minor > Labels: improvement, stability > Fix For: 3.1 > > Attachments: MATH-753.tar.gz, lanczos.patch > > > The way the density of the gamma distribution function is estimated can be improved. > It's much more stable to calculate first the log of the density and then exponentiate, otherwise the function returns NaN for high values of the parameters alpha and beta. > It would be sufficient to change the public double density(double x) function at line 204 in the file org.apache.commons.math.distribution.GammaDistributionImpl as follows: > return Math.exp(Math.log( x )*(alpha-1) - Math.log(beta)*alpha - x/beta - Gamma.logGamma(alpha)); > In order to improve performance, log(beta) and Gamma.logGamma(alpha) could also be precomputed and stored during initialization. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira |
Free forum by Nabble | Edit this page |