DBinom


Software for evaluating probability functions and densities

Computing Probability Functions

The binomial probability is
bin(x,n,p) =       n!       px (1-p)n-x.
x!(n-x)!
Direct evaluation of this expression is reasonable for small n. For larger n, multiplying out the factorials will lead to overflow problems.

Computer packages, including S-Plus and the statistics toolbox in Matlab, evaluate the probabilities using the representation

log( bin(x,n,p) ) = log(n!) - log(x!) - log((n-x)!) + x × log(p) + (n-x) × log(1-p),
and use a log-gamma function to evaluate the log-factorials.

This is better than direct evaluation, but loss-of-precision can still occur, typically becoming noticeable for n about 106. To overcome this limitation, I developed an alternative set of algorithms, based on saddle point methods and Stirling's formula, that significantly improve accuracy in large samples.

Similar problems occur for many common distributions. My DBinom library provides a stable implementation for the Binomial, Poisson, Hypergeometric, Gamma (incl. Chi-Square), Student's t, F, negative Binomial and Beta distributions.

The R programming environment incorporates a version of this code. DBinom example

Example

The graph on the right shows an example, computing the binomial probability bin(3,n,2/n) for a range of values of n. As n increases to infinity, this probability should converge to the Poisson probability, p0 = 4e-2/3. The plot shows the number of digits agreement (measured by log10|p/p0-1|) for a range of n.

The saddle point algorithm (indicated by +) obtains the full accuracy of computer double precision (about 16 digits) for large n. The `classic' algorithm obtains similar results for n up to 107, then loses precision after that.