Calculates the probability density for the double Poisson distribution (V.M. Cave).
Options
| PRINT = string tokens | Controls printed output (probability, summary); default prob |
|---|---|
| PLOT = string token | Whether to plot the k terms used to approximate the normalizing constant by the kpartialsum method (yes, no); default no |
| METHOD = string token | How to approximate the normalizing constant (kpartialsum, edgeworth); default kpar |
| LOCATION = scalar or variate | Location parameter; no default, must be set |
| SHAPE = scalar or variate | Shape parameter; default 1 |
| MAXCYCLE = scalar or variate | Limits the number of terms, k, used to approximate the normalizing constant by the kpartialsum method; default MAX(1000, 2*LOCATION) |
| TOLERANCE = scalar | Convergence criterion used when approximating the normalizing constant by the kpartialsum method; default 1E-12 |
Parameters
| DATA = scalar or variate | Non-negative integer values for which the double Poisson probabilities are to be calculated |
|---|---|
| DECIMALS = scalars | Number of decimal places for printing; default * |
| PROBABILITY = variate | Saves the probabilities |
Description
PRDOUBLEPOISSON calculates the probability density for the two-parameter double Poisson distribution. The double Poisson probability density is given by
$$P(X = x) = c(\mu,\theta)\,\theta^{1/2}\,e^{-\theta\mu}\left(\frac{e^{-x}x^{x}}{x!}\right)\left(\frac{e\mu}{x}\right)^{\theta x}$$
for μ > 0, θ > 0, x = 0, 1, 2, …
where c(μ,θ) is the normalizing constant, μ is the location parameter, and θ is the shape parameter. The mean of the distribution is approximately μ and the variance approximately μ/θ, so the double Poisson distribution is over-dispersed when 0 < θ < 1, under-dispersed when θ > 1, and identical to the Poisson distribution when θ = 1.
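As a rough illustration of the density formula above, the following Python sketch evaluates everything except the normalizing constant c(μ,θ), working on the log scale to avoid overflow. It is not the procedure's own implementation: the function names are hypothetical, and treating both x-dependent factors as 1 at x = 0 is an assumption based on their limiting values.

```python
import math


def log_unnormalized_density(x, mu, theta):
    """Log of theta^(1/2) e^(-theta*mu) (e^-x x^x / x!) (e*mu/x)^(theta*x)."""
    if x < 0 or mu <= 0 or theta <= 0:
        raise ValueError("requires x >= 0, mu > 0 and theta > 0")
    # Assumed convention: x*log(x) -> 0 as x -> 0, so both x-dependent
    # factors equal 1 at x = 0.
    x_log_x = x * math.log(x) if x > 0 else 0.0
    log_poisson_like = -x + x_log_x - math.lgamma(x + 1)       # e^-x x^x / x!
    log_power_term = theta * (x + x * math.log(mu) - x_log_x)  # (e*mu/x)^(theta*x)
    return 0.5 * math.log(theta) - theta * mu + log_poisson_like + log_power_term


def unnormalized_density(x, mu, theta):
    """Density without the normalizing constant c(mu, theta)."""
    return math.exp(log_unnormalized_density(x, mu, theta))


def poisson_pmf(x, mu):
    """Ordinary Poisson probability, used only for comparison."""
    return math.exp(-mu + x * math.log(mu) - math.lgamma(x + 1))
```

With theta set to 1, `unnormalized_density(x, mu, 1.0)` agrees with `poisson_pmf(x, mu)`, which matches the statement that the distribution reduces to the Poisson in that case.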
The non-negative integers for which the double Poisson probabilities are to be calculated are supplied by the DATA parameter.
The location parameter, μ, must be specified using the LOCATION option. The shape parameter, θ, can be set using the SHAPE option; default 1. For both the LOCATION and SHAPE options, either a single value (scalar or variate of length 1) or a variate containing the same number of values as DATA may be supplied.
The METHOD option specifies the method used to approximate the normalizing constant. The default (METHOD=kpartialsum) is to use the more accurate and reliable k-th partial sum method proposed by Zou et al. (2013). This method involves summing the first k terms of an infinite sum (see the Method section). The number of terms, k, is determined by the TOLERANCE and MAXCYCLE options. The TOLERANCE option can supply a scalar to specify the tolerance for convergence of the infinite sum; default 1E-12. The MAXCYCLE option places a limit on k, where the default is the maximum of 1000 and twice the value of the location parameter. If the infinite sum fails to converge within k = MAXCYCLE, the probability density is not calculated and a warning is given. MAXCYCLE may supply either a single value (scalar or variate of length 1) or a variate containing the same number of values as DATA. (However, if both LOCATION and SHAPE supply single values, only the first value of MAXCYCLE is used.) The PLOT option allows you to request a plot of the k terms used to approximate the normalizing constant. By default no plot is produced.
Although the k-th partial sum method converges very quickly when the location parameter is small, convergence for large values of the location parameter requires a large value for k. The closed-form Edgeworth series method of Efron (1986) may then provide an alternative way of approximating the normalizing constant (METHOD=edgeworth). However, the Edgeworth series approximation is highly unreliable for small values of the location parameter, i.e. values less than about 10.
Printed output is controlled by the PRINT option, with settings:

| probability (the default) | prints the probability density, and |
|---|---|
| summary | prints a description and a table containing: the data value, the location and shape parameters, the approximation of the normalizing constant, k (if METHOD=kpartialsum), and the probability density. |
The DECIMALS parameter allows you to set the number of decimal places to appear in the printed output.
The PROBABILITY parameter can save the probability densities in a variate.
Options: PRINT, PLOT, METHOD, LOCATION, SHAPE, MAXCYCLE, TOLERANCE.
Parameters: DATA, DECIMALS, PROBABILITY.
Method
The normalizing constant for the double Poisson distribution, c(μ,θ), is given by an infinite sum. PRDOUBLEPOISSON offers two methods for approximating the constant: the k-th partial sum method of Zou et al. (2013), i.e. METHOD=kpartialsum, and the Edgeworth series method of Efron (1986), i.e. METHOD=edgeworth.
The k-th partial sum method uses the sum of the first k terms of the infinite sum. The number of terms is determined by the TOLERANCE option, which specifies the convergence criterion. The infinite sum is assumed to have converged when
$$f_{\mu,\theta}(X = k-1) > f_{\mu,\theta}(X = k) \quad\text{and}\quad f_{\mu,\theta}(X = k) < \mathrm{TOLERANCE}.$$
If the infinite sum fails to converge within k = MAXCYCLE, the probability density is not calculated and a warning is given.
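The convergence rule described above can be sketched in Python as follows. This is a minimal illustration, not the procedure's own code: the function names are hypothetical, and both the inclusion of the term at k in the running sum and the rounding up of 2*LOCATION for the default limit are assumptions.

```python
import math


def unnormalized_density(x, mu, theta):
    """f(x) = theta^(1/2) e^(-theta*mu) (e^-x x^x / x!) (e*mu/x)^(theta*x)."""
    x_log_x = x * math.log(x) if x > 0 else 0.0  # x*log(x) taken as 0 at x = 0
    log_f = (0.5 * math.log(theta) - theta * mu
             - x + x_log_x - math.lgamma(x + 1)
             + theta * (x + x * math.log(mu) - x_log_x))
    return math.exp(log_f)


def normalizing_constant_partial_sum(mu, theta, tolerance=1e-12, maxcycle=None):
    """Return (c, k), or (None, maxcycle) if the sum has not converged."""
    if maxcycle is None:
        # Documented default is MAX(1000, 2*LOCATION); rounding up is an assumption.
        maxcycle = max(1000, math.ceil(2 * mu))
    previous = unnormalized_density(0, mu, theta)
    total = previous
    for k in range(1, maxcycle + 1):
        current = unnormalized_density(k, mu, theta)
        total += current  # assumption: the term at k is included in the sum
        # Converged once the terms are decreasing and the latest one is negligible.
        if previous > current and current < tolerance:
            return 1.0 / total, k
        previous = current
    return None, maxcycle  # caller should warn and skip the density calculation
```

Requiring the terms to be decreasing as well as small guards against stopping in the lower tail, where terms can fall below the tolerance before the mode has been reached. For example, `normalizing_constant_partial_sum(10.0, 0.5, maxcycle=5)` returns `(None, 5)` because the terms are still increasing at k = 5.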
The Edgeworth series method of Efron (1986) provides a closed-form approximation to the infinite sum
$$\frac{1}{c(\mu,\theta)} = \sum_{x=0}^{\infty} f_{\mu,\theta}(x)$$
where $f_{\mu,\theta}(x)$ is the density without the normalizing constant.
The k-th partial sum method is more accurate and more reliable than the Edgeworth series approximation. In particular, the Edgeworth series method is highly unreliable when the location parameter, μ, is small (i.e. less than about 10), and may even produce negative values.
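To make the comparison concrete, the Python sketch below sets a closed-form approximation against a brute-force evaluation of the infinite sum. The closed form used is the expression usually quoted for Efron's (1986) approximation, 1/c ≈ 1 + (1 − θ)/(12μθ) · (1 + 1/(μθ)); it is not stated in this document, so treat it as an assumption rather than the formula the procedure necessarily uses. The function names and the fixed number of terms in the direct sum are also hypothetical.

```python
import math


def inverse_constant_edgeworth(mu, theta):
    """Closed-form approximation to 1/c(mu, theta) (assumed form, see lead-in)."""
    return 1.0 + (1.0 - theta) / (12.0 * mu * theta) * (1.0 + 1.0 / (mu * theta))


def inverse_constant_direct(mu, theta, n_terms=2000):
    """Brute-force 1/c: sum the unnormalized density over x = 0 .. n_terms - 1."""
    total = 0.0
    for x in range(n_terms):
        x_log_x = x * math.log(x) if x > 0 else 0.0  # x*log(x) taken as 0 at x = 0
        log_f = (0.5 * math.log(theta) - theta * mu
                 - x + x_log_x - math.lgamma(x + 1)
                 + theta * (x + x * math.log(mu) - x_log_x))
        total += math.exp(log_f)
    return total


# Compare the two for a large, a moderate and a very small location parameter.
for mu, theta in [(50.0, 0.5), (10.0, 0.5), (0.05, 2.0)]:
    print(mu, theta, inverse_constant_edgeworth(mu, theta),
          inverse_constant_direct(mu, theta))
```

For the last pair (μ = 0.05, θ = 2) the closed form evaluates to a negative number, whereas the direct sum is necessarily positive, which illustrates why the approximation should not be relied on for small location parameters.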
Action with RESTRICT
The DATA, LOCATION, SHAPE and MAXCYCLE variates can be restricted.
References
Efron, B. (1986). Double exponential families and their use in generalized linear regression. Journal of the American Statistical Association, 81, 709-721.
Zou, Y., Geedipally, S.R. & Lord, D. (2013). Evaluating the double Poisson generalized linear model. Accident Analysis & Prevention, 59, 497-505.
See also
Commands for: Basic and nonparametric statistics.
Example
CAPTION 'PRDOUBLEPOISSON example'; STYLE=meta
PRDOUBLEPOISSON [PRINT=summary; PLOT=yes; LOCATION=10; SHAPE=0.5] !(0...24)
PRDOUBLEPOISSON [PRINT=summary; METHOD=edgeworth; LOCATION=10; SHAPE=0.5] !(0...24)
PRDOUBLEPOISSON [PRINT=summary; PLOT=yes; LOCATION=!(1...10); SHAPE=0.5] !(10(2))
PRDOUBLEPOISSON [PRINT=summary; PLOT=yes; LOCATION=10; SHAPE=!(1...10)/10] !(10(2))