PRDOUBLEPOISSON procedure

Calculates the probability density for the double Poisson distribution (V.M. Cave).

Options

`PRINT` = string tokens	Controls printed output (`probability`, `summary`); default `prob`
`PLOT` = string token	Whether to plot the k terms used to approximate the normalizing constant by the `kpartialsum` method (`yes`, `no`); default `no`
`METHOD` = string token	How to approximate the normalizing constant (`kpartialsum`, `edgeworth`); default `kpar`
`LOCATION` = scalar or variate	Location parameter; no default, must be set
`SHAPE` = scalar or variate	Shape parameter; default 1
`MAXCYCLE` = scalar or variate	Limits the number of terms, k, used to approximate the normalizing constant by the `kpartialsum` method; default `MAX(1000, 2*LOCATION)`
`TOLERANCE` = scalar	Convergence criterion used when approximating the normalizing constant by the `kpartialsum` method; default `1E-12`

Parameters

`DATA` = scalar or variate	Non-negative integer values for which the double Poisson probabilities are to be calculated
`DECIMALS` = scalars	Number of decimal places for printing; default `*`
`PROBABILITY` = variate	Saves the probabilities

Description

PRDOUBLEPOISSON calculates the probability density for the two-parameter double Poisson distribution. The double Poisson probability density is given by

P(X = x) = c(μ,θ) θ^(1/2) e^(-θμ) (e^(-x) x^x / x!) (eμ / x)^(θx)    for μ > 0, θ > 0, x = 0, 1, 2, …

where c(μ,θ) is the normalizing constant, μ is the location parameter, and θ is the shape parameter. The variance of the double Poisson distribution is approximately μ/θ, so the distribution is over-dispersed when 0 < θ < 1, under-dispersed when θ > 1, and identical to the Poisson distribution when θ = 1.
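
As an illustration of the density above, the following Python sketch evaluates it without the normalizing constant c(μ,θ). It is not Genstat code and not the procedure's implementation; the function names `dp_unnormalized` and `poisson_pmf` are hypothetical. The terms are combined on the log scale to avoid overflow in x^x and x!, and the case x = 0 is handled separately on the assumption that x^x and (eμ/x)^(θx) are both taken as 1 there.

```python
import math

def dp_unnormalized(x, mu, theta):
    """Double Poisson density at integer x >= 0, without c(mu, theta)."""
    if x == 0:
        # Assumption: x^x and (e*mu/x)^(theta*x) are both taken as 1 at x = 0.
        return math.sqrt(theta) * math.exp(-theta * mu)
    log_f = (0.5 * math.log(theta) - theta * mu        # theta^(1/2) e^(-theta*mu)
             - x + x * math.log(x) - math.lgamma(x + 1)  # e^(-x) x^x / x!
             + theta * x * (1.0 + math.log(mu) - math.log(x)))  # (e*mu/x)^(theta*x)
    return math.exp(log_f)

def poisson_pmf(x, mu):
    """Ordinary Poisson probability, used only for comparison."""
    return math.exp(-mu + x * math.log(mu) - math.lgamma(x + 1))

# With theta = 1 the unnormalized density coincides with the Poisson probability.
for x in (0, 3, 10):
    print(x, dp_unnormalized(x, 2.5, 1.0), poisson_pmf(x, 2.5))
```

When θ = 1 the two printed columns agree, which reflects the statement that the double Poisson distribution then reduces to the Poisson distribution.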

The non-negative integers for which the double Poisson probabilities are to be calculated are supplied by the DATA parameter.

The location parameter, μ, must be specified using the LOCATION option. The shape parameter, θ, can be set using the SHAPE option; default 1. For both the LOCATION and SHAPE options, either a single value (scalar or variate of length 1) or a variate containing the same number of values as DATA may be supplied.

The METHOD option specifies the method used to approximate the normalizing constant. The default (METHOD=kpartialsum) is to use the more accurate and reliable k-th partial sum method proposed by Zou et al. (2013). This method involves summing the first k terms of an infinite sum (see the Method section). The number of terms, k, is determined by the TOLERANCE and MAXCYCLE options. The TOLERANCE option can supply a scalar to specify the tolerance for convergence of the infinite sum; default 1E-12. The MAXCYCLE option places a limit on k, where the default is the maximum of 1000 and twice the value of the location parameter. If the infinite sum fails to converge within k = MAXCYCLE, the probability density is not calculated and a warning is given. MAXCYCLE may supply either a single value (scalar or variate of length 1) or a variate containing the same number of values as DATA. (However, if both LOCATION and SHAPE supply single values, only the first value of MAXCYCLE is used.) The PLOT option allows you to request a plot of the k terms used to approximate the normalizing constant. By default no plot is produced.

Although the k-th partial sum method converges very quickly when the location parameter is small, convergence for large values of the location parameter requires a large value for k. The closed-form Edgeworth series method of Efron (1986) may then provide an alternative way of approximating the normalizing constant (METHOD=edgeworth). However, the Edgeworth series approximation is highly unreliable for small values of the location parameter, i.e. values less than about 10.

Printed output is controlled by the PRINT option, with settings:

`probability` (the default)	prints the probability density, and
`summary`	prints a description and a table containing the data value, the location and shape parameters, the approximation of the normalizing constant, k (if `METHOD=kpartialsum`), and the probability density.

The DECIMALS parameter allows you to set the number of decimal places to appear in the printed output.

The PROBABILITY parameter can save the probability densities in a variate.

Options: PRINT, PLOT, METHOD, LOCATION, SHAPE, MAXCYCLE, TOLERANCE.
Parameters: DATA, DECIMALS, PROBABILITY.

Method

The normalizing constant for the double Poisson distribution, c(μ,θ), is given by an infinite sum. PRDOUBLEPOISSON offers two methods for approximating the constant: the k-th partial sum method of Zou et al. (2013), i.e. METHOD=kpartialsum, and the Edgeworth series method of Efron (1986), i.e. METHOD=edgeworth.

The k-th partial sum method uses the sum of the first k terms of the infinite sum. The number of terms is determined by the TOLERANCE option, which specifies the convergence criterion. The infinite sum is assumed to have converged when

f_μ,θ (X = k-1) > f_μ,θ (X = k)

and

f_μ,θ (X = k) < TOLERANCE.

If the infinite sum fails to converge within k = MAXCYCLE, the probability density is not calculated and a warning is given.
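
The stopping rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure's own code: the function names are hypothetical, the sum is assumed to start at x = 0, and the arguments `tolerance` and `maxcycle` merely mirror the TOLERANCE and MAXCYCLE options with their documented defaults.

```python
import math

def dp_unnormalized(x, mu, theta):
    """Double Poisson density at integer x >= 0, without c(mu, theta)."""
    if x == 0:
        return math.sqrt(theta) * math.exp(-theta * mu)
    log_f = (0.5 * math.log(theta) - theta * mu
             - x + x * math.log(x) - math.lgamma(x + 1)
             + theta * x * (1.0 + math.log(mu) - math.log(x)))
    return math.exp(log_f)

def normalizing_constant(mu, theta, tolerance=1e-12, maxcycle=None):
    """Return (c, k) from the k-th partial sum, or (None, maxcycle) on failure."""
    if maxcycle is None:
        maxcycle = max(1000, int(2 * mu))  # documented default for MAXCYCLE
    previous = dp_unnormalized(0, mu, theta)
    total = previous
    for k in range(1, maxcycle + 1):
        term = dp_unnormalized(k, mu, theta)
        total += term
        # Converged when the terms are decreasing and the latest one is negligible.
        if previous > term and term < tolerance:
            return 1.0 / total, k
        previous = term
    return None, maxcycle  # no convergence within maxcycle terms

def dp_probability(x, mu, theta, **kwargs):
    """Normalized double Poisson probability, or None if the sum did not converge."""
    c, _ = normalizing_constant(mu, theta, **kwargs)
    return None if c is None else c * dp_unnormalized(x, mu, theta)

c, k = normalizing_constant(10.0, 0.5)
print("c =", c, "using k =", k)
print("P(X = 10) =", dp_probability(10, 10.0, 0.5))
```

Requiring the terms to be decreasing guards against stopping too early when μ is large, because the terms for small x are then tiny but still increasing.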

The Edgeworth series method of Efron (1986) provides a closed-form approximation to the infinite sum

1 / c(μ,θ) = ∑_{x=0}^{∞} f_μ,θ (X = x)

where f_μ,θ (X = x) is the double Poisson density at x without the normalizing constant c(μ,θ).

The k-th partial sum method is more accurate and more reliable than the Edgeworth series approximation. In particular, the Edgeworth series method is highly unreliable when the location parameter, μ, is small (i.e. less than about 10), and may even produce negative values.
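
The contrast between the two methods can be illustrated with a short Python sketch. This page does not state the Edgeworth formula, so the sketch assumes the form commonly quoted from Efron (1986), 1 / c(μ,θ) ≈ 1 + ((1 - θ) / (12μθ)) (1 + 1 / (μθ)); the function names and the fixed number of summed terms are hypothetical choices made for illustration, and this is not the procedure's implementation.

```python
import math

def dp_unnormalized(x, mu, theta):
    """Double Poisson density at integer x >= 0, without c(mu, theta)."""
    if x == 0:
        return math.sqrt(theta) * math.exp(-theta * mu)
    log_f = (0.5 * math.log(theta) - theta * mu
             - x + x * math.log(x) - math.lgamma(x + 1)
             + theta * x * (1.0 + math.log(mu) - math.log(x)))
    return math.exp(log_f)

def inverse_constant_by_summation(mu, theta, n_terms=2000):
    """Brute-force reference value of 1/c: sum the terms for x = 0 .. n_terms - 1."""
    return sum(dp_unnormalized(x, mu, theta) for x in range(n_terms))

def inverse_constant_edgeworth(mu, theta):
    """Assumed Edgeworth form of 1/c (Efron, 1986); see the caveat above."""
    a = mu * theta
    return 1.0 + (1.0 - theta) / (12.0 * a) * (1.0 + 1.0 / a)

# A large location parameter, then a small one.
for mu, theta in [(50.0, 0.5), (0.1, 2.0)]:
    print("mu =", mu, "theta =", theta,
          "summation:", inverse_constant_by_summation(mu, theta),
          "edgeworth:", inverse_constant_edgeworth(mu, theta))
```

For μ = 50 the two values are close, whereas for μ = 0.1 and θ = 2 the assumed Edgeworth expression is negative (about -1.5) even though the sum itself is positive.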

Action with `RESTRICT`

The DATA, LOCATION, SHAPE and MAXCYCLE variates can be restricted.

References

Efron, B. (1986). Double exponential families and their use in generalized linear regression. Journal of the American Statistical Association, 81, 709-721.

Zou, Y., Geedipally, S.R. & Lord, D. (2013). Evaluating the double Poisson generalized linear model. Accident Analysis & Prevention, 59, 497-505.

Example

CAPTION 'PRDOUBLEPOISSON example'; STYLE=meta

PRDOUBLEPOISSON [PRINT=summary; PLOT=yes; LOCATION=10; SHAPE=0.5] !(0...24)
PRDOUBLEPOISSON [PRINT=summary; METHOD=edgeworth; LOCATION=10; SHAPE=0.5] !(0...24)

PRDOUBLEPOISSON [PRINT=summary; PLOT=yes; LOCATION=!(1...10); SHAPE=0.5] !(10(2))
PRDOUBLEPOISSON [PRINT=summary; PLOT=yes; LOCATION=10; SHAPE=!(1...10)/10] !(10(2))

Updated on September 12, 2019
