Fits probit models allowing for natural mortality and immunity (R.W. Payne).

### Options

`PRINT` = string tokens | Printed output required (`model`, `summary`, `estimates`, `correlations`, `fittedvalues`, `monitoring`, `effectivedoses`); default `mode`, `summ`, `esti`, `fitt`, `effe` |
---|---|
`TRANSFORMATION` = string token | Transformation to be used (`probit`, `logit`, `complementaryloglog`); default `prob` |
`MORTALITY` = string token | Whether to estimate natural mortality (`omit`, `estimate`); default `omit` |
`IMMUNITY` = string token | Whether to estimate natural immunity (`omit`, `estimate`); default `omit` |
`GROUPS` = factor | Defines groups for an analysis of parallelism; default `*` i.e. no groups |
`SEPARATE` = string tokens | Which parameters (apart from intercept) should be estimated separately for different groups (`slope`, `mortality`, `immunity`, `notintercept`); default `*` i.e. none |
`LD` = scalar or variate | Effective, or lethal, doses to be estimated, other than 50 |
`CIPROBABILITY` = scalar | Probability level for the confidence interval of effective doses; default 0.95, i.e. a 95% confidence interval |
`LOGBASE` = string token | Base of antilog transformation to be applied to LD’s (`ten`, `e`); default `*` i.e. none |
`DISPERSION` = scalar | Controls the use of a heterogeneity factor in the calculation of s.e.s etc; with the default of 1 no factor is used, a missing value `*` estimates the heterogeneity from the residual deviance |
`FITMETHOD` = string token | Method to use to fit the model (`generalizednonlinear`, `nonlinear`); default `nonl` for Wadley’s problem, otherwise `gene` |
`MAXCYCLE` = scalar | Maximum number of iterations for fitting the model; default 30 |

### Parameters

`Y` = variates | Number of subjects responding in each batch |
---|---|
`DOSE` = variates | Dose received by each batch of subjects |
`NBINOMIAL` = variates, scalars or factors | Variate specifying the number of subjects in each batch, or factor specifying groupings of the observations assumed to have equal expected total numbers of subjects in Wadley’s problem; if omitted, assumes Wadley’s problem with all observations having the same expected total number of subjects |
`INITIAL` = variates | Initial values for parameters |
`STEPLENGTHS` = variates | Step lengths for parameters |
`LDESTIMATES` = variates | Saves estimates of the effective, or lethal, doses |
`LDLOWER` = variates | Saves lower values of the confidence intervals for the estimates of the effective, or lethal, doses (for `FITMETHOD=gene` only) |
`LDUPPER` = variates | Saves upper values of the confidence intervals for the estimates of the effective, or lethal, doses (for `FITMETHOD=gene` only) |

### Description

Probit analysis is a way of modelling the relationship between a stimulus, like a drug, and a quantal response (success/failure). It is assumed that, for each subject, there is a certain level of dose of the stimulus below which it will be unaffected, but above which it will respond. This level of dose, known as its tolerance, will vary from subject to subject within the population.

For example, it is often assumed that the tolerance of houseflies to the logarithm of the dose of an insecticide will follow a Normal distribution; so, if we were to plot the proportion of the population with each tolerance against log dose, we would obtain the familiar bell-shaped curve. Likewise, if we plot the probability that a randomly-selected individual will respond against the logarithm of dose, we obtain a sigmoid (S-shaped) curve limited below by zero and above by one. To make the relationship linear, it is usual to transform the y-axis either to probits or to Normal equivalent deviates. In Genstat:

Probit(*P*%) = NED(*P*%/100)

The Normal equivalent deviate may be familiar as the transformation that is used to produce “probability” graph paper.
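As a small illustration of the definition above, the transformation can be applied directly with the `NED` function in `CALCULATE`. The percentages and the identifiers `Percent` and `Probit` are made up for this sketch and do not come from any data set in this page.

```genstat
" Illustrative percentages responding (hypothetical values) "
VARIATE [VALUES=2.5,16,50,84,97.5] Percent
" Probit(P%) = NED(P%/100), as defined above "
CALCULATE Probit = NED(Percent/100)
PRINT Percent,Probit; DECIMALS=1,3
```

A response of 50% maps to a probit of zero, and the values are symmetric about that point.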

In probit analysis, we are interested in estimating the equation of that line. This can be done by performing an experiment in which there are several batches of subjects, each of which is given a different dose of the stimulus. The data then consist of a variate indicating the number of subjects that responded out of each batch, a variate to show the dose given to each batch, and a final variate for the total numbers of subjects in the batches; these are specified by parameters `Y`, `DOSE` and `NBINOMIAL`, respectively.

The `NBINOMIAL` parameter can be omitted if the total numbers cannot be measured, as in some fumigation experiments (“Wadley’s problem”; see for example Finney 1971, pages 202-8). The assumption is that the total numbers receiving the doses will come from the same Poisson distribution, and the mean of this distribution is then estimated in the analysis. Alternatively, `NBINOMIAL` can specify a factor to indicate groupings of the doses whose total numbers are expected to come from the same distributions.
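The two ways of specifying Wadley’s problem might look as follows. This is only a sketch: `Count`, `Logdose` and `Chamber` are hypothetical identifiers, with `Count` holding the observed numbers, `Logdose` the doses, and `Chamber` a factor grouping the batches that are assumed to share an expected total.

```genstat
" Totals not observed: omit NBINOMIAL, so one common expected total is estimated "
PROBITANALYSIS Y=Count; DOSE=Logdose

" Totals not observed, but expected totals differ between groups of batches "
PROBITANALYSIS Y=Count; DOSE=Logdose; NBINOMIAL=Chamber
```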

The `PRINT` option controls printed output:

`model` | details of the model that has been fitted, |
---|---|
`summary` | summary analysis-of-variance table, |
`estimates` | parameter estimates and standard errors, |
`correlations` | correlations between parameter estimates, |
`fittedvalues` | fitted values and residuals, |
`monitoring` | information about the fitting process, and |
`effectivedoses` | effective, or lethal, doses (see parameter `LD` below). |

By default, `PRINT=mode,summ,esti,fitt,effe`.

The `TRANSFORMATION` option allows other transformations to be selected. Putting `TRANSFORMATION=logit` requests a logit transformation:

logit(*P*%) = log( *P*% / (100 – *P*%) )

This is very like the probit, but the corresponding response curve approaches zero (to the left) and one (to the right) rather more slowly. The other possibility is the complementary log-log transformation, log( –log( 1 – *P*%/100 ) ), which is relevant to the “one-hit” model (that is, infection processes where just one infected particle is sufficient to cause the response).
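To compare the transformations on the same data, the procedure can simply be called once per setting. The sketch below reuses the identifiers `Kill`, `Logdose` and `Total` from the Example section; the choice of `PRINT` settings is arbitrary.

```genstat
" Logit model "
PROBITANALYSIS [PRINT=summary,estimates,effectivedoses; TRANSFORMATION=logit] \
   Y=Kill; DOSE=Logdose; NBINOMIAL=Total

" Complementary log-log model "
PROBITANALYSIS [PRINT=summary,estimates,effectivedoses; \
   TRANSFORMATION=complementaryloglog] \
   Y=Kill; DOSE=Logdose; NBINOMIAL=Total
```

The residual deviances in the two summaries give a rough indication of which transformation describes the data better.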

Sometimes, subjects may respond even in the absence of any dose. For example, with some short-lived insects, some would have died simply from natural causes during the period of the experiment. By setting option `MORTALITY=estimate` this natural mortality can be included in the model and estimated. Similarly, there may be subjects that will not respond, no matter how high the dose. Setting option `IMMUNITY=estimate` will include and estimate a parameter for natural immunity.
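A common way of writing such a model is the Abbott-type form sketched below. This is an assumption made for illustration, since the exact parameterization used by the procedure is not stated here. The symbols are likewise introduced only for this sketch: $M$ is the proportion responding through natural mortality, $I$ the immune proportion, $x$ the dose, $\alpha$ and $\beta$ the intercept and slope, and $F$ the inverse of the chosen transformation (the Normal distribution function for probits).

$$
p(x) = M + (1 - M - I)\,F(\alpha + \beta x)
$$

Under this form the response probability rises from $M$ at very low doses to $1 - I$ at very high doses, and setting $M = I = 0$ recovers the ordinary probit line.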

It is also often of interest to study the way in which the model varies for different groups of subjects. For example, there may be groups of batches of subjects, each of which is given a different drug. The `GROUPS` option should then specify the group to which each batch of subjects belongs, and option `SEPARATE` indicates which parameters of the model (slope, mortality, and/or immunity) should have separate estimates. Separate parameters are always fitted for the intercept unless you include the setting `notintercept`. So, if `SEPARATE` is left at its default value, parallel lines will be fitted with identical values for any estimates of mortality and immunity.
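A typical analysis of parallelism fits the parallel-lines model first and then allows the slopes to differ. The sketch below reuses `Kill`, `Logdose`, `Total` and `Derris` from the Example section.

```genstat
" Parallel lines: separate intercepts, common slope and mortality "
PROBITANALYSIS [PRINT=summary,estimates; MORTALITY=estimate; GROUPS=Derris] \
   Y=Kill; DOSE=Logdose; NBINOMIAL=Total

" Separate slopes as well; mortality is still common to both groups "
PROBITANALYSIS [PRINT=summary,estimates; MORTALITY=estimate; GROUPS=Derris; \
   SEPARATE=slope] \
   Y=Kill; DOSE=Logdose; NBINOMIAL=Total
```

Comparing the residual deviances of the two fits indicates whether the extra slope parameter is worthwhile.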

The `LD` option can request the estimation of one or more effective (or lethal) doses, specifying a scalar if there is just one, or a variate if there are several. The `LOGBASE` option is useful if the doses have been transformed to logarithms before calling `PROBITANALYSIS`. If you use `LOGBASE` to specify the base of the logarithms (`ten` or `e`), the back-transformed lethal doses will be printed as well.

The estimates of the effective (or lethal) doses can be saved, in a variate, by the `LDESTIMATES` parameter. Also, when the model is fitted as a generalized nonlinear model (see the `FITMETHOD` option, below), the lower and upper values of the confidence intervals for the estimates can be saved by the `LDLOWER` and `LDUPPER` parameters, respectively. If `LOGBASE` is set, these are all back-transformed. The `CIPROBABILITY` option specifies the probability level for the confidence intervals; the default is 0.95, i.e. 95% confidence intervals.
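The options and parameters for effective doses might be combined as in the sketch below, which reuses the identifiers from the Example section. Two things are assumptions: that `Logdose` holds logarithms to base 10 (hence `LOGBASE=ten`), and the names `LDest`, `LDlow` and `LDupp`, which are hypothetical identifiers for the saved variates.

```genstat
" LD10 and LD90 in addition to LD50, with 90% confidence intervals "
PROBITANALYSIS [PRINT=effectivedoses; MORTALITY=estimate; GROUPS=Derris; \
   LD=!(10,90); CIPROBABILITY=0.9; LOGBASE=ten] \
   Y=Kill; DOSE=Logdose; NBINOMIAL=Total; \
   LDESTIMATES=LDest; LDLOWER=LDlow; LDUPPER=LDupp
PRINT LDest,LDlow,LDupp
```

Because `LOGBASE` is set, the saved values are on the original dose scale rather than the log scale. The limits are available here because the default fitting method for this kind of data is `generalizednonlinear`.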

The `DISPERSION` option can be used to request use of a heterogeneity factor in the calculation of the standard errors of the slopes and lethal doses (see Finney 1971, pages 70-74). The standard assumptions for probit analysis are that the observations have binomial distributions in probit lines and planes, or Poisson distributions in Wadley’s problem. Under these circumstances, the residual deviance will follow a Chi-square distribution. The residual deviance should on average be equal to its number of degrees of freedom. A significantly large value may indicate that there are other (possibly unknown) factors affecting the subjects, for example that the conditions were not uniform during the experiment. Alternatively it may occur because the subjects did not react independently, for example because there were sub-populations of genetically related individuals. If the large Chi-square seems to arise because the residuals are larger in general than expected (overdispersion) and not because of systematic deviations from the fitted relationship, it is sensible to increase the standard errors by a heterogeneity factor equal to the residual mean deviance. This can be requested by setting option `DISPERSION=*`. Alternatively `DISPERSION` can be set to a known value if one is available.
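The arithmetic behind the heterogeneity factor can be sketched as follows. The notation is introduced here only for illustration: $D$ is the residual deviance, $\nu$ its degrees of freedom, and $\hat\theta$ any estimated slope or lethal dose. The sketch assumes the usual formulation in which the factor multiplies variances, so that standard errors scale by its square root.

$$
\hat{h} = \frac{D}{\nu}, \qquad
\widehat{\operatorname{Var}}_{\mathrm{adj}}(\hat\theta) = \hat{h}\,\widehat{\operatorname{Var}}(\hat\theta), \qquad
\mathrm{s.e.}_{\mathrm{adj}}(\hat\theta) = \sqrt{\hat{h}}\;\mathrm{s.e.}(\hat\theta)
$$

For instance, a residual deviance of 24 on 12 degrees of freedom gives $\hat{h} = 2$, so each standard error would be multiplied by $\sqrt{2} \approx 1.41$.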

When the `FITMETHOD` option is set to `generalizednonlinear`, the model is fitted as a generalized nonlinear model, using the `FIT` directive. The alternative setting, `nonlinear`, fits it as a nonlinear model using `FITNONLINEAR`. Apart from minor numerical differences, the two methods should generate the same results. Generalized nonlinear models allow a confidence region to be generated for lethal doses, and so this method is the default for all situations except Wadley’s problem. The nonlinear method is more accurate, and is thus used as the default for the more difficult situation presented by Wadley’s problem. However, there is the limitation that you cannot use the `notintercept` setting of the `SEPARATE` option with the nonlinear method.
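The fitting method can be overridden explicitly, for example to check that both methods agree on a particular data set. The sketch below reuses the identifiers from the Example section. The `monitoring` setting and the larger `MAXCYCLE` value are arbitrary choices for illustration.

```genstat
" Force the nonlinear method and monitor the iterations "
PROBITANALYSIS [PRINT=estimates,monitoring,effectivedoses; FITMETHOD=nonlinear; \
   MAXCYCLE=50; MORTALITY=estimate; GROUPS=Derris] \
   Y=Kill; DOSE=Logdose; NBINOMIAL=Total
```

Note that `LDLOWER` and `LDUPPER` are not set here, as the confidence limits can be saved only with `FITMETHOD=generalizednonlinear`.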

The final two parameters, `INITIAL` and `STEPLENGTHS`, allow initial values and step lengths to be specified for the optimization. For a generalized nonlinear model, the order of parameters is: total(s) for Wadley’s problem (if appropriate), mortality parameters (if any) and immunity parameters (if any); the slopes and intercepts are fitted as regression parameters. For a nonlinear model, the order of parameters is: LD50(s), slope(s), mortality parameters (if any) and immunity parameters (if any); the totals for Wadley’s problem, if required, are fitted as linear parameters. The `MAXCYCLE` option sets a limit on the number of iterations used during fitting (default 30). Parameter estimates, fitted values, residuals, and so on, can be saved after running the procedure, by using the `RKEEP` directive in the usual way.
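Saving results with `RKEEP` might look like the sketch below, which reuses the identifiers from the Example section. The names `Est`, `StdErr`, `Fit` and `Res` are hypothetical identifiers chosen for the saved structures.

```genstat
" Fit the model without printing anything "
PROBITANALYSIS [PRINT=*; MORTALITY=estimate; GROUPS=Derris] \
   Y=Kill; DOSE=Logdose; NBINOMIAL=Total

" Save and display the estimates, standard errors, fitted values and residuals "
RKEEP ESTIMATES=Est; SE=StdErr; FITTEDVALUES=Fit; RESIDUALS=Res
PRINT Est,StdErr
PRINT Fit,Res
```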

Options: `PRINT`, `TRANSFORMATION`, `MORTALITY`, `IMMUNITY`, `GROUPS`, `SEPARATE`, `LD`, `CIPROBABILITY`, `LOGBASE`, `DISPERSION`, `FITMETHOD`, `MAXCYCLE`.

Parameters: `Y`, `DOSE`, `NBINOMIAL`, `INITIAL`, `STEPLENGTHS`, `LDESTIMATES`, `LDLOWER`, `LDUPPER`.

### Method

For `FITMETHOD=generalizednonlinear` a calculated link is used to take account of any mortality or immunity parameters, and a calculated distribution to allow estimation of totals for Wadley’s problem. The fitting is carried out by `FIT` (with the `CALCULATION` option set if any totals, mortality or immunity parameters are to be estimated), and procedure `FIELLER` is used to obtain LD values.

For `FITMETHOD=nonlinear` initial values are obtained, if necessary, using the Genstat facilities for generalized linear models, ignoring any mortality or immunity. Expressions specifying the model are defined in sets of nested `IF`-blocks, taking account of the settings of, for example, `TRANSFORMATION` and `GROUPS`. The fitting is carried out by the `FITNONLINEAR` directive, and any extra LD values are estimated using `RFUNCTION`.

### Action with `RESTRICT`

The `Y` variate, the `DOSE` variate, or the `GROUPS` factor can be restricted to indicate that the model is to be fitted only to a subset of the units.

### Reference

Finney, D.J. (1971). *Probit Analysis (third edition)*. Cambridge University Press, Cambridge.

### See also

Commands for: Regression analysis.

### Example

    CAPTION 'PROBITANALYSIS.',\
            !t('Data from Finney, Probit Analysis, 3rd Edition, pages 132-133.',\
            'Parallel lines are fitted to data from 2 different derris roots.',\
            'The insects (grain beetles) are subject to natural mortality;',\
            'a single (common) parameter is fitted for both roots.',\
            'The results differ slightly from those of Finney due to',\
            'the use here of maximum likelihood for the fitting,',\
            'rather than iterative weighted linear regression.'); STYLE=meta,plain
    VARIATE [VALUES=2.17,2.00,1.68,1.08,1.79,1.66,1.49,1.17,0.57, *] Logdose
    &       [VALUES= 142, 127, 128, 126, 125, 117, 127,  51, 132,129] Total
    &       [VALUES= 142, 126, 115,  58, 125, 115, 114,  40,  37, 21] Kill
    FACTOR  [LEVELS=2; LABELS=!t(W213,W214); VALUES=4(1),5(2),*] Derris
    PROBITANALYSIS [TRANS=probit; MORTALITY=estimate; GROUPS=Derris; LD=!(50,90)]\
       Kill; DOSE=Logdose; NBINOMIAL=Total