Calculates nonparametric estimates of species richness (D.A. Murray).

### Options

| `PRINT` = string token | Controls printed output (`summary`, `estimates`); default `summ`, `esti` |
|---|---|
| `GROUPS` = factor | Grouping factor for different samples |
| `NBOOT` = scalar | Number of bootstrap samples to take; default 100 |
| `SEED` = scalar | Seed for the random number generator; default 0 |

### Parameters

| `DATA` = variates, matrices or pointers | A variate containing the abundances of the species, or a pointer or matrix specifying the numbers of individuals of each species for the different sites/samples |
|---|---|
| `ESTIMATES` = variates or pointers | Saves the estimated species richness in a variate, or in a pointer if `GROUPS` is specified |
| `SE` = variates or pointers | Saves the analytic standard errors in a variate, or in a pointer if `GROUPS` is specified |
| `BSE` = variates or pointers | Saves the bootstrap standard errors in a variate, or in a pointer if `GROUPS` is specified |

### Description

Richness is a measure of the number of species within a sample. `ECNPESTIMATE` provides several nonparametric estimators of true species richness: Chao 1, Chao 2, ACE, ICE, the first-order jackknife, the second-order jackknife and the bootstrap. Chao 1 and ACE are based on the abundances within a sample, whereas the other estimators are incidence-based and use the frequencies of species across a set of samples. Standard errors are calculated from analytical results where possible. In addition, for multiple samples, standard errors are calculated by resampling with replacement.

The data can be supplied using the `DATA` parameter in one of three forms: as a matrix whose rows contain the numbers of individuals of each species and whose columns represent the different samples or sites; as a pointer to variates, each holding the numbers of individuals of each species in one sample; or, for a single sample/site, as a variate of the individual species numbers. The `GROUPS` option can supply a grouping factor to produce estimates for different groups. The estimates and standard errors can be saved using the `ESTIMATES`, `SE` (analytic standard errors) and `BSE` (bootstrap standard errors) parameters. If a grouping factor is supplied they are saved in a pointer to variates; otherwise they are saved in a variate.

The `PRINT` option controls printed output, with settings:

| `summary` | a summary of the data, |
|---|---|
| `estimates` | the species richness estimates and standard errors. |

The `NBOOT` option specifies how many bootstrap samples to take when calculating the bootstrap standard errors and confidence intervals (default 100). The probability level for the confidence interval can be set by the `CIPROBABILITY` option; the default is 0.95. The `SEED` option specifies the seed for the random number generator used to construct the bootstrap samples. The default value of zero continues an existing sequence of random numbers or, if the generator has not yet been used in this run of Genstat, initializes the generator automatically.

Options: `PRINT`, `GROUPS`, `NBOOT`, `SEED`.

Parameters: `DATA`, `ESTIMATES`, `SE`, `BSE`.

### Method

The Chao 1 estimator of the absolute number of species in an assemblage is calculated by:

$$S_{\text{Chao1}} = S_{obs} + \frac{F_1^2}{2 F_2}$$

where $S_{obs}$ is the number of species in the sample, $F_1$ is the number of observed species represented by a single individual (frequency of singletons), and $F_2$ is the number of species that have exactly two individuals (frequency of doubletons). The variance of the estimate is given by:

$$\operatorname{var}(S_{\text{Chao1}}) = F_2 \left\{ 0.5 \left(\frac{F_1}{F_2}\right)^2 + \left(\frac{F_1}{F_2}\right)^3 + 0.25 \left(\frac{F_1}{F_2}\right)^4 \right\}$$

When $F_2$ equals 0, the modified bias-corrected estimate is used:

$$S_{\text{Chao1}} = S_{obs} + \frac{F_1 (F_1 - 1)}{2}$$

and

$$\operatorname{var}(S_{\text{Chao1}}) = \frac{F_1 (F_1 - 1)}{2} + \frac{F_1 (2 F_1 - 1)^2}{4} - \frac{F_1^4}{4\, S_{\text{Chao1}}}$$
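The Chao 1 formulas above can be sketched in Python as follows. This is an illustrative sketch, not the code used by the procedure: the function name `chao1` and the list-of-counts input are assumptions made for the example, and species with zero individuals are treated as unobserved.

```python
from collections import Counter


def chao1(abundances):
    """Return (estimate, variance) of Chao 1 for one sample.

    `abundances` holds the number of individuals of each species
    (hypothetical input format chosen for this sketch).
    """
    counts = [n for n in abundances if n > 0]  # drop unobserved species
    s_obs = len(counts)
    if s_obs == 0:
        raise ValueError("at least one species must be observed")
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]  # singletons and doubletons
    if f2 > 0:
        ratio = f1 / f2
        estimate = s_obs + f1 ** 2 / (2 * f2)
        variance = f2 * (0.5 * ratio ** 2 + ratio ** 3 + 0.25 * ratio ** 4)
    else:
        # Modified bias-corrected form used when there are no doubletons.
        estimate = s_obs + f1 * (f1 - 1) / 2
        variance = (f1 * (f1 - 1) / 2
                    + f1 * (2 * f1 - 1) ** 2 / 4
                    - f1 ** 4 / (4 * estimate))
    return estimate, variance
```

For example, `chao1([1, 1, 1, 2, 2, 5, 10])` has three singletons and two doubletons among seven observed species, so the estimate is 7 + 9/4 = 9.25.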

The Chao 2 estimator is calculated by:

$$S_{\text{Chao2}} = S_{obs} + \frac{Q_1^2}{2 Q_2}$$

where $S_{obs}$ is the number of species observed in the samples, $Q_1$ is the number of species that occur in exactly one sample (uniques), and $Q_2$ is the number of species that occur in exactly two samples (duplicates). The variance of the estimate is given by:

$$\operatorname{var}(S_{\text{Chao2}}) = Q_2 \left\{ 0.5 \left(\frac{Q_1}{Q_2}\right)^2 + \left(\frac{Q_1}{Q_2}\right)^3 + 0.25 \left(\frac{Q_1}{Q_2}\right)^4 \right\}$$

When $Q_2$ equals 0, the modified bias-corrected estimate is used:

$$S_{\text{Chao2}} = S_{obs} + \frac{Q_1 (Q_1 - 1)}{2}$$

and

$$\operatorname{var}(S_{\text{Chao2}}) = \frac{H - 1}{H} \cdot \frac{Q_1 (Q_1 - 1)}{2} + \left(\frac{H - 1}{H}\right)^2 \frac{Q_1 (2 Q_1 - 1)^2}{4} - \left(\frac{H - 1}{H}\right)^2 \frac{Q_1^4}{4\, S_{\text{Chao2}}}$$

where $H$ is the total number of samples.
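As a rough illustration of the Chao 2 calculation, the sketch below works from a species-by-samples matrix of counts, matching the matrix layout described for `DATA`. The function name `chao2` and the nested-list input are assumptions for the example. The sketch also assumes that the last term of the variance for the case with no duplicates is subtracted, mirroring the corresponding Chao 1 expression.

```python
def chao2(matrix):
    """Return (estimate, variance) of Chao 2.

    `matrix[i][j]` is the count of species i in sample j
    (hypothetical layout: rows are species, columns are samples).
    """
    h = len(matrix[0])  # total number of samples
    # Number of samples in which each species occurs.
    incidence = [sum(1 for x in row if x > 0) for row in matrix]
    incidence = [k for k in incidence if k > 0]  # drop unobserved species
    s_obs = len(incidence)
    if s_obs == 0:
        raise ValueError("at least one species must be observed")
    q1 = incidence.count(1)  # uniques
    q2 = incidence.count(2)  # duplicates
    if q2 > 0:
        ratio = q1 / q2
        estimate = s_obs + q1 ** 2 / (2 * q2)
        variance = q2 * (0.5 * ratio ** 2 + ratio ** 3 + 0.25 * ratio ** 4)
    else:
        c = (h - 1) / h
        estimate = s_obs + q1 * (q1 - 1) / 2
        # Assumption: the final term is subtracted, as in the Chao 1 case.
        variance = (c * q1 * (q1 - 1) / 2
                    + c ** 2 * q1 * (2 * q1 - 1) ** 2 / 4
                    - c ** 2 * q1 ** 4 / (4 * estimate))
    return estimate, variance
```

With four species occurring in 1, 1, 2 and 3 of three samples, there are two uniques and one duplicate, so the estimate is 4 + 4/2 = 6.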

The first-order jackknife estimate is evaluated by:

$$S_{\text{jack1}} = S_{obs} + Q_1 \frac{H - 1}{H}$$

with variance

$$\operatorname{var}(S_{\text{jack1}}) = \frac{H - 1}{H} \left\{ \sum_{j=1}^{S} j^2 f_j - \frac{Q_1^2}{H} \right\}$$

where $S$ is the number of species, $Q_1$ is the number of species that occur in exactly one sample and $f_j$ is the number of samples with $j$ unique species.

The second-order jackknife estimate is given by:

$$S_{\text{jack2}} = S_{obs} + Q_1 \frac{2H - 3}{H} - Q_2 \frac{(H - 2)^2}{H (H - 1)}$$

where $Q_1$ is the number of species that occur in exactly one sample, and $Q_2$ is the number of species that occur in exactly two samples.
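The two jackknife estimates and the first-order variance can be sketched together, since they share the same incidence counts. The function name `jackknife` and the species-by-samples nested list are assumptions for this example. The sum of *j*² × *f_j* is computed as the sum over samples of the squared number of unique species in each sample, which is the same quantity.

```python
def jackknife(matrix):
    """Return (jack1, var_jack1, jack2) from a species-by-samples matrix.

    `matrix[i][j]` is the count of species i in sample j (hypothetical
    layout); at least two samples are needed for the second-order estimate.
    """
    h = len(matrix[0])
    if h < 2:
        raise ValueError("at least two samples are required")
    # Presence/absence rows for observed species only.
    present = [[x > 0 for x in row] for row in matrix
               if any(x > 0 for x in row)]
    s_obs = len(present)
    incidence = [sum(row) for row in present]
    q1 = incidence.count(1)  # uniques
    q2 = incidence.count(2)  # duplicates

    jack1 = s_obs + q1 * (h - 1) / h

    # Number of unique species found in each sample.
    uniques = [sum(1 for row, k in zip(present, incidence) if k == 1 and row[j])
               for j in range(h)]
    # Sum over j of j^2 * f_j equals the sum of squared per-sample counts.
    sum_sq = sum(u * u for u in uniques)
    var_jack1 = (h - 1) / h * (sum_sq - q1 ** 2 / h)

    jack2 = (s_obs + q1 * (2 * h - 3) / h
             - q2 * (h - 2) ** 2 / (h * (h - 1)))
    return jack1, var_jack1, jack2
```

For four species occurring in 1, 1, 2 and 3 of three samples, with the two uniques in different samples, this gives a first-order estimate of 4 + 2 × 2/3.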

The bootstrap estimate is calculated by:

$$S_{\text{boot}} = S_{obs} + \sum_{j=1}^{S} (1 - p_j)^H$$

where $p_j$ is the proportion of samples that contain species $j$. The variance is calculated using the method given in Smith & van Belle (1984).
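A minimal sketch of the bootstrap estimate follows, assuming that the proportion for species *j* is the fraction of the *H* samples in which that species occurs. The function name `bootstrap_richness` and the species-by-samples nested list are hypothetical; the variance calculation of Smith & van Belle (1984) is not reproduced here.

```python
def bootstrap_richness(matrix):
    """Bootstrap richness estimate from a species-by-samples matrix.

    `matrix[i][j]` is the count of species i in sample j (hypothetical layout).
    """
    h = len(matrix[0])
    # Proportion of samples containing each species.
    proportions = [sum(1 for x in row if x > 0) / h for row in matrix]
    proportions = [p for p in proportions if p > 0]  # observed species only
    s_obs = len(proportions)
    return s_obs + sum((1 - p) ** h for p in proportions)
```

A species present in every sample contributes nothing to the correction term, while rarely recorded species contribute most.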

The abundance-based coverage estimator (ACE) is given by:

$$S_{\text{ACE}} = S_{abund} + \frac{S_{rare}}{C_{ACE}} + \frac{F_1}{C_{ACE}} \gamma^2$$

where $S_{abund}$ is the number of abundant species (more than 10 individuals), $S_{rare}$ is the number of rare species (10 or fewer individuals), $F_1$ is the number of singletons,

$$C_{ACE} = 1 - \frac{F_1}{N_{rare}}$$

where $N_{rare}$ is the total number of individuals in rare species, and

$$\gamma^2 = \max\left\{ \frac{S_{rare}}{C_{ACE}} \cdot \frac{\sum_{i=1}^{10} i (i - 1) F_i}{N_{rare} (N_{rare} - 1)} - 1,\ 0 \right\}$$

The incidence-based coverage estimator (ICE) is given by:

$$S_{\text{ICE}} = S_{freq} + \frac{S_{infr}}{C_{ICE}} + \frac{Q_1}{C_{ICE}} \gamma^2$$

where $S_{freq}$ is the number of frequent species (found in more than 10 samples), $S_{infr}$ is the number of infrequent species (found in 10 or fewer samples), $Q_1$ is the number of uniques,

$$C_{ICE} = 1 - \frac{Q_1}{N_{infr}}$$

where $N_{infr}$ is the total number of occurrences of infrequent species, and

$$\gamma^2 = \max\left\{ \frac{S_{infr}}{C_{ICE}} \cdot \frac{M_{infr}}{M_{infr} - 1} \cdot \frac{\sum_{i=1}^{10} i (i - 1) Q_i}{N_{infr}^2} - 1,\ 0 \right\}$$

where $M_{infr}$ is the number of samples with at least one infrequent species.
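ACE and ICE share the same structure, so one helper can illustrate both. The sketch below is not the procedure's implementation: the names `coverage_estimate`, `ace` and `ice` are hypothetical, the cutoff of 10 is taken from the definitions above, and the `max{…}` expression is assumed to define the squared quantity γ² that enters the estimate.

```python
from collections import Counter


def coverage_estimate(counts, cutoff=10, m_infr=None):
    """Shared ACE/ICE calculation (hypothetical helper).

    `counts` holds abundances per species (ACE) or the number of samples
    occupied per species (ICE). Pass `m_infr` to use the ICE form of gamma^2.
    """
    counts = [k for k in counts if k > 0]
    s_common = sum(1 for k in counts if k > cutoff)
    rare = [k for k in counts if k <= cutoff]
    s_rare = len(rare)
    if s_rare == 0:
        return float(s_common)
    n_rare = sum(rare)
    freq = Counter(rare)
    f1 = freq[1]  # singletons (ACE) or uniques (ICE)
    coverage = 1 - f1 / n_rare
    if coverage == 0:
        raise ValueError("coverage is zero: every rare species is a singleton")
    weighted = sum(i * (i - 1) * freq[i] for i in range(1, cutoff + 1))
    if m_infr is None:
        ratio = weighted / (n_rare * (n_rare - 1))             # ACE form
    else:
        ratio = m_infr / (m_infr - 1) * weighted / n_rare ** 2  # ICE form
    gamma_sq = max(s_rare / coverage * ratio - 1, 0)
    return s_common + s_rare / coverage + f1 / coverage * gamma_sq


def ace(abundances):
    """ACE from the abundances of one sample."""
    return coverage_estimate(abundances)


def ice(matrix, cutoff=10):
    """ICE from a species-by-samples matrix of counts."""
    incidence = [sum(1 for x in row if x > 0) for row in matrix]
    infrequent = [row for row, k in zip(matrix, incidence) if 0 < k <= cutoff]
    # Samples containing at least one infrequent species.
    m_infr = sum(1 for j in range(len(matrix[0]))
                 if any(row[j] > 0 for row in infrequent))
    return coverage_estimate(incidence, cutoff=cutoff, m_infr=m_infr)
```

When every rare species is a singleton the coverage is zero and the estimate is undefined; the sketch raises an error in that case rather than guessing how the procedure handles it.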

The bootstrap standard errors are generated using the `BOOTSTRAP` procedure, sampling with replacement, and the species richness estimates are calculated from these samples.
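The resampling idea can be sketched as follows. This is only an illustration using Python's standard `random` module, not Genstat's `BOOTSTRAP` procedure or its random number generator: the names `bootstrap_se` and `jackknife1` are hypothetical, and the sketch assumes that whole samples (columns of a species-by-samples matrix) are resampled with replacement.

```python
import random
import statistics


def jackknife1(matrix):
    """First-order jackknife estimate from a species-by-samples matrix."""
    h = len(matrix[0])
    incidence = [sum(1 for x in row if x > 0) for row in matrix]
    s_obs = sum(1 for k in incidence if k > 0)
    q1 = incidence.count(1)
    return s_obs + q1 * (h - 1) / h


def bootstrap_se(matrix, estimator, nboot=100, seed=None):
    """Standard deviation of `estimator` over `nboot` resampled matrices."""
    rng = random.Random(seed)  # a fixed seed makes the result repeatable
    h = len(matrix[0])
    estimates = []
    for _ in range(nboot):
        # Draw h sample (column) indices with replacement.
        columns = [rng.randrange(h) for _ in range(h)]
        resampled = [[row[j] for j in columns] for row in matrix]
        estimates.append(estimator(resampled))
    return statistics.stdev(estimates)
```

Calling `bootstrap_se(data, jackknife1, nboot=100, seed=204029)` twice with the same seed returns the same value, which is the role the `SEED` option plays in making bootstrap results reproducible.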

### Action with `RESTRICT`

If the data are in a variate, the statistics are calculated using only those units included in the restriction. If the data are in a pointer or matrix, any restriction is ignored.

### References

Chao, A. (1987). Estimating the population size for capture-recapture data with unequal catchability. *Biometrics*, 43, 783-791.

Magurran, A.E. (2003). *Measuring Biological Diversity*. Blackwell, Oxford.

Smith, E.P. & van Belle, G. (1984). Nonparametric estimation of species richness. *Biometrics*, 40, 119-129.

### See also

Commands for Ecological data.

### Example

    CAPTION   'ECNPESTIMATE example',\
              'Data from Helshe & Forrester (1983), Biometrics, pages 1-19';\
              STYLE=meta,minor
    POINTER   [NVALUES=10] quad
    VARIATE   [VALUES=0,2,0,1,0,1,1,2,0,0,0,0,0,8] quad[1]
    VARIATE   [VALUES=13,2,1,0,0,1,0,0,1,0,0,0,0,36] quad[2]
    VARIATE   [VALUES=21,4,0,1,1,2,0,0,0,1,3,5,0,14] quad[3]
    VARIATE   [VALUES=14,4,0,2,2,1,0,0,0,0,0,1,0,19] quad[4]
    VARIATE   [VALUES=5,1,0,0,0,0,0,0,0,0,0,0,0,3] quad[5]
    VARIATE   [VALUES=22,1,0,6,0,1,0,0,0,0,0,2,0,22] quad[6]
    VARIATE   [VALUES=13,1,0,0,1,0,0,0,0,0,0,0,0,6] quad[7]
    VARIATE   [VALUES=4,0,1,0,0,0,0,0,0,0,0,0,1,8] quad[8]
    VARIATE   [VALUES=4,1,0,1,0,1,0,0,0,0,0,0,0,5] quad[9]
    VARIATE   [VALUES=27,6,0,2,1,5,0,0,0,0,2,3,0,41] quad[10]
    ECNPESTIMATE [SEED=204029] quad