Fits a standard nonlinear regression model.
Options
PRINT = string tokens | What to print (model, deviance, summary, estimates, correlations, fittedvalues, accumulated, monitoring); default mode, summ, esti |
---|---|
CURVE = string token | Type of curve (exponential, dexponential, cexponential, lexponential, logistic, glogistic, gompertz, ldl, qdl, qdq, fourier, dfourier, gaussian, dgaussian, emax, gemax); default expo |
SENSE = string token | Sense of curve (right, left); default righ |
ORIGIN = scalar | Constrained origin; default * |
NONLINEAR = string token | How to treat nonlinear parameters between groups (common, separate); default comm |
CONSTANT = string token | How to treat the constant (estimate, omit); default esti |
FACTORIAL = scalar | Limit for expansion of model terms; default as in previous TERMS statement, or 3 if no TERMS given |
POOL = string token | Whether to pool ss in accumulated summary between all terms fitted in a linear model (yes, no); default no |
DENOMINATOR = string token | Whether to base ratios in accumulated summary on rms from model with smallest residual ss or smallest residual ms (ss, ms); default ss |
NOMESSAGE = string tokens | Which warning messages to suppress (dispersion, leverage, residual, aliasing, marginality, vertical); default * |
FPROBABILITY = string token | Printing of probabilities for variance ratios (yes, no); default no |
SELECTION = string tokens | Statistics to be displayed in the summary of analysis produced by PRINT=summary (%variance, %ss, adjustedr2, r2, seobservations, dispersion, %cv, %meandeviance, %deviance, aic, bic, sic); default %var, seob |
Parameter
formula | Explanatory variate, list of variate and factor, or variate*factor |
---|---|
Description
FITCURVE provides a convenient way of fitting various standard curves. The response variate must be specified beforehand, using the MODEL directive in the usual way. The parameter of FITCURVE can be set just to a variate that supplies the x-values for the curve, if you simply want to fit a single curve. You can also include a factor if you want to fit separate curves for different groups of the observations: these are then parallel curves. The interaction between the variate and the factor can also be included, representing curves constrained to have common nonlinear parameters but separate linear parameters for each level of the factor. Finally, if the NONLINEAR option is set to separate and the interaction is included, separate curves are fitted for each level, with only the estimate of variability being pooled.
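The four model forms just described can be sketched as the statements below. This is an illustration only: the identifiers Yield (response), Dose (variate) and Group (factor) are hypothetical, and the exponential curve is chosen arbitrarily.

```genstat
" Hypothetical data: response Yield, variate Dose, factor Group."
MODEL Yield
" 1. A single curve for all the data."
FITCURVE [CURVE=exponential] Dose
" 2. Parallel curves for the groups."
FITCURVE [CURVE=exponential] Dose + Group
" 3. Common nonlinear parameters, separate linear parameters."
FITCURVE [CURVE=exponential] Dose * Group
" 4. Completely separate curves; only the variability is pooled."
FITCURVE [CURVE=exponential; NONLINEAR=separate] Dose * Group
```

Each statement refits the model from scratch, so the fits are best compared through their residual sums of squares and degrees of freedom.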
The CURVE option specifies which of the standard curves is to be fitted. For some of these, the SENSE option allows you to choose between alternative forms. Before describing the curves in detail, here is a list for convenient reference:
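exponential | $y = \alpha + \beta \rho^{x} + \varepsilon$ |
---|---|
dexponential | $y = \alpha + \beta \rho^{x} + \gamma \sigma^{x} + \varepsilon$ |
cexponential | $y = \alpha + (\beta + \gamma x) \rho^{x} + \varepsilon$ |
lexponential | $y = \alpha + \beta \rho^{x} + \gamma x + \varepsilon$ |
ldl | $y = \alpha + \beta / (1 + \delta x) + \varepsilon$ |
qdl | $y = \alpha + \beta / (1 + \delta x) + \lambda x + \varepsilon$ |
qdq | $y = \alpha + (\beta + \gamma x) / (1 + \delta x + \eta x^{2}) + \varepsilon$ |
fourier | $y = \alpha + \beta \sin(2\pi (x - \eta) / \omega) + \varepsilon$ |
dfourier | $y = \alpha + \beta \sin(2\pi (x - \eta) / \omega) + \gamma \sin(4\pi (x - \eta) / \omega) + \varepsilon$ |
gaussian | $y = \alpha + (\beta / \sqrt{2\pi\sigma^{2}}) \exp(-(x - \mu)^{2} / (2\sigma^{2})) + \varepsilon$ |
dgaussian | $y = \alpha + (\beta / \sqrt{2\pi\sigma^{2}}) \exp(-(x - \mu)^{2} / (2\sigma^{2})) + (\gamma / \sqrt{2\pi\sigma^{2}}) \exp(-(x - \nu)^{2} / (2\sigma^{2})) + \varepsilon$ |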
The four exponential curves each arise as solutions of linear ordinary differential equations. These represent processes that increase exponentially with time, for example, or that increase with a law of diminishing returns (that is, for which the rate of increase decreases with time).
The default setting of the CURVE option is exponential, corresponding to the “asymptotic regression” or Mitscherlich curve. The model has only one nonlinear parameter, ρ, which defines the rate of exponential increase or decrease. FITCURVE estimates the other parameters by linear regression at each stage of an iterative search for the best estimate of ρ. The values of the explanatory variate are automatically scaled to avoid any computational problems near the boundary of the allowed values of ρ. By default, ρ is restricted to the range 0<ρ<1; the alternative is ρ>1, which can be requested by setting the SENSE option to left: for all the exponential curves, SENSE=left corresponds to a curve whose asymptote is to the left, that is, as X decreases to -∞. If Genstat finds that a better fit is obtained by the opposite sense to the one specified, the sense is reversed and a warning is printed. The parameter α is the asymptote (to the right if ρ<1 and to the left if ρ>1); β is the range of the curve between the value at X=0 and the asymptote.
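The alternation between linear regression and a one-dimensional search can be written out as follows. This is a sketch of the general idea (conditionally linear least squares), assuming unweighted observations; the page does not state which search method Genstat uses, and the symbols $S(\rho)$ and $z_i$ are introduced here only for illustration.

```latex
% For a fixed rho the exponential curve is an ordinary straight-line
% regression of y on z = rho^x, so alpha and beta have closed forms.
\begin{aligned}
z_i &= \rho^{x_i}, \qquad
\hat\beta(\rho) = \frac{\sum_i (z_i - \bar z)(y_i - \bar y)}{\sum_i (z_i - \bar z)^2}, \qquad
\hat\alpha(\rho) = \bar y - \hat\beta(\rho)\,\bar z, \\
S(\rho) &= \sum_{i=1}^{n} \bigl(y_i - \hat\alpha(\rho) - \hat\beta(\rho)\,\rho^{x_i}\bigr)^2, \\
\hat\rho &= \arg\min_{0<\rho<1} S(\rho)
\quad \text{(or over } \rho > 1 \text{ when SENSE=left)}.
\end{aligned}
```

Only ρ is searched over, which is why the fit is cheaper than a general nonlinear optimization over all three parameters.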
The double exponential curve also has two forms: you can choose either 0<ρ<1 and 0<σ<1, or ρ>1 and σ>1, by using the SENSE option as for the exponential curve. The fitting process is unlikely to find a satisfactory solution for this curve unless there are enough data to estimate both components separately: there should be at least four points for which the fast component is larger than the slow component; the fast component corresponds to the smaller of ρ and σ when SENSE=right, or to the larger of ρ and σ when SENSE=left.
Two limiting cases of the double exponential are provided as special curves. The critical exponential curve can take a variety of shapes like the double exponential, whereas the line-plus-exponential curve is an exponential curve with a non-horizontal asymptote. Here again, the constraint on the parameter ρ depends on the setting of the SENSE option as for the exponential curve.
Another type of standard curve is sigmoid and monotonic, and is often used to model the growth of biological subjects. There are five types of these growth curves in Genstat, each a logistic of some sort. The first type is the generalized logistic without any constraints. In the equation for this curve, α is one asymptote, to the right or to the left according to whether β is positive or negative; μ is the point of inflexion for the explanatory variable; β is a slope parameter; τ is a power-law parameter; and α+γ is the other asymptote. To fit this curve you need data for the steep central part and for both flat parts.
There are two special cases of the generalized logistic. The ordinary logistic curve is sometimes known as the autocatalytic or inverse exponential curve. The same curve can be rewritten in several different forms, so you should be alert for concealed equivalences of apparently different curves: otherwise you might be tempted to use FITNONLINEAR, which would be less efficient. The other special case is the Gompertz curve. It is non-symmetrical about the inflexion, X=μ, and has asymptotes at Y=α and Y=α+γ.
You can fit these three growth curves to data in which Y decreases as X increases. For the logistic and generalized logistic curves, you are not allowed to constrain the sense of the curve by the SENSE option. This is because the sense depends on both the parameters β and γ. In fact, the logistic curve with parameters α, β, γ and μ is the same as the logistic curve with parameters (α+γ), -β, -γ and μ; Genstat will report only one of the two possible versions. For the Gompertz curve, you can set SENSE=left to specify the upside-down Gompertz curve corresponding to γ<0; otherwise γ is constrained to be positive. When the sign of γ is changed for a response Y that increases with X, the sign of β will also change so that the curve remains an ascending one, and similarly for descending curves. The interpretation of SENSE=left thus depends on the shape of the data; for ascending curves it means that the asymptote is reached more slowly to the left than to the right, but for descending curves it means the opposite.
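The equivalence of the two logistic parameter sets can be checked directly. The logistic equation is not reproduced in the reference list on this page, so the derivation below assumes the commonly used form $y = \alpha + \gamma / (1 + e^{-\beta (x - \mu)})$; treat that form as an assumption rather than a quotation.

```latex
% Substitute (alpha + gamma, -beta, -gamma, mu) into the assumed logistic form
% and simplify back to the form with (alpha, beta, gamma, mu).
\begin{aligned}
(\alpha + \gamma) + \frac{-\gamma}{1 + e^{\beta (x - \mu)}}
  &= \alpha + \gamma \left( 1 - \frac{1}{1 + e^{\beta (x - \mu)}} \right) \\
  &= \alpha + \gamma \, \frac{e^{\beta (x - \mu)}}{1 + e^{\beta (x - \mu)}} \\
  &= \alpha + \frac{\gamma}{1 + e^{-\beta (x - \mu)}} .
\end{aligned}
```

Both parameter sets therefore describe the same curve, which is why a sense constraint on β or γ alone would not determine whether the curve ascends or descends.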
The final two sigmoid curves, Emax and generalized Emax, are similar to the logistic and generalized logistic except that their equations involve log(x) instead of x. They are usually used to model decreasing relationships with the parameter β in the equation negative, but Genstat will allow increasing relationships with these curves too.
The three rational functions are ratios of polynomials. The linear-divided-by-linear curve is a rectangular hyperbola, which occurs for example as the Michaelis-Menten law of chemical kinetics. The quadratic-divided-by-linear curve is a hyperbola with a non-horizontal asymptote. The quadratic-divided-by-quadratic curve is a cubic curve having an asymmetric maximum falling to an asymptote. The SENSE option is ignored for all three rational functions.
Fourier curves are trigonometric functions, involving the sine function in Genstat’s implementation, used to model periodic behaviour. Sometimes the wavelength or period ω is a known constant, such as 2π radians (or 360 degrees), 24 hours or 12 months; the models are then linear and should be fitted by linear regression using the FIT directive, instead of by FITCURVE. The parameters β and γ are the amplitudes of the components of the curve. The SENSE option is ignored for Fourier curves.
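When the period is known, the single Fourier curve becomes linear because β sin(2π(x - η)/ω) expands into a sine term plus a cosine term in x, each with an unknown coefficient. A minimal sketch of that linear fit follows, assuming a hypothetical response Temp recorded against a variate Hour with a known 24-hour period; all identifiers are illustrative.

```genstat
" Hypothetical data: response Temp, time variate Hour, known period 24 hours."
CALCULATE Angle = 2 * 3.14159265 * Hour / 24
CALCULATE SinH = SIN(Angle)
CALCULATE CosH = COS(Angle)
MODEL Temp
" The model is linear in the coefficients of SinH and CosH."
FIT [PRINT=model,summary,estimates] SinH + CosH
```

If the fitted coefficients of SinH and CosH are s and c, the amplitude β of the original parameterization is the square root of s² + c².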
The Gaussian curve is a bell-shaped curve like the Normal probability density. The double Gaussian is a sum of two overlapping curves of this type, and arises for example in spectrography. The parameter α is usually called the background, and the parameters μ and ν are the peaks. The parameter σ is the standard deviation: for the double Gaussian, FITCURVE can deal only with the case of equal standard deviation for the two components. The parameters β and γ represent the strength of a spectrographic signal in each component, excluding the background. The SENSE option is ignored for Gaussian curves.
The PRINT, FACTORIAL, POOL, DENOMINATOR, NOMESSAGE and FPROBABILITY options are as for the FIT directive, and SELECTION differs only because FITCURVE caters only for the Normal distribution.
You can constrain the exponential and rational curves to pass through a given point. The ORIGIN option specifies a value for the response variate corresponding to a zero value of the explanatory variate; to specify the response for another value of the explanatory variate you would need to modify the explanatory variate beforehand.
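A sketch of both cases described above, using a hypothetical response Growth and variate Day (the identifiers and the numerical values are illustrative assumptions):

```genstat
" Hypothetical data: response Growth, variate Day."
MODEL Growth
" Force the fitted curve through Growth = 0 at Day = 0."
FITCURVE [CURVE=exponential; ORIGIN=0] Day
" To force the curve through Growth = 5 at Day = 10 instead, shift the"
" explanatory variate so that the required point lies at zero."
CALCULATE DayShift = Day - 10
FITCURVE [CURVE=exponential; ORIGIN=5] DayShift
```

Estimates from the second fit refer to the shifted variate, so anything defined at X=0 (such as β for the exponential curve) is then measured at Day = 10.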
Another way of constraining the curves is by setting the CONSTANT option to omit the constant term. This parameter represents an asymptote of each curve. To constrain the asymptote to be other than 0, you should put the value that you require into every element of the variate in the OFFSET option of the MODEL directive. The constant cannot be omitted from the Gompertz curve fitted with SENSE=left, nor (for any curve) if the origin is constrained, nor if parallel curves are fitted.
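As an illustration of fixing the asymptote at a nonzero value, the sketch below constrains it to 100 by combining an offset with an omitted constant. The identifiers Conc, Time and Asym and the value 100 are hypothetical.

```genstat
" Hypothetical data: response Conc, variate Time; required asymptote 100."
" Build an offset variate holding 100 in every unit."
CALCULATE Asym = 0 * Time + 100
MODEL [OFFSET=Asym] Conc
FITCURVE [CURVE=exponential; CONSTANT=omit] Time
```

The fitted model then has the offset in place of the estimated constant α.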
You can use the WEIGHTS option of the MODEL directive to supply a variate of weights for the units. You can also supply a symmetric matrix of weights, for example to allow for covariances between units. However, if the model contains an explanatory factor, pairs of units with different factor levels must have zero covariances.
You can modify a model fitted by FITCURVE by using the ADD, DROP or SWITCH directives as for models fitted by the FIT directive, provided the alterations produce a model that would be allowed in FITCURVE: that is, it must contain one variate, or one variate and one factor, or one variate and one factor and their interaction. The NONLINEAR options of the ADD, DROP and SWITCH directives have the same effect as the NONLINEAR option of FITCURVE. Thus you can compare curves between groups of a factor, assessing for example whether they are parallel. The accumulated setting of the PRINT option of these directives allows you to summarize the results. The RDISPLAY directive can be used to display further output following FITCURVE, and results can be copied into Genstat data structures using the RKEEP directive.
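The sequence described above might look like the following sketch. The identifiers Density, Logdose, Sample, Est, SeEst and Fitted are hypothetical, and the RKEEP parameter names are assumed to be the same as those used after ordinary regression.

```genstat
" Hypothetical data: response Density, variate Logdose, factor Sample."
MODEL Density
FITCURVE [PRINT=model,summary,estimates; CURVE=logistic] Logdose
" Parallel curves: add the factor."
ADD [PRINT=model,summary,estimates] Sample
" Separate linear parameters: add the interaction, and summarize the sequence."
ADD [PRINT=model,summary,estimates,accumulated] Logdose.Sample
" Further output and saved results for the last model fitted."
RDISPLAY [PRINT=correlations]
RKEEP ESTIMATES=Est; SE=SeEst; FITTEDVALUES=Fitted
```

The accumulated summary from the last ADD shows the change in fit at each step, which is the basis for judging whether parallel curves are adequate.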
Options: PRINT, CURVE, SENSE, ORIGIN, NONLINEAR, CONSTANT, FACTORIAL, POOL, DENOMINATOR, NOMESSAGE, FPROBABILITY, SELECTION.
Parameter: unnamed.
Action with RESTRICT
You can restrict the units that Genstat will use for fitting the curve by putting a restriction on the response or offset variates (defined by the MODEL directive), or on the explanatory variate or factor in the FITCURVE statement. However, you are not allowed to have different restrictions on the different vectors.
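A minimal sketch of fitting to a subset of units, assuming a hypothetical response Yield and variate Dose, with the restriction placed on the response only:

```genstat
" Hypothetical data: response Yield, variate Dose."
" Use only the units with Dose greater than zero."
RESTRICT Yield; CONDITION=Dose > 0
MODEL Yield
FITCURVE [CURVE=exponential] Dose
" Remove the restriction afterwards."
RESTRICT Yield
```

Restricting Dose as well would be acceptable only if it used the same condition, since different restrictions on the different vectors are not allowed.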
See also
Directives: MODEL, TERMS, RDISPLAY, RKEEP, RKESTIMATES, RCYCLE, RFUNCTION, ADD, DROP, SWITCH, FIT, FITNONLINEAR.
Procedures: RCHECK, RGRAPH, RDESTIMATES, MICHAELISMENTEN, NLAR1, HGNONLINEAR, RQNONLINEAR, RQUADRATIC, RSCHNUTE, DFUNCTION.
Commands for: Regression analysis.
Example
" Example FITC-1: Fit an exponential curve Sugar beet does not grow well if the soil is deficient in phosphate. An experiment with 16 plots having a range of soil phosphate levels provides measurements of weight of beet, %sugar in the beet and soil phosphate level." FILEREAD [NAME='%gendir%/examples/FITC-1.DAT'] Beetwt,%sugar,SoilP " Calculate the yield of sugar, and draw a graph." CALCULATE Sugar = Beetwt * %sugar / 100 DGRAPH [WINDOW=3; KEYWINDOW=0] Sugar; SoilP " Fit an exponential model relating sugar yield to phosphate level. The model is Sugar = A + B * R**Phosphate which is equivalent to Sugar = A + B * EXP(K * Phosphate) with R = EXP(K). The parameterization with R is used because it is more stable." MODEL Sugar FITCURVE [CURVE=exponential; PRINT=model,estimates] SoilP " Display a summary of the model." RDISPLAY [PRINT=summary] " The first four units are very influential: without them, there is little information about the curvature of the model." " Display the fitted curve." RGRAPH [TITLE='Saxmundham RII 1969'; GRAPHICS=high]] " Check whether there is an additional trend with phosphate." FITCURVE [CURVE=lexponential] SoilP " The extra parameter, C, is very small and certainly not statistically significant. This result suggests that no extra phosphate fertilizer need be applied for sugar beet if soil phosphorus is above about 15 ppm, in the weather conditions prevailing in Suffolk in 1969."