Calculates QTL effects in multi-environment trials or multiple populations (M.P. Boer, M. Malosetti, S.J. Welham & J.T.N.M. Thissen).

### Options

`PRINT` = string tokens | What to print (`summary`, `model`, `components`, `effects`, `means`, `stratumvariances`, `monitoring`, `vcovariance`, `deviance`, `Waldtests`, `missingvalues`, `covariancemodels`); default `summ` |
---|---|
`POPULATIONTYPE` = string token | Type of population (`BC1`, `DH1`, `F2`, `RIL`, `BCxSy`, `CP`); must be set |
`NGENERATIONS` = scalar | Number of generations of selfing for a `RIL` population |
`NBACKCROSSES` = scalar | Number of backcrosses for a `BCxSy` population |
`NSELFINGS` = scalar | Number of selfings for a `BCxSy` population |
`VCMODEL` = string token | Specifies the variance-covariance model for the set of environments or populations (`identity`, `diagonal`, `cs`, `hcs`, `outside`, `fa`, `fa2`, `unstructured`); default `cs` for multi-environment trials, and `diagonal` for multiple populations |
`VCPARAMETERS` = string token | Whether to re-estimate the variance-covariance model parameters (`estimate`, `fix`); default `esti` |
`VCSELECT` = string token | Whether to re-select the variance-covariance model (`no`, `yes`); default `no` |
`CRITERION` = string token | Criterion to use for model selection (`aic`, `sic`); default `sic` |
`FIXED` = formula | Defines extra fixed effects |
`UNITFACTOR` = factor | Saves the units factor required to define the random model when `UNITERROR` is to be used |
`MVINCLUDE` = string tokens | Whether to include units with missing values in the explanatory factors and variates and/or the y-variates (`explanatory`, `yvariate`); default `expl`, `yvar` |
`MAXCYCLE` = scalar | Limit on the number of iterations; default 100 |
`WORKSPACE` = scalar | Number of blocks of internal memory to be set up for use by the `REML` algorithm; default 100 |

### Parameters

`TRAIT` = variates | Quantitative trait to be analysed; must be set |
---|---|
`GENOTYPES` = factors | Genotype factor; must be set |
`ENVIRONMENTS` = factors | Environment factor; must be set for a multi-environment trial |
`POPULATIONS` = factors | Population factor; must be set for a multiple-population analysis |
`UNITERROR` = variates | Uncertainty on trait means (derived from individual unit or plot error) to be included in QTL analysis; default `*` i.e. omitted |
`VCINITIAL` = pointers | Initial values for the parameters of the variance-covariance model |
`SELECTEDMODEL` = texts | `VCMODEL` setting for the selected covariance structure |
`ADDITIVEPREDICTORS` = pointers | Additive genetic predictors; must be set |
`ADD2PREDICTORS` = pointers | Second (paternal) set of additive genetic predictors |
`DOMINANCEPREDICTORS` = pointers | Dominance genetic predictors |
`CHROMOSOMES` = factors | Chromosomes corresponding to the genetic predictors; must be set |
`POSITIONS` = variates | Positions on the chromosomes corresponding to the genetic predictors; must be set |
`IDLOCI` = texts | Labels for the loci; must be set |
`MKLOCI` = variates | Logical variate containing the value 1 if the locus is a marker, otherwise 0; must be set |
`IDMGENOTYPES` = texts | Labels for the genotypes corresponding to the genetic predictors |
`IDPARENTS` = texts | Labels to identify the parents |
`QTLSELECTED` = variates | Index numbers of the selected QTLs; must be set |
`INTERACTIONS` = variates | Logical variate indicating whether each selected QTL has a significant (1) or non-significant (0) QTL-by-environment or QTL-by-population interaction |
`DOMSELECTED` = variates | Logical variate indicating whether the dominance predictor of each selected QTL must be present (1) or absent (0) in the model |
`DOMINTERACTIONS` = variates | Logical variate indicating whether the dominance-by-environment or dominance-by-population interaction of each selected QTL must be present (1) or absent (0) in the model |
`RESIDUALS` = variates | Residuals from the analysis |
`FITTEDVALUES` = variates | Fitted values from the analysis |
`WALDSTATISTICS` = variates | Saves the Wald test statistics |
`PRWALD` = variates | Saves the associated Wald probabilities |
`DFWALD` = variates | Saves the degrees of freedom for the Wald test |
`QEFFECTS` = pointers | Saves the estimated QTL effects |
`QSE` = pointers | Saves the standard errors of the QTL effects |
`OUTFILENAME` = texts | Name of the Genstat workbook file (`*.gwb`) to be created |
`QSAVE` = pointers | Saves a pointer with information and results for the significant effects |
`SAVE` = `REML` save structures | Saves the details of each `REML` analysis for use in subsequent `VDISPLAY` and `VKEEP` directives |

### Description

`QMESTIMATE` fits a final QTL model to estimate QTL effects in a multi-environment trial or for multiple populations. The procedure uses means per genotype-environment or genotype-population combinations as phenotypic data, but weights can be attached to the means (see the `UNITERROR` parameter and the `UNITFACTOR` option below). The response variable must be specified by the `TRAIT` parameter, and the corresponding environment and genotype factors must be specified by the `ENVIRONMENTS` and `GENOTYPES` parameters, respectively. The `POPULATIONTYPE` option must be set to specify the population from which the genotypes are derived. For recombinant inbred lines (`POPULATIONTYPE=RIL`), the `NGENERATIONS` option must be set to supply the number of generations of selfing. For backcross inbred lines (`POPULATIONTYPE=BCxSy`), the `NBACKCROSSES` and `NSELFINGS` options must be set to define the number of backcrosses to the first parent and the number of selfings, respectively. For a multiple-population analysis, the `POPULATIONS` parameter should be set (to a factor) instead of `ENVIRONMENTS`.
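As a rough sketch of the population-type settings described above, the two calls below show how a `RIL` and a `BCxSy` analysis might be specified. All identifiers (`yld`, `E`, `G`, `mkchr`, `mkpos`, `marker`, `idlocus`, `addpred`, `Qid`) and the numbers of generations, backcrosses and selfings are hypothetical placeholders, not values from a real data set.

```genstat
" Hypothetical identifiers and settings; adapt them to your own data "
" Recombinant inbred lines: NGENERATIONS must be supplied "
QMESTIMATE [POPULATIONTYPE=RIL; NGENERATIONS=6]\
           TRAIT=yld; ENVIRONMENTS=E; GENOTYPES=G;\
           CHROMOSOMES=mkchr; POSITIONS=mkpos; MKLOCI=marker;\
           IDLOCI=idlocus; ADDITIVEPREDICTORS=addpred;\
           QTLSELECTED=Qid
" Backcross inbred lines: NBACKCROSSES and NSELFINGS must both be supplied "
QMESTIMATE [POPULATIONTYPE=BCxSy; NBACKCROSSES=2; NSELFINGS=3]\
           TRAIT=yld; ENVIRONMENTS=E; GENOTYPES=G;\
           CHROMOSOMES=mkchr; POSITIONS=mkpos; MKLOCI=marker;\
           IDLOCI=idlocus; ADDITIVEPREDICTORS=addpred;\
           QTLSELECTED=Qid
```

For a multiple-population analysis, `ENVIRONMENTS=E` would be replaced by `POPULATIONS=` followed by the population factor.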

Molecular information must be provided in the form of additive genetic predictors stored in variates and supplied, in a pointer, by the `ADDITIVEPREDICTORS` parameter. Non-additive effects can be included in the model by specifying dominance genetic predictors using the `DOMINANCEPREDICTORS` parameter (e.g. in an F2 population). In the case of segregating F1 populations (outbreeders), two sets of additive genetic predictors must be specified: the maternal ones by the `ADDITIVEPREDICTORS` parameter, and the paternal ones by the `ADD2PREDICTORS` parameter. The corresponding map information for the genetic predictors must be given by the `CHROMOSOMES` and `POSITIONS` parameters. The labels for the loci must be supplied by the `IDLOCI` parameter, and the labels for the genotypes in the marker data can be supplied by the `IDMGENOTYPES` parameter. If `IDMGENOTYPES` is set, the match between the genotypes in the phenotypic and in the marker data will be checked. The `IDPARENTS` parameter can supply labels to identify the parents.
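A minimal sketch of a call for a segregating F1 (outbreeder) population, assuming `POPULATIONTYPE=CP` and separate maternal and paternal predictor sets as described above. The identifiers `addmat`, `addpat`, `dompred`, `idgeno` and the others are hypothetical names for structures that would have to be created beforehand.

```genstat
" Hypothetical identifiers: addmat, addpat and dompred are pointers to variates "
QMESTIMATE [POPULATIONTYPE=CP]\
           TRAIT=yld; ENVIRONMENTS=E; GENOTYPES=G;\
           CHROMOSOMES=mkchr; POSITIONS=mkpos; MKLOCI=marker;\
           IDLOCI=idlocus; IDMGENOTYPES=idgeno;\
           ADDITIVEPREDICTORS=addmat; ADD2PREDICTORS=addpat;\
           DOMINANCEPREDICTORS=dompred;\
           QTLSELECTED=Qid
```

Setting `IDMGENOTYPES` here is optional; it is included so that the genotype labels in the phenotypic and marker data are checked against each other.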

The QTL model assumes `ENVIRONMENTS` (or `POPULATIONS`) and QTLs as fixed terms, and `GENOTYPES` as a random term. The `QTLSELECTED` parameter must specify the set of QTLs, in the form of a variate containing the index number of the positions where the QTLs are located. The `INTERACTIONS` parameter supplies a logical variate containing zero if a QTL effect is constrained to be constant across environments (or populations), and one if it is specific for each environment (or population). When the `DOMINANCEPREDICTORS` parameter is set, the `DOMSELECTED` parameter supplies a logical variate containing one if the dominance predictor of the corresponding marker must be present in the model, and zero if the dominance predictor of the corresponding marker must be absent in the model. If `DOMINANCEPREDICTORS` is set but `DOMSELECTED` is not set, all the dominance predictors are included. Similarly, the `DOMINTERACTIONS` parameter supplies a logical variate containing one if the dominance-by-environment (or dominance-by-population) interaction of the corresponding marker must be present in the model, and zero if it must be absent. If `DOMINANCEPREDICTORS` is set but `DOMINTERACTIONS` is not set, all the dominance predictors are included.
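To make the coding of these variates concrete, here is a small sketch for three selected QTLs. The index numbers and the 0/1 patterns are purely illustrative assumptions; each variate has one value per selected QTL, in the same order as `QTLSELECTED`.

```genstat
" Hypothetical settings for three selected QTLs "
" Index numbers of the positions where the QTLs are located "
VARIATE [VALUES=19,41,237] Qid
" QTL 2 is constrained to a constant effect across environments "
VARIATE [VALUES=1,0,1] Int
" No dominance predictor in the model for QTL 3 "
VARIATE [VALUES=1,1,0] Dom
" Dominance-by-environment interaction for QTL 1 only "
VARIATE [VALUES=1,0,0] DomInt
```

These variates would then be passed as `QTLSELECTED=Qid; INTERACTIONS=Int; DOMSELECTED=Dom; DOMINTERACTIONS=DomInt`.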

Extra fixed effects can be defined by the `FIXED` option. A multi-Normal distribution, with mean vector 0 and variance-covariance matrix Σ, is assumed for the random genetic effects in the different environments (or populations). The `VCMODEL` option defines the model to use for Σ. The default for multi-environment trials assumes compound symmetry (for multiple populations the default is `diagonal`), but the `VGESELECT` procedure can be used to assess what model would be most suitable. Initial values for the parameters in the variance-covariance model can be specified by the `VCINITIAL` parameter. The `VCPARAMETERS` option controls whether the variance-covariance parameters are re-estimated at each step of the backward selection (`VCPARAMETERS=estimate`), or whether they are fixed at the defined initial values (`VCPARAMETERS=fix`). The `VCSELECT` option defines whether an extra check is made at each step on the variance-covariance model, to assess whether a simpler model is more suitable than the current model (based on the criterion defined by the `CRITERION` option). The `SELECTEDMODEL` parameter stores the final variance-covariance model that is selected.
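The following sketch shows how the covariance-model options might be combined: it starts from an unstructured model, asks for the extra check on simpler models using AIC, and saves the selected setting. The identifiers, including `chosen`, are hypothetical, and the choice of `unstructured` as the starting model is only an assumption for illustration.

```genstat
" Hypothetical identifiers; the starting model is an illustrative choice "
QMESTIMATE [POPULATIONTYPE=F2; VCMODEL=unstructured; VCSELECT=yes; CRITERION=aic]\
           TRAIT=yld; ENVIRONMENTS=E; GENOTYPES=G;\
           CHROMOSOMES=mkchr; POSITIONS=mkpos; MKLOCI=marker;\
           IDLOCI=idlocus; ADDITIVEPREDICTORS=addpred;\
           QTLSELECTED=Qid; SELECTEDMODEL=chosen
" Show the VCMODEL setting that was finally selected "
PRINT chosen
```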

The `MVINCLUDE`, `MAXCYCLE` and `WORKSPACE` options operate in the same way as these options of the `REML` directive. The `UNITERROR` parameter allows uncertainty on the trait means (derived from individual unit or plot error) to be specified to include in the random model; by default this is omitted. The `UNITFACTOR` option allows the factor that is needed to define the unit-error term to be saved (this would be needed, for example, to save information later about the term using `VKEEP`).
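A minimal sketch of including unit error, assuming a variate `uerr` that holds the uncertainty attached to each trait mean. The names `uerr`, `ufac` and `rsave` are hypothetical, as are the other identifiers.

```genstat
" Hypothetical identifiers: uerr holds the uncertainty on each trait mean "
QMESTIMATE [POPULATIONTYPE=F2; UNITFACTOR=ufac]\
           TRAIT=yld; ENVIRONMENTS=E; GENOTYPES=G; UNITERROR=uerr;\
           CHROMOSOMES=mkchr; POSITIONS=mkpos; MKLOCI=marker;\
           IDLOCI=idlocus; ADDITIVEPREDICTORS=addpred;\
           QTLSELECTED=Qid; SAVE=rsave
```

After the call, `ufac` would be available to refer to the unit-error term, and `rsave` to identify the analysis, in later `VKEEP` or `VDISPLAY` statements.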

The `PRINT` option specifies the output to be displayed. The `summary` setting prints the information about the QTLs retained in the model, and the other settings correspond to those in the `PRINT` option of the `REML` directive.

The QTL effects and their standard errors can be saved, in pointers, by the `QEFFECTS` and `QSE` parameters, respectively. These pointers have two levels of suffixes: the first level has 1, 2 or 3 values, depending on which of the three possible predictor parameters (`ADDITIVEPREDICTORS`, `ADD2PREDICTORS` and `DOMINANCEPREDICTORS`) are set; the second level has as many values as the number of levels of the `ENVIRONMENTS` (or `POPULATIONS`) factor. The fitted values and residuals can be saved by the `FITTEDVALUES` and `RESIDUALS` parameters. The Wald statistics, degrees of freedom and probabilities can be saved by the parameters `WALDSTATISTICS`, `DFWALD` and `PRWALD`, respectively.
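As a sketch of how the two-level pointers might be used, the statements below print the effects and standard errors held under the first predictor suffix for the first environment. It assumes that `Qeff` and `Qse` were saved by `QEFFECTS=Qeff; QSE=Qse`, and that the first suffix 1 refers to the additive predictors; that ordering is an assumption, not something stated above.

```genstat
" Assumes an earlier QMESTIMATE call with QEFFECTS=Qeff; QSE=Qse "
" First suffix: predictor set (1 assumed to be the additive predictors) "
" Second suffix: level of the ENVIRONMENTS (or POPULATIONS) factor "
PRINT Qeff[1][1],Qse[1][1]
```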

The `OUTFILENAME` parameter can be used to save the Wald statistics and the `QEFFECTS` and `QSE` structures in a Genstat workbook file, in a sheet named `STATISTICS`. The file name should be given without an extension, as the extension `.gwb` is added automatically.

The `QSAVE` parameter can be used to save a pointer containing information and results for the significant QTLs. The elements of the pointer are labelled as follows to simplify their subsequent use:

`'procedure'` | stores the string `'QMESTIMATE'` to indicate the source of the results, |
---|---|
`'trait'` | trait, |
`'markernames'` | marker names, |
`'chromosomes'` | chromosomes, |
`'positions'` | positions, |
`'envnames'` | names of the environments (or populations), |
`'waldstatistics'` | Wald statistics, |
`'prwald'` | probability values of the Wald statistics, |
`'dfwald'` | degrees of freedom of the Wald statistics, |
`'qeffects'` | QTL effects, |
`'qse'` | standard errors of the QTL effects, |
`'%vexplained'` | percentage variance explained, |
`'lowerci'` | lower bound of confidence interval of estimated QTL position, |
`'upperci'` | upper bound of confidence interval of estimated QTL position, |
`'posmin'` | position of left flanking marker, |
`'posmax'` | position of right flanking marker, |
`'idlfm'` | marker name of left flanking marker, |
`'idrfm'` | marker name of right flanking marker, |
`'posminci'` | position of left flanking marker outside confidence interval, |
`'posmaxci'` | position of right flanking marker outside confidence interval, |
`'idlfmci'` | marker name of left flanking marker outside confidence interval, |
`'idrfmci'` | marker name of right flanking marker outside confidence interval, |
`'locus'` | index numbers of the significant QTLs, and |
`'neff'` | number of additive and dominance predictors in the model. |

The elements `'procedure'`, `'trait'`, `'markernames'`, `'chromosomes'`, `'envnames'`, `'idlfm'`, `'idrfm'`, `'idlfmci'` and `'idrfmci'` are text structures; `'positions'`, `'waldstatistics'`, `'prwald'` and `'dfwald'` are variates; `'qeffects'` and `'qse'` are pointers (see parameters `QEFFECTS` and `QSE`), as are `'lowerci'`, `'upperci'`, `'posmin'`, `'posmax'`, `'posminci'`, `'posmaxci'`, `'idlfmci'` and `'idrfmci'`; `'neff'` is a scalar.
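Because the elements are labelled, they can be picked out of the saved pointer by name. The sketch below assumes a pointer called `Output` saved with `QSAVE=Output` (a hypothetical name), and that the three Wald variates have the same length so that they can be printed side by side.

```genstat
" Assumes an earlier QMESTIMATE call with QSAVE=Output "
PRINT Output['trait']
" The Wald results are variates, assumed here to be of equal length "
PRINT Output['waldstatistics'],Output['dfwald'],Output['prwald']
" Number of additive and dominance predictors in the model (a scalar) "
PRINT Output['neff']
```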

The `SAVE` parameter can be used to save the `REML` save structure from the analysis for use with subsequent `VKEEP` and `VDISPLAY` directives.

Options: `PRINT`, `POPULATIONTYPE`, `NGENERATIONS`, `NBACKCROSSES`, `NSELFINGS`, `VCMODEL`, `VCPARAMETERS`, `VCSELECT`, `CRITERION`, `FIXED`, `UNITFACTOR`, `MVINCLUDE`, `MAXCYCLE`, `WORKSPACE`.

Parameters: `TRAIT`, `GENOTYPES`, `ENVIRONMENTS`, `POPULATIONS`, `UNITERROR`, `VCINITIAL`, `SELECTEDMODEL`, `ADDITIVEPREDICTORS`, `ADD2PREDICTORS`, `DOMINANCEPREDICTORS`, `CHROMOSOMES`, `POSITIONS`, `IDLOCI`, `IDMGENOTYPES`, `IDPARENTS`, `QTLSELECTED`, `INTERACTIONS`, `DOMSELECTED`, `DOMINTERACTIONS`, `RESIDUALS`, `FITTEDVALUES`, `WALDSTATISTICS`, `PRWALD`, `DFWALD`, `QEFFECTS`, `QSE`, `OUTFILENAME`, `QSAVE`, `SAVE`.

### Method

`QMESTIMATE` fits the following models, which include a set $L$ of QTLs:

1) if only `ADDITIVEPREDICTORS` are specified

$$y_{ij} = \mu + E_j + \sum_{l \in L} x_{il}^{\mathrm{add}} \alpha_{jl}^{\mathrm{add}} + GE_{ij}$$

2) if `DOMINANCEPREDICTORS` are also specified

$$y_{ij} = \mu + E_j + \sum_{l \in L} \left( x_{il}^{\mathrm{add}} \alpha_{jl}^{\mathrm{add}} + x_{il}^{\mathrm{dom}} \alpha_{jl}^{\mathrm{dom}} \right) + GE_{ij}$$

3) if both `ADD2PREDICTORS` and `DOMINANCEPREDICTORS` are specified (for population type `CP`)

$$y_{ij} = \mu + E_j + \sum_{l \in L} \left( x_{il}^{\mathrm{add}} \alpha_{jl}^{\mathrm{add}} + x_{il}^{\mathrm{add2}} \alpha_{jl}^{\mathrm{add2}} + x_{il}^{\mathrm{dom}} \alpha_{jl}^{\mathrm{dom}} \right) + GE_{ij}$$

where $y_{ij}$ is the trait value of genotype $i$ in environment (or population) $j$, $E_j$ is the environment (or population) main effect, $x_{il}^{\mathrm{add}}$ are the additive genetic predictors of genotype $i$ for locus $l$, and $\alpha_{jl}^{\mathrm{add}}$ are the associated effects. In models 2 and 3, $x_{il}^{\mathrm{dom}}$ are the dominance genetic predictors, and $\alpha_{jl}^{\mathrm{dom}}$ are the associated effects. In model 3, $x_{il}^{\mathrm{add}}$ are the additive genetic predictors for maternal genotype $i$ at locus $l$, $x_{il}^{\mathrm{add2}}$ are the additive genetic predictors for paternal genotype $i$, and $\alpha_{jl}^{\mathrm{add}}$ and $\alpha_{jl}^{\mathrm{add2}}$ are the associated effects. Genetic predictors are genotypic covariables that reflect the genotypic composition of a genotype at a specific chromosome location (Lynch & Walsh 1998). $GE_{ij}$ is assumed to follow a multi-Normal distribution with mean vector 0 and a variance-covariance matrix $\Sigma$, which can either be modelled explicitly (with an unstructured model) or by some parsimonious model (defined by option `VCMODEL`) as described in the `VGESELECT` procedure.

### Action with `RESTRICT`

Restrictions are not allowed.

### Reference

Lynch, M. & Walsh, B. (1998). *Genetics and Analysis of Quantitative Traits*. Sinauer Associates, Sunderland, MA.

### See also

Procedures: `QMBACKSELECT`, `QMQTLSCAN`, `QMVAF`, `QFLAPJACK`, `QREPORT`, `VGESELECT`.

Commands for: Statistical genetics and QTL estimation.

### Example

    CAPTION     'QMESTIMATE example'; STYLE=meta
    SPLOAD      [PRINT=*] '%GENDIR%/Examples/F2maize_traits.gsh'
    &           '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='LOCI'
    &           '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='ADDPREDICTORS'
    &           '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='DOMPREDICTORS'
    POINTER     [MODIFY=yes; NVAL=idlocus] addpred
    POINTER     [MODIFY=yes; NVAL=idlocus] dompred
    " Best variance-covariance model from VGESELECT "
    TEXT        model; VALUE= 'fa'
    " Candidate QTL positions from QMBACKSELECT "
    VARIATE     [VALUES=19,41,237] Qid
    VARIATE     [VALUES=1,1,1] Int
    VARIATE     [VALUES=1,1,1] Dom
    VARIATE     [VALUES=1,0,0] DomInt
    QMESTIMATE  [PRINT=summ,model,wald,eff; POPULATIONTYPE=F2; VCMODEL=#model]\
                TRAIT=yld; ENVIRONMENTS=E; GENOTYPES=G;\
                CHROMOSOMES=mkchr; POSITIONS=mkpos; MKLOCI=marker;\
                IDLOCI=idlocus; ADDITIVEPREDICTORS=addpred;\
                DOMINANCEPREDICTORS=dompred;\
                QTLSELECTED=Qid; INTERACTIONS=Int;\
                DOMSELECTED=Dom; DOMINTERACTIONS=DomInt;\
                QEFF=Qeff; QSE=Qse; QSAVE=Output;\
                OUTFILE='F2maize_qmestimate'