Performs a genome-wide scan for QTL effects (Simple and Composite Interval Mapping) in multi-environment trials or multiple populations (M.P. Boer, M. Malosetti, S.J. Welham & J.T.N.M. Thissen).

### Options

`PRINT` = string tokens | What to print (`summary`, `progress`, `model`, `components`, `effects`, `means`, `stratumvariances`, `monitoring`, `vcovariance`, `deviance`, `Waldtests`, `missingvalues`, `covariancemodels`); default `summ`
---|---
`PLOT` = string token | Whether to plot the profile along the genome (`profile`); default `prof`
`POPULATIONTYPE` = string token | Type of population (`BC1`, `DH1`, `F2`, `RIL`, `BCxSy`, `CP`); must be set
`ALPHALEVEL` = scalar | Defines a genome-wide significance level to calculate the threshold; default 0.05
`VCMODEL` = string token | Specifies the variance-covariance model for the set of environments or populations (`identity`, `diagonal`, `cs`, `hcs`, `outside`, `fa`, `fa2`, `unstructured`); default `cs` for multi-environment trials, and `diagonal` for multiple populations
`VCPARAMETERS` = string token | Whether to re-estimate the variance-covariance model parameters (`estimate`, `fix`); default `esti`
`QTLMODEL` = string token | Type of QTL model (`q`, `qqe`); default `qqe`
`COFACTORS` = variate | Index numbers of loci to be used as cofactors for the genetic background
`COFWINDOW` = scalar | Specifies a window for cofactor exclusion from the model; default 10^6, which means that all cofactors on the same chromosome are excluded
`THRMETHOD` = string token | Which method to use to calculate the threshold for QTL detection (`bonferroni`, `liji`, `given`); default `liji`
`THRESHOLD` = scalar | Threshold value for test statistic when `THRMETHOD=given`
`DISTANCE` = scalar | Distance between loci when `THRMETHOD=bonferroni`; default 4
`FIXED` = formula | Formula with extra fixed terms
`UNITFACTOR` = factor | Saves the units factor required to define the random model when `UNITERROR` is to be used
`STATISTICTYPE` = string token | Which test statistic to plot and save using the `QSTATISTICS` parameter (`wald`, `minlog10p`); default `minl`
`COLOURS` = scalar, variate or text | Colours to use for the chromosomes; default `*` uses the colours of pens 1, 2 up to the number of chromosomes
`TITLE` = text | General title for the plot
`YLOWERTITLE` = text | Title for the y-axis of the lower graph; default `'Environments'` for multi-environment trials, and `'Populations'` for multiple populations
`YUPPERTITLE` = text | Title for the y-axis of the upper graph; default uses the identifier of the `QSTATISTICS` variate
`XTITLE` = string | Title for the x-axis; default `'Chromosomes'`
`MVINCLUDE` = string tokens | Whether to include units with missing values in the explanatory factors and variates and/or the y-variates (`explanatory`, `yvariate`); default `expl`, `yvar`
`MAXCYCLE` = scalar | Limit on the number of iterations; default 100
`WORKSPACE` = scalar | Number of blocks of internal memory to be set up for use by the `REML` algorithm; default 100

### Parameters

`TRAIT` = variates | Quantitative trait to be analysed; must be set
---|---
`GENOTYPES` = factors | Genotype factor; must be set
`ENVIRONMENTS` = factors | Environment factor; must be set for a multi-environment trial
`POPULATIONS` = factors | Population factor; must be set for a multiple-population analysis
`UNITERROR` = variate | Uncertainty on trait means (derived from individual unit or plot error) to be included in QTL analysis; default `*` i.e. omitted
`VCINITIAL` = pointers | Initial values for the parameters of the variance-covariance model
`ADDITIVEPREDICTORS` = pointers | Additive genetic predictors; must be set
`ADD2PREDICTORS` = pointers | Second (paternal) set of additive genetic predictors
`DOMINANCEPREDICTORS` = pointers | Dominance genetic predictors
`CHROMOSOMES` = factors | Chromosomes corresponding to the genetic predictors; must be set
`POSITIONS` = variates | Positions on the chromosomes corresponding to the genetic predictors; must be set
`IDLOCI` = texts | Labels for the loci
`IDMGENOTYPES` = texts | Labels for the genotypes corresponding to the genetic predictors
`IDEFFECTS` = texts | Labels for the effects along the y-axis, in the frame below the profile plot
`IDPARENTS` = texts | Labels to use to identify the parents
`QSTATISTICS` = variates | Saves test statistics for QTL effects along the genome
`QEFFECTS` = pointers | Saves QTL effects along the genome
`QSE` = pointers | Saves standard errors of the QTL effects
`OUTFILENAME` = texts | Name of the Genstat workbook file (`*.gwb`) to be created
`DFILENAME` = texts | Name of the graphics file for the plots

### Description

`QMQTLSCAN` performs a genome-wide QTL scan in multi-environment trials as described by Malosetti *et al*. (2004) and Boer *et al*. (2007). Alternatively, it can analyse data from multiple populations. It uses means per genotype-environment or genotype-population combinations as phenotypic data, but weights can be attached to the means (see the `UNITERROR` parameter and the `UNITFACTOR` option below). The response variable must be specified by the `TRAIT` parameter, and the corresponding environment and genotype factors must be specified by the `ENVIRONMENTS` and `GENOTYPES` parameters, respectively. The `POPULATIONTYPE` option must be set to specify the population type. For a multiple-population analysis, the `POPULATIONS` parameter should be set (to a factor) instead of `ENVIRONMENTS`.

Molecular information must be provided in the form of additive genetic predictors stored in variates and supplied, in a pointer, by the `ADDITIVEPREDICTORS` parameter. Non-additive effects can be included in the model by specifying dominance genetic predictors using the `DOMINANCEPREDICTORS` parameter (e.g. in an F2 population). In the case of segregating F1 populations (outbreeders), two sets of additive genetic predictors must be specified: the maternal ones by the `ADDITIVEPREDICTORS` parameter, and the paternal ones by the `ADD2PREDICTORS` parameter. The corresponding map information for the genetic predictors must be given by the `CHROMOSOMES` and `POSITIONS` parameters. The labels for the loci can be supplied by the `IDLOCI` parameter, and the labels for the genotypes in the marker data can be supplied by the `IDMGENOTYPES` parameter. If `IDMGENOTYPES` is set, the match between the genotypes in the phenotypic and in the marker data will be checked.

The QTL detection model assumes `ENVIRONMENTS` (or `POPULATIONS`) as a fixed term, and `GENOTYPES` as a random term. Extra fixed effects can be specified using the `FIXED` option. For the random genetic effects in the different environments (or populations) a multi-Normal distribution is assumed with mean vector 0 and variance-covariance matrix Σ. The `VCMODEL` option defines the model to use for Σ; the default for a multi-environment trial is to take compound symmetry, while for a multiple-population analysis the default is to take a diagonal variance matrix (the best model can be selected using the `VGESELECT` procedure). Initial values for the parameters in the variance-covariance model can be defined by the `VCINITIAL` parameter. The `VCPARAMETERS` option controls whether variance-covariance parameters are re-estimated at each iteration (`VCPARAMETERS=estimate`), or whether they are fixed at the initial values (`VCPARAMETERS=fix`). The `fix` setting can be useful to save computation time with large data sets or with more complex models.
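To make the `VCMODEL` structures concrete, the following is a minimal Python sketch (not Genstat code) of two of the simpler choices for Σ, together with the number of parameters some of the models need. The function names and the compound-symmetry parameterization (a common covariance plus an extra variance on the diagonal) are assumptions made for illustration; Genstat's internal parameterization may differ.

```python
def diagonal_cov(variances):
    """Diagonal model: a separate variance per environment, no covariances."""
    n = len(variances)
    return [[variances[i] if i == j else 0.0 for j in range(n)]
            for i in range(n)]


def compound_symmetry_cov(n_env, common_cov, extra_var):
    """Compound symmetry (assumed form): the same covariance between every
    pair of environments, and common_cov + extra_var on the diagonal."""
    return [[common_cov + (extra_var if i == j else 0.0)
             for j in range(n_env)]
            for i in range(n_env)]


def n_parameters(model, n_env):
    """Parameter counts for a few of the models (standard counts;
    the fa, fa2, hcs and outside settings are deliberately left out)."""
    counts = {
        "identity": 1,
        "diagonal": n_env,
        "cs": 2,
        "unstructured": n_env * (n_env + 1) // 2,
    }
    return counts[model]


sigma_cs = compound_symmetry_cov(3, common_cov=2.0, extra_var=1.0)
sigma_diag = diagonal_cov([1.5, 0.8, 2.1])
```

The unstructured count grows quadratically with the number of environments, which is one reason why fixing the parameters (`VCPARAMETERS=fix`) can save time with complex models.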

By default the QTL model includes a separate QTL effect in every environment (or population), but it is possible to search for QTLs based only on QTL main effects by setting option `QTLMODEL=q`. The QTL search can be performed with cofactors to control for genetic background effects (*Composite Interval Mapping*) or without cofactors (*Simple Interval Mapping*). For Composite Interval Mapping, the `COFACTORS` option must be set to a variate containing the index numbers of the loci designated as cofactors. The `COFWINDOW` option defines a window around a tested position within which cofactors are temporarily excluded from the model.
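The cofactor exclusion rule can be sketched in Python as follows. This is an illustration only, not the procedure's code: the function name is hypothetical, indices are 0-based (Genstat index numbers start at 1), and it assumes that a cofactor is dropped when it lies on the same chromosome as the tested position and closer to it than the window value.

```python
def active_cofactors(cofactors, chromosomes, positions,
                     test_chr, test_pos, window=1e6):
    """Return the cofactor indices kept in the model while testing
    the locus at (test_chr, test_pos)."""
    kept = []
    for idx in cofactors:
        same_chromosome = chromosomes[idx] == test_chr
        inside_window = abs(positions[idx] - test_pos) < window
        # Assumed rule: exclude only cofactors on the tested chromosome
        # that fall inside the window around the tested position.
        if not (same_chromosome and inside_window):
            kept.append(idx)
    return kept


chromosomes = [1, 1, 1, 2, 2]
positions = [0.0, 30.0, 60.0, 10.0, 50.0]
cofactors = [0, 2, 4]

narrow = active_cofactors(cofactors, chromosomes, positions, 1, 55.0, window=10.0)
default = active_cofactors(cofactors, chromosomes, positions, 1, 55.0)
```

With the very large default window, every cofactor on the tested chromosome is dropped, which matches the behaviour described for the `COFWINDOW` default.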

The `MVINCLUDE`, `MAXCYCLE` and `WORKSPACE` options operate in the same way as these options of the `REML` directive. The `UNITERROR` parameter allows uncertainty on the trait means (derived from individual unit or plot error) to be specified to include in the random model; by default this is omitted. The `UNITFACTOR` option allows the factor that is needed to define the unit-error term to be saved (this would be needed, for example, to save information later about the term using `VKEEP`).

The method used to calculate the threshold value is set by the `THRMETHOD` option, and uses the genome-wide error rate set by the `ALPHALEVEL` option (default 0.05). If `THRMETHOD=given`, a user-defined threshold value must be specified using the `THRESHOLD` option. If `THRMETHOD=bonferroni`, an effective number of tests is calculated using the value specified by the `DISTANCE` option as the step size (default 4). Alternatively the `liji` setting uses the method described by Li & Ji (2005). See procedure `QTHRESHOLD` for details.
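As a rough illustration of the `bonferroni` setting, the Python sketch below derives a threshold on the -log10(p) scale. It is a sketch under stated assumptions, not the algorithm used by `QTHRESHOLD`: it assumes that the effective number of tests is the total map length divided by the step size (with at least one test per chromosome), and the function name and arguments are hypothetical.

```python
import math


def bonferroni_threshold(chromosome_lengths, distance=4.0, alpha=0.05):
    """Threshold on the -log10(p) scale for a genome-wide error rate alpha.

    Assumption: each chromosome contributes length / distance tests,
    and never fewer than one.
    """
    n_tests = sum(max(1.0, length / distance) for length in chromosome_lengths)
    return -math.log10(alpha / n_tests)


# Two chromosomes of 100 map units each, scanned in steps of 4 units.
threshold = bonferroni_threshold([100.0, 100.0], distance=4.0, alpha=0.05)
```

A smaller `DISTANCE` implies more tests and therefore a stricter (higher) threshold under this reading.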

The `PRINT` option specifies the output to be displayed. The `summary` setting prints the information about the QTLs retained in the model, and the `progress` setting shows how the scan is progressing. The other settings correspond to those in the `PRINT` option of the `REML` directive.

By default `QMQTLSCAN` produces a pair of graphs: the upper one plots the test statistic associated with the effects of the genetic predictors against their position on the chromosomes, and the lower one is a heat plot showing how the statistic changes over the environments (or populations). You can suppress the plotting by setting option `PLOT=*`. The `STATISTICTYPE` option specifies what to plot along the y-axis of the upper plot, either the test statistic or the associated probability value (on a -log10 scale), and also defines what is saved in the variates specified by the `QSTATISTICS` parameter. The `IDEFFECTS` parameter can be used to label the effects, and the `IDPARENTS` parameter can supply labels to identify the parents.

The effects of each genetic predictor and their standard errors can be saved, in pointers, by the `QEFFECTS` and `QSE` parameters, respectively. These pointers have 2 levels of suffixes: the first level has 1, 2 or 3 values depending on the setting of the 3 possible predictors `ADDITIVEPREDICTORS`, `ADD2PREDICTORS` and `DOMINANCEPREDICTORS`; the second level has as many levels as the number of levels of the `ENVIRONMENTS` (or `POPULATIONS`) factor.

The `TITLE`, `YLOWERTITLE`, `YUPPERTITLE` and `XTITLE` options can specify the general title of the graph, the title of the y-axis on the lower graph(s), the title of the y-axis on the upper graph, and the title of the x-axis, respectively. The colours to use for the chromosomes in the upper graph are specified by the `COLOURS` option using either a text of colour names or a variate of RGB values (see the `PEN` directive for details). If `COLOURS` is not set, the default is to use the default colours of the pens 1, 2, onwards, up to the number of chromosomes. By default, the plot is sent to the screen. However, you can supply a file for the plot, using the `DFILENAME` parameter. You can discover the types of graphics file that are supported by running the command `DHELP` `possible`.

The `OUTFILENAME` parameter can be used to write the `QSTATISTICS`, `QEFFECTS` and `QSE` structures to a Genstat workbook file in a sheet named `STATISTICS`. The file name should be given without an extension, as the extension `.gwb` is added automatically.

Options: `PRINT`, `PLOT`, `POPULATIONTYPE`, `ALPHALEVEL`, `VCMODEL`, `VCPARAMETERS`, `QTLMODEL`, `COFACTORS`, `COFWINDOW`, `THRMETHOD`, `THRESHOLD`, `DISTANCE`, `FIXED`, `UNITFACTOR`, `STATISTICTYPE`, `COLOURS`, `TITLE`, `YLOWERTITLE`, `YUPPERTITLE`, `XTITLE`, `YLABEL`, `MVINCLUDE`, `MAXCYCLE`, `WORKSPACE`.

Parameters: `TRAIT`, `GENOTYPES`, `ENVIRONMENTS`, `POPULATIONS`, `UNITERROR`, `VCINITIAL`, `ADDITIVEPREDICTORS`, `ADD2PREDICTORS`, `DOMINANCEPREDICTORS`, `CHROMOSOMES`, `POSITIONS`, `IDLOCI`, `IDMGENOTYPES`, `IDEFFECTS`, `IDPARENTS`, `QSTATISTICS`, `QEFFECTS`, `QSE`, `OUTFILENAME`, `DFILENAME`.

### Method

`QMQTLSCAN` fits the following mixed models repeatedly along the genome:

1) if only `ADDITIVEPREDICTORS` are specified

$$y_{ij} = \mu + E_j + \sum_{f \in F} x_{if}^{\mathrm{add}} c_{jf}^{\mathrm{add}} + x_i^{\mathrm{add}} \alpha_j^{\mathrm{add}} + GE_{ij}$$

2) if `DOMINANCEPREDICTORS` are also specified

$$y_{ij} = \mu + E_j + \sum_{f \in F} \left( x_{if}^{\mathrm{add}} c_{jf}^{\mathrm{add}} + x_{if}^{\mathrm{dom}} c_{jf}^{\mathrm{dom}} \right) + \left( x_i^{\mathrm{add}} \alpha_j^{\mathrm{add}} + x_i^{\mathrm{dom}} \alpha_j^{\mathrm{dom}} \right) + GE_{ij}$$

3) if both `ADD2PREDICTORS` and `DOMINANCEPREDICTORS` are specified (for population type `CP`)

$$y_{ij} = \mu + E_j + \sum_{f \in F} \left( x_{if}^{\mathrm{add}} c_{jf}^{\mathrm{add}} + x_{if}^{\mathrm{add2}} c_{jf}^{\mathrm{add2}} + x_{if}^{\mathrm{dom}} c_{jf}^{\mathrm{dom}} \right) + \left( x_i^{\mathrm{add}} \alpha_j^{\mathrm{add}} + x_i^{\mathrm{add2}} \alpha_j^{\mathrm{add2}} + x_i^{\mathrm{dom}} \alpha_j^{\mathrm{dom}} \right) + GE_{ij}$$

where $y_{ij}$ is the trait value of genotype $i$ in environment (or population) $j$, $E_j$ is the environmental (or population) main effect, $F$ is a set of cofactors (if cofactors are included in the model), and $x_{if}^{\mathrm{add}}$ and $x_i^{\mathrm{add}}$ are the additive genetic predictors of genotype $i$ at the cofactor positions and at the tested position, respectively. The associated effects are denoted by $c_{jf}^{\mathrm{add}}$ and $\alpha_j^{\mathrm{add}}$ for cofactors and tested position respectively. In models 2 and 3, $x_{if}^{\mathrm{dom}}$ and $x_i^{\mathrm{dom}}$ are dominance genetic predictors of genotype $i$ at the cofactor positions and at the tested position, respectively, with associated effects $c_{jf}^{\mathrm{dom}}$ and $\alpha_j^{\mathrm{dom}}$. In model 3, $x_{if}^{\mathrm{add}}$ and $x_i^{\mathrm{add}}$ are the additive genetic predictors for the maternal genotype, for cofactors and tested position, respectively, and $x_{if}^{\mathrm{add2}}$ and $x_i^{\mathrm{add2}}$ are the equivalent additive genetic predictors for the paternal genotype. Finally $x_{if}^{\mathrm{dom}}$ and $x_i^{\mathrm{dom}}$ are the dominance genetic predictors for the cofactors and tested position, respectively. The associated effects are given by $c_{jf}^{\mathrm{add}}$, $c_{jf}^{\mathrm{add2}}$ and $c_{jf}^{\mathrm{dom}}$ for cofactors, and $\alpha_j^{\mathrm{add}}$, $\alpha_j^{\mathrm{add2}}$ and $\alpha_j^{\mathrm{dom}}$ for tested positions. Genetic predictors are genotypic covariables that reflect the genotypic composition of a genotype at a specific chromosome location (Lynch & Walsh 1998). The residual unexplained genetic and environmental (or population) effects are modelled by the $GE_{ij}$ term, which is assumed to follow a multi-Normal distribution with mean vector 0, and a variance-covariance matrix Σ. The matrix Σ can either be modelled explicitly (with an unstructured model) or by some parsimonious models (defined by option `VCMODEL`) as described in the `VGESELECT` procedure.

The procedure uses the `REML` directive iteratively to fit the model at each chromosome position, storing the Wald statistic for hypothesis testing. The resulting Wald statistic or the associated probability value (on a -log10 scale) can be plotted to produce the well-known profile plots along the chromosomes.
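The relationship between the two `STATISTICTYPE` scales can be illustrated with a short Python sketch. It assumes that the Wald statistic is referred to a chi-square distribution whose degrees of freedom equal the number of QTL effects tested at the position; that assumption, and the function names, are illustrative and not taken from the procedure's source.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1,
    using the closed-form series for even and odd degrees of freedom."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    term, total = 1.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + tail


def minlog10p(wald, df):
    """Convert a Wald statistic to the -log10(p) scale."""
    return -math.log10(chi2_sf(wald, df))


# Hypothetical Wald statistics at three positions, each testing 2 effects.
profile = [minlog10p(w, df=2) for w in (1.2, 9.5, 21.0)]
```

For very large Wald statistics the probability underflows to zero in floating point, so a production version would work on the log scale throughout.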

### Action with `RESTRICT`

Restrictions are not allowed.

### References

Boer, M.P., Wright, D., Feng, L., Podlich, D.W., Luo, L., Cooper, M. & van Eeuwijk, F.A. (2007). A mixed-model quantitative trait loci (QTL) analysis for multiple-environment trial data using environmental covariables for QTL-by-environment interactions, with an example in maize. *Genetics*, 177, 1801-1813.

Li, J. & Ji, L. (2005). Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. *Heredity*, 95, 221-227.

Lynch, M. & Walsh, B. (1998). *Genetics and Analysis of Quantitative Traits*. Sinauer Associates, Sunderland, MA.

Malosetti, M., Voltas, J., Romagosa, I., Ullrich, S.E. & van Eeuwijk, F.A. (2004). Mixed models including environmental covariables for studying QTL by environment interaction. *Euphytica*, 137, 139-145.

### See also

Procedures: `QMBACKSELECT`, `QMESTIMATE`, `QMVAF`, `VGESELECT`.

Commands for: Statistical genetics and QTL estimation.

### Example

    CAPTION 'QMQTLSCAN example'; STYLE=meta
    SPLOAD [PRINT=*] '%GENDIR%/Examples/F2maize_traits.gsh'
    &      '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='LOCI'
    &      '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='ADDPREDICTORS'
    &      '%GENDIR%/Examples/F2maizemarkers.GWB'; SHEET='DOMPREDICTORS'
    " best variance-covariance model from VGESELECT "
    POINTER [MODIFY=yes; NVAL=idlocus] addpred
    POINTER [MODIFY=yes; NVAL=idlocus] dompred
    QMQTLSCAN [PRINT=summary,progress; PLOT=profile; POPULATIONTYPE=F2;\
              VCMODEL=fa; QTLMODEL=QQE;\
              THRESHOLD=th; STAT=minlog; THRMETHOD=liji]\
              TRAIT=yld; GENOTYPES=G; ENVIRONMENTS=E;\
              CHROMOSOMES=mkchr; POSITIONS=mkpos; IDLOCI=idlocus;\
              ADDITIVEPREDICTORS=addpred; DOMINANCEPREDICTORS=dompred;\
              QSTATISTICS=minlog10p; QEFFECTS=Eff2; QSE=Se2;\
              OUTFILE='F2maize_multi_out'