Fits a multi-layer perceptron neural network.

### Options

| Option | Description |
| --- | --- |
| `PRINT` = string tokens | Controls printed output (`description`, `estimates`, `fittedvalues`, `summary`); default `desc`, `esti`, `summ` |
| `NHIDDEN` = scalar | Number of functions in the hidden layer; no default, must be set |
| `HIDDENMETHOD` = string token | Type of activation function in the hidden layer (`logistic`, `hyperbolictangent`); default `logi` |
| `OUTPUTMETHOD` = string token | Type of activation function in the output layer (`linear`, `logistic`, `hyperbolictangent`); default `line` |
| `GAIN` = scalar | Multiplicative constant to use in the functions; default 1 |
| `NTRIES` = scalar | Number of times to search for a good initial starting point for the optimization; default 5 |
| `NSTARTITERATIONS` = scalar | Number of iterations to use to find a good starting point for the optimization; default 30 |
| `VALIDATIONOPTIONS` = variate | Variate containing three integers to control validation for early stopping; default `!(10,4,16)` |
| `SEED` = scalar | Seed for random numbers to generate initial values for the free parameters; default 0 |
| `MAXCYCLE` = scalar | Maximum number of iterations of the conjugate-gradient algorithm; default 50 |

### Parameters

`Y` = variates |
Response variates |
---|---|

`X` = pointers |
Input variates |

`YVALIDATION` = variates |
Validation data for the dependent variates |

`XVALIDATION` = pointers |
Validation data for the independent variates |

`FITTEDVALUES` = variates |
Fitted values generated for each y-variate by the neural network |

`OBJECTIVE` = scalars |
Value of the sum of squares objective function at the end of the optimization |

`NCOMPLETED` = scalars |
Number of completed iterations of the conjugate-gradient algorithm |

`EXIT` = scalars |
Saves the exit code |

`SAVE` = pointers |
Saves details of the network and the estimated parameters |

### Description

A neural network is a method for describing a nonlinear relationship between a response variate, supplied here by the `Y` parameter, and a set of input variates, supplied here in a pointer by the `X` parameter. The type of neural network fitted by `NNFIT` is a fully-connected feed-forward multi-layer perceptron with a single hidden layer. This network starts with a row of nodes, one for each input variable (i.e. x-variate), which are all connected to every node in the hidden layer. The nodes in the hidden layer are then all connected to the output node in the final, output layer. The number of nodes in the hidden layer is specified by the `NHIDDEN` option.

The output value *y* is given by

*y* = ψ( Σ_{k=1…m} *w_k* φ( Σ_{j=1…d} *w_jk* *x_j* − θ ) − η )

where

- *d* is the number of input nodes (i.e. x-variates),
- *m* is the number of hidden nodes (`NHIDDEN`),
- *x_j* is the value of the *j*th x-variate,
- *w_jk* are weight parameters in the connections between the nodes in the input and hidden layers,
- *w_k* are weight parameters in the connections between the nodes in the hidden and output layers,
- θ is the threshold value subtracted at the hidden layer,
- η is the threshold value subtracted at the single node in the output layer,
- φ(.) is the activation function applied at the hidden layer, and
- ψ(.) is the activation function applied at the output layer.
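To make the formula concrete, here is an illustrative Python sketch of the forward pass for a single data record (this is not the NAG implementation; the function and argument names are hypothetical). It uses `NNFIT`'s default activations: logistic at the hidden layer and linear at the output layer.

```python
import math

def mlp_forward(x, w_hidden, theta, w_out, eta, gain=1.0):
    """Forward pass of a single-hidden-layer perceptron for one record.

    x        : list of d input values (one per x-variate)
    w_hidden : m lists of d weights w_jk (input -> hidden)
    theta    : threshold subtracted at the hidden layer
    w_out    : m weights w_k (hidden -> output)
    eta      : threshold subtracted at the output node
    gain     : the GAIN parameter, gamma
    """
    phi = lambda z: 1.0 / (1.0 + math.exp(-gain * z))   # logistic (hidden)
    psi = lambda z: z                                   # linear (output)
    hidden = [phi(sum(wj * xj for wj, xj in zip(wk, x)) - theta)
              for wk in w_hidden]
    return psi(sum(w * h for w, h in zip(w_out, hidden)) - eta)
```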

The activation functions for the hidden and output layers are specified by the `HIDDENMETHOD` and `OUTPUTMETHOD` options, respectively, with settings:

- `linear`: φ(*z*) = *z* (`OUTPUTMETHOD` only),
- `logistic`: φ(*z*) = 1 / (1 + exp(−γ*z*)),
- `hyperbolictangent`: φ(*z*) = tanh(γ*z*),

where the parameter γ is specified by the `GAIN` option. The default setting is `logistic` for `HIDDENMETHOD`, and `linear` for `OUTPUTMETHOD`.

Values for the free parameters in the multi-layer perceptron model are optimized using a preconditioned, limited-memory, quasi-Newton conjugate-gradient method to minimize the objective (sum-of-squares) function, equal to 0.5 times the average squared deviation of the estimated y-values from the observed y-values.
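That objective can be written out directly; a minimal sketch (function name hypothetical):

```python
def mlp_objective(y_obs, y_fit):
    """0.5 times the average squared deviation of the fitted (estimated)
    y-values from the observed y-values."""
    n = len(y_obs)
    return 0.5 * sum((f - o) ** 2 for o, f in zip(y_obs, y_fit)) / n
```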

Printed output is controlled by the `PRINT` option, with settings:

- `description`: a description of the network (number of input variables, nodes etc.),
- `estimates`: estimates of the free parameters,
- `fittedvalues`: fitted values,
- `summary`: a summary (numbers of iterations, objective function etc.).

The `NTRIES` option defines the number of times to search for a good initial starting point for the optimization (default 5). The `NSTARTITERATIONS` option defines the number of iterations to use to find a good starting point for the optimization (default 30).
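The source does not describe the search in detail, but a common scheme consistent with these two options is a multi-start search: run `NTRIES` short optimizations of `NSTARTITERATIONS` iterations each from random initial weights, and keep the best point found. An illustrative sketch (not the NAG algorithm; all names hypothetical), using a crude random-search step in place of a real optimizer:

```python
import random

def pick_start(objective, n_params, ntries=5, nstartiterations=30, seed=0):
    """Illustrative multi-start search: NTRIES restarts from random initial
    weights, NSTARTITERATIONS crude improvement steps each, keeping the
    parameter vector with the smallest objective value."""
    rng = random.Random(seed)
    best, best_obj = None, float("inf")
    for _ in range(ntries):
        w = [rng.uniform(-0.5, 0.5) for _ in range(n_params)]
        for _ in range(nstartiterations):
            # random perturbation standing in for a real optimizer step
            cand = [wi + rng.gauss(0, 0.01) for wi in w]
            if objective(cand) < objective(w):
                w = cand
        if objective(w) < best_obj:
            best, best_obj = w, objective(w)
    return best
```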

The `SEED` option supplies a seed for the random numbers used to generate initial values for the free parameters. The default of zero continues the existing sequence of random numbers if any have already been used in the current Genstat job. If none have yet been used, Genstat picks a seed at random.

The `MAXCYCLE` option sets a limit on the number of iterations of the conjugate-gradient algorithm to use for the estimation (default 50).

To improve the accuracy of the neural-network approximations to new data records, it is usually desirable to stop the optimization before the value of the objective function reaches a global minimum on the training set. This method, known as *early stopping*, can be performed by using a validation set of data records, specified by the `YVALIDATION` and `XVALIDATION` parameters. The optimization is then halted when the sum-of-squares error function achieves a minimum over the validation set, which has not been used to estimate the values of the free parameters in the model. The `VALIDATIONOPTIONS` option specifies a variate containing three integers to control validation for early stopping: the first defines the number of iterations of the optimizing function to complete before beginning validation (default 10); the second defines the number of iterations between consecutive validations (default 4); and the third defines the number of iterations to continue validating beyond the current minimum of the objective function before stopping (default 16), to try to avoid the possibility of getting stuck at a local minimum. The variates in the `XVALIDATION` pointer must be in the same order as the corresponding variates in the `X` pointer.
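The interplay of the three `VALIDATIONOPTIONS` integers can be sketched as a loop (an illustration only, not the NAG implementation; the function and argument names are hypothetical):

```python
def run_with_early_stopping(step, validation_error, maxcycle=50,
                            burnin=10, interval=4, patience=16):
    """Early-stopping loop driven by the three VALIDATIONOPTIONS integers:
    complete `burnin` iterations before the first validation, validate every
    `interval` iterations thereafter, and stop once `patience` further
    iterations have passed without improving the minimum validation error.

    step             : performs one optimizer iteration
    validation_error : returns the current error on the validation set
    """
    best_err, best_iter = float("inf"), 0
    for it in range(1, maxcycle + 1):
        step()                                     # one optimizer iteration
        if it >= burnin and (it - burnin) % interval == 0:
            err = validation_error()
            if err < best_err:
                best_err, best_iter = err, it      # new validation minimum
            elif it - best_iter >= patience:
                break                              # past the minimum: stop
    return best_err, best_iter
```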

The results of the fit, together with details about the design of the neural network, can be saved using the `SAVE` parameter. This can then be used in the `NNDISPLAY` directive to display further output, or the `NNPREDICT` directive to form predictions.

Options: `PRINT`, `NHIDDEN`, `HIDDENMETHOD`, `OUTPUTMETHOD`, `GAIN`, `NTRIES`, `NSTARTITERATIONS`, `VALIDATIONOPTIONS`, `SEED`, `MAXCYCLE`.

Parameters: `Y`, `X`, `YVALIDATION`, `XVALIDATION`, `FITTEDVALUES`, `OBJECTIVE`, `NCOMPLETED`, `EXIT`, `SAVE`.

### Method

`NNFIT` uses the function `nagdmc_mlp` from the Numerical Algorithms Group's library of Data Mining Components (DMCs), which estimates the free parameters using a conjugate-gradient method.

### Action with `RESTRICT`

You can restrict the set of units used for the estimation by applying a restriction to the y-variate or any of the x-variates. If several of these are restricted, they must all be restricted to the same set of units. Similarly, you can restrict the set of units used for the validation by applying a restriction to the `YVALIDATION` variate or any of the `XVALIDATION` variates.

### See also

Directives: `NNDISPLAY`, `NNPREDICT`, `ASRULES`, `RBFIT`.

Procedure: `KNEARESTNEIGHBOURS`.

Commands for: Data mining.

### Example

" Example NNFI-1: Fitting a multi-layer perceptron neural network." " This example fits a multi-layer perceptron neural network with five hidden layers, a hyperbolic activation function in the hidden layer and a linear activation function in the output layer." " The data are in a file called iris.GSH and contain the data from Fisher's Iris data set." SPLOAD [PRINT=*] '%GENDIR%/Data/iris.GSH' POINTER [VALUES=Sepal_Length,Sepal_Width,Petal_Length,Petal_Width] Measures CALC yval = NEWLEVELS(Species) NNFIT [PRINT=description,estimates,summary; NHIDDEN=5;\ HIDDENMETHOD=hyperbolictangent; OUTPUTMETHOD=linear; SEED=12]\ Y=yval; X=Measures