One of the major differences between Genstat and Excel is that Genstat stores its data within data structures in server memory rather than on a particular worksheet; the Genstat spreadsheet is used to view or manipulate the data. An Excel workbook is an unstructured set of cells containing numbers, text and formulae in any order, whereas a Genstat spreadsheet displays a set of specific data structures, each with a particular size and attributes.
The unstructured nature of Excel can be considered both a strength and a weakness, as the organization is specific to the user and is not inherent in the spreadsheet. In Genstat the spreadsheet is column based: the data for each measurement are stored in a single column. Each column can contain data in one of three formats: numerical (a variate), textual (a text) or categorical (a factor).
A spreadsheet containing a text column, several factor columns and a variate (no icon)
Numerical and textual data cannot be mixed within the same column. Note that entering a numerical value into a text column will result in the numerical value being treated as text. If you attempt to enter data of an incorrect type (for example, text into a numerical column), error messages will appear. The columns within a Genstat spreadsheet must all have the same number of rows, so that together they form a rectangle. However, you can have multiple spreadsheet pages within a workbook, with structures of different lengths on each page.
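The column rules described above can be illustrated with a minimal Python sketch. This is not Genstat's implementation or its command language; the names `VariateColumn`, `TextColumn` and `check_rectangular`, and the sample values, are hypothetical and only model the behaviour described: a numerical column rejects text, a text column stores numbers as text, and all columns on one page must have the same number of rows.

```python
class VariateColumn:
    """Hypothetical model of a numerical column: rejects non-numbers."""

    def __init__(self, name, values=()):
        self.name = name
        self.values = []
        for v in values:
            self.append(v)

    def append(self, value):
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{self.name}: {value!r} is not numerical")
        self.values.append(float(value))


class TextColumn:
    """Hypothetical model of a text column: every entry is stored as text."""

    def __init__(self, name, values=()):
        self.name = name
        self.values = [str(v) for v in values]

    def append(self, value):
        # A number entered here is kept, but as a string
        self.values.append(str(value))


def check_rectangular(columns):
    """Return the common row count, or raise ValueError if lengths differ."""
    lengths = {len(c.values) for c in columns}
    if len(lengths) > 1:
        raise ValueError("columns on one page must have equal lengths")
    return lengths.pop() if lengths else 0


# Made-up sample data, for illustration only
yield_col = VariateColumn("Yield", [4.2, 3.9])
plot_col = TextColumn("Plot", ["A1", 7])  # 7 is stored as the text "7"
n_rows = check_rectangular([yield_col, plot_col])
```

In this sketch, calling `yield_col.append("high")` raises a `TypeError`, which plays the role of the error message shown when text is entered into a numerical column.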
For categorical data, Genstat has a data structure type called a factor that can be used to allocate the data into groups; for example, in the image above the column Severity is a factor with four groups labelled None, Mild, Moderate and Acute. The factor groups can be referred to by numbers, known as levels, and by text, known as labels. If an Excel worksheet contains categorical columns (numerical or text), these can easily be converted to factors within Genstat. See Understanding Factors and Levels for more information about factors.
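The relationship between groups, levels and labels can be sketched in Python as follows. This is an illustrative model only, not Genstat's own factor structure: the `Factor` class and its methods are hypothetical, the sample rows are made up, and numbering the levels 1 to 4 in label order is an assumption made for the example. The group labels are those of the Severity column mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Factor:
    """Hypothetical model of a factor: each row points at one group."""
    labels: list  # text name of each group
    levels: list  # number of each group
    rows: list    # 0-based group index for each row

    @classmethod
    def from_text(cls, values, labels):
        """Convert a text column to a factor, given the group labels in order."""
        index = {label: i for i, label in enumerate(labels)}
        rows = [index[v] for v in values]  # KeyError for an unknown label
        # Assumption: levels are simply 1..n in the order of the labels
        return cls(list(labels), list(range(1, len(labels) + 1)), rows)

    def as_labels(self):
        """Rows expressed by label (text)."""
        return [self.labels[i] for i in self.rows]

    def as_levels(self):
        """Rows expressed by level (number)."""
        return [self.levels[i] for i in self.rows]


# Made-up sample rows, using the group labels of the Severity column
severity = Factor.from_text(
    ["Mild", "None", "Acute", "Mild"],
    labels=["None", "Mild", "Moderate", "Acute"],
)
row_levels = severity.as_levels()
row_labels = severity.as_labels()
```

Storing a group index per row, rather than the label itself, means the same column can be shown either by level or by label without changing the underlying data.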