SARs NEWSLETTER, No. 4 - January 1995
WHO IS USING THE SARS?
The Samples of Anonymised Records are now being widely used throughout the UK. Almost all the UK Higher Education Institutions have registered and we have over 250 registered users at a total of 92 sites. Some of the different disciplines involved include social science/social policy (53 registered users), demography/medical (24), geography & urban studies (60) and business/computing (36). In addition, the SARs are being used for undergraduate teaching at several institutions.
Commercial users of the SARs include four central government departments, five health authorities, 16 local authorities, seven market research organisations and five quangos.
Overseas use of the SARs
At present the SARs cannot be distributed outside the UK. The possibility of providing overseas academics with remote access to the SARs is being considered by the ESRC and further information will be made available in due course. For overseas academics who wish to use the SARs, there are two possible routes:
We hope that in the longer term it will be possible for ESRC to allow overseas academics to use the SARs via national data archives.
ETHNIC HOMOGENEITY AND FAMILY FORMATION:
Clare Holdsworth
The ethnic group question from the 1991 census represents a significant advance in collating quantitative information on Britain's ethnic minorities. However, there are a number of methodological problems facing researchers who wish to utilize the results derived from the question, from either the census tables or the SARs. In particular, the structure of the ethnic group question does not incorporate the wide variety of ethnic groups within Britain, as the ethnic group classification available only distinguishes between White, eight non-white ethnic groups and a tenth miscellaneous group, Other-Other. The non-white ethnic groups available from the 1991 Census are derived from both race and ancestry. Hence the categories distinguished are related to an individual's country of origin or their ancestral country of origin. While this may be relatively unproblematic for generations of Asians born in this country, who accept 'Indian', 'Bangladeshi' and 'Pakistani' ethnic and/or national identities, the classification used for generations of African or Caribbean descent is more problematic, especially for Black Caribbeans. The Black community had argued for the inclusion of a Black British group to identify second or later generation Black Caribbeans, though this was rejected as it would be politically difficult to identify Black British and not British Asian. Individuals explicitly identifying themselves as Black British were therefore classified as Black Other. A further problem relates to the size of the ethnic groups involved: the White ethnic group accounts for 94.5% of the population, so in the one or two percent SAR samples the numbers recorded for the remaining ethnic groups may become too small for reliable statistical analyses if broken down into a number of sub-groups.
Ethnic group of household head
| Ethnic group of household member | White | Black Carib | Black African | Black Other | Indian | Pakistani | Bangladeshi | Chinese | Other Asian | Other Other | Total |
| | % | % | % | % | % | % | % | % | % | % | (100%) |
| White | 99.7 | 0.1 | - | - | 0.1 | - | - | - | - | 0.1 | 449422 |
| Black Carib | 7.2 | 91.1 | 0.6 | 0.6 | 0.2 | - | - | - | 0.1 | 0.2 | 4037 |
| Black African | 7.0 | 2.3 | 88.8 | 0.2 | 0.5 | 0.3 | 0.3 | - | 0.4 | 0.2 | 1733 |
| Black Other | 31.9 | 22.4 | 4.4 | 37.6 | 0.9 | 0.1 | - | 0.3 | 1.4 | 0.9 | 1617 |
| Indian | 1.8 | 0.1 | 0.1 | 0.1 | 97.2 | 0.2 | 0.1 | 0.0 | 0.2 | 0.2 | 7793 |
| Pakistani | 0.8 | 0.1 | - | 0.1 | 0.9 | 97.6 | 0.1 | - | 0.2 | 0.1 | 4425 |
| Bangladeshi | 0.3 | - | - | 0.2 | 0.1 | 0.2 | 99.2 | - | 0.1 | - | 1331 |
| Chinese | 9.8 | 0.3 | 0.1 | 0.7 | - | - | - | 88.2 | 0.1 | 0.7 | 1231 |
| Other Asian | 13.3 | 0.3 | 0.1 | 0.3 | 1.7 | 2.1 | 0.1 | 0.3 | 80.9 | 1.0 | 1741 |
| Other Other | 32.3 | 5.3 | 1.1 | 0.9 | 3.2 | 2.5 | 0.4 | 1.0 | 3.3 | 50.1 | 2697 |
Table 1: Ethnic group of household members by ethnic group of household head: 1% Household SAR (for all households with 2 or more members).
The composition of each ethnic group, particularly the Black Other group, may be
examined utilizing the household SAR, by comparing the ethnicity of either household or
family members (this analysis assumes that the household head, or whoever fills in the
census form, records the ethnicity of all household members, in multi-ethnic households).
Tables 1 and 2 illustrate this analysis at the level of the household, from a comparison
of ethnic group for all household members (Table 1) and dependent children within
households (Table 2) against the head of household's ethnic group. Both tables illustrate
a high level of homogeneity, especially among the White, Indian, Pakistani and Bangladeshi
groups, with between 97.2% and 99.7% of household members and dependent children in each
of these groups living in households headed by an individual from the same ethnic group.
These proportions are only slightly lower for the Black African, Black Caribbean, Chinese
and Other Asian groups. However, this pattern is reversed for the Black Other and the
Other-Other groups. The latter are divided between households headed by individuals from
either the White or Other-Other group, while members of the Black Other group are
distributed between White, Black Caribbean and Black Other household heads. This suggests
that second generation Black Caribbeans are being categorized as Black Other, or that a
large number of Black Other members come from a mixed ethnic background.
However, this household analysis is very limited, as it cannot identify parents and
children. For example, if there is more than one family unit in a household, such as a
lone mother living with her parents, then the household data cannot easily distinguish all
three generations. To incorporate this dimension, it is necessary to work at the level of
the family. Using family level data it is possible to identify family members, i.e. to
distinguish parents from children. To date we have only done this for married or lone
parents, and have not extended the analysis to cohabiting parents.
Ethnic group of household head
| Ethnic group of dependent child | White | Black Carib | Black African | Black Other | Indian | Pakistani | Bangladeshi | Chinese | Other Asian | Other Other | Total number of children |
| | % | % | % | % | % | % | % | % | % | % | (100%) |
| White | 99.5 | 0.1 | - | - | 0.1 | - | - | - | - | 0.2 | 107785 |
| Black Carib | 7.2 | 90.4 | 0.7 | 1.0 | 0.5 | - | - | - | - | 0.3 | 1034 |
| Black African | 5.4 | 3.1 | 89.7 | - | 0.5 | 0.2 | 0.3 | - | 0.7 | 0.2 | 609 |
| Black Other | 39.7 | 26.3 | 5.6 | 24.2 | 1.1 | 0.2 | - | 0.4 | 1.5 | 1.0 | 927 |
| Indian | 0.8 | - | 0.2 | - | 98.2 | 0.1 | 0.2 | - | 0.2 | 0.3 | 2650 |
| Pakistani | 0.8 | 0.1 | - | 0.2 | 0.7 | 97.8 | 0.2 | - | - | 0.2 | 2094 |
| Bangladeshi | 0.1 | - | - | 0.4 | - | 0.1 | 99.3 | - | - | - | 700 |
| Chinese | 2.6 | 0.3 | - | - | - | - | - | 95.9 | - | 1.2 | 345 |
| Other Asian | 7.7 | 0.4 | - | 0.2 | 3.0 | 5.0 | 0.2 | 0.2 | 81.7 | 1.6 | 496 |
| Other Other | 40.9 | 8.2 | 1.9 | 1.4 | 5.0 | 4.1 | 0.6 | 1.5 | 4.8 | 31.6 | 1350 |
Table 2: Ethnic group of all dependent children by ethnic group of household head: 1% Household SAR.
Our first analysis is for marital unions and Table 3 gives the number of unions
occurring between different ethnic groups. The number of cross-ethnic marriages is small,
1.1% of all unions in the household sample. Most cross-ethnic marriages involve one white
partner, which is not surprising given that 94.6% of the population are white. Of these
white-other ethnic unions, 17.9% are between White and Black Caribbean men and women,
while 14.3% are between White and Indian individuals. Further, there are more White women
married to Black Caribbean or Indian men than there are White men married to Black or
Indian women.
Moving on to consider the children of these marriages, Table 4 illustrates the proportion
of children living with parents of the same or different ethnic group from families headed
by a married couple or a lone parent. The table is for all children living in these
families and cannot identify adopted or stepchildren. There is also no control for age, so
many of the lone parent families do not fit the stereotypical image of such families,
especially among the Asian groups, as they will include divorced or widowed lone parents
living with never-married children. Looking first at the bold figures in Table 4, for
children living with married parents, by far the majority of children from all ethnic
groups live with parents of the same ethnic group, except for children from the Black
Other and Other-Other groups. Within these two groups, more children (66.7% and 55.3%
respectively) are reported as living with parents of a different ethnic group. For lone
parent families, the pattern is repeated with over two-thirds of Other-Other and Black
Other children living with a lone parent from a different ethnic group. Considering the
lower row of figures (italicized), it is interesting to note the large proportion of Black
children (across all groups) living with a lone parent. Over half of Black Caribbean and
Black Other children from the sample fall into this category. While 10% of Asian children
(all groups) live with lone parents, the corresponding figure for white children is just
under 20%. (It should be noted that around 90% of Asian lone parents are either married,
divorced or widowed.)
Ethnic group of wife
| Ethnic group of husband | White | Black Carib | Black African | Black Other | Indian | Pakistani | Bangladeshi | Chinese | Other Asian | Other Other | Total |
| White | 115109 | 79 | 36 | 43 | 63 | 10 | - | 73 | 137 | 111 | 115661 |
| Black Carib | 135 | 511 | 7 | 9 | 4 | 1 | - | 1 | 2 | 8 | 678 |
| Black African | 41 | 14 | 189 | 4 | 2 | 1 | - | - | - | 2 | 253 |
| Black Other | 51 | 3 | 2 | 52 | 1 | - | - | - | 2 | - | 111 |
| Indian | 108 | 2 | 3 | 1 | 1754 | 16 | - | 5 | 4 | 4 | 1897 |
| Pakistani | 34 | - | - | 1 | 6 | 771 | - | - | 4 | 3 | 819 |
| Bangladeshi | 4 | - | 2 | - | 4 | 1 | 213 | - | - | 2 | 226 |
| Chinese | 33 | - | - | - | 2 | - | - | 227 | - | - | 262 |
| Other Asian | 51 | 4 | - | 1 | 3 | 4 | 1 | 2 | 291 | 6 | 363 |
| Other Other | 188 | 2 | - | 2 | 6 | 4 | - | 2 | 5 | 185 | 394 |
| Total | 115754 | 615 | 239 | 113 | 1845 | 808 | 214 | 310 | 445 | 321 | 120664 |
Table 3: Number of marriages between ethnic groups: 1% Household SAR (cells relate to number of unions, not individuals).
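Readers who want to check the percentages quoted for Table 3 can recompute them from the cell counts. The following is a minimal Python sketch, not part of the original analysis: the counts are typed in from Table 3 with "-" entered as 0, and the function names are our own.

```python
# Counts copied from Table 3 (rows: husband, columns: wife); "-" entered as 0.
GROUPS = ["White", "Black Carib", "Black African", "Black Other", "Indian",
          "Pakistani", "Bangladeshi", "Chinese", "Other Asian", "Other Other"]

UNIONS = [
    [115109, 79, 36, 43, 63, 10, 0, 73, 137, 111],
    [135, 511, 7, 9, 4, 1, 0, 1, 2, 8],
    [41, 14, 189, 4, 2, 1, 0, 0, 0, 2],
    [51, 3, 2, 52, 1, 0, 0, 0, 2, 0],
    [108, 2, 3, 1, 1754, 16, 0, 5, 4, 4],
    [34, 0, 0, 1, 6, 771, 0, 0, 4, 3],
    [4, 0, 2, 0, 4, 1, 213, 0, 0, 2],
    [33, 0, 0, 0, 2, 0, 0, 227, 0, 0],
    [51, 4, 0, 1, 3, 4, 1, 2, 291, 6],
    [188, 2, 0, 2, 6, 4, 0, 2, 5, 185],
]


def total_unions(table):
    """All unions in the table."""
    return sum(sum(row) for row in table)


def cross_ethnic_unions(table):
    """Unions where husband and wife are in different groups (off-diagonal cells)."""
    same_group = sum(table[i][i] for i in range(len(table)))
    return total_unions(table) - same_group


def white_other_unions(table):
    """Unions with exactly one White partner (White is row/column 0)."""
    white_husband = sum(table[0][1:])
    white_wife = sum(row[0] for row in table[1:])
    return white_husband + white_wife


def white_with(table, group):
    """Unions between one White partner and one partner from `group`."""
    j = GROUPS.index(group)
    return table[0][j] + table[j][0]


cross_share = 100 * cross_ethnic_unions(UNIONS) / total_unions(UNIONS)
carib_share = 100 * white_with(UNIONS, "Black Carib") / white_other_unions(UNIONS)
indian_share = 100 * white_with(UNIONS, "Indian") / white_other_unions(UNIONS)
print(round(cross_share, 1), round(carib_share, 1), round(indian_share, 1))
```

The three printed values correspond to the 1.1%, 17.9% and 14.3% given in the text.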
Difference between children's and parents' ethnic group
| Child's ethnic group | Married parents: same as both par. % | Married parents: diff fr. 1 par. % | Married parents: diff fr. both par. % | Lone parent: same as parent % | Lone parent: diff fr. parent % | Total in ethnic group |
| White | 99.4 / 79.5 | 0.6 / 0.5 | - / - | 99.6 / 19.8 | 0.4 / 0.1 | 145898 |
| Black Caribbean | 86.6 / 38.1 | 8.0 / 3.5 | 5.3 / 2.4 | 92.3 / 51.7 | 7.7 / 4.3 | 1445 |
| Black African | 81.9 / 50.6 | 8.9 / 5.5 | 9.2 / 5.7 | 93.0 / 35.6 | 7.0 / 2.7 | 672 |
| Black Other | 15.4 / 7.0 | 17.9 / 8.2 | 66.7 / 30.6 | 22.3 / 12.1 | 77.7 / 42.1 | 1024 |
| Indian | 97.1 / 87.2 | 2.2 / 2.0 | 0.7 / 0.6 | 97.1 / 10.0 | 2.9 / 0.3 | 3370 |
| Pakistani | 97.8 / 86.7 | 1.8 / 1.6 | 0.5 / 0.4 | 96.3 / 10.9 | 3.7 / 0.4 | 2408 |
| Bangladeshi | 99.9 / 90.4 | - / - | 0.1 / 0.1 | 100.0 / 9.4 | - / - | 763 |
| Chinese | 95.9 / 83.2 | 1.4 / 1.2 | 2.7 / 2.4 | 94.6 / 12.6 | 5.4 / 0.7 | 423 |
| Other Asian | 80.0 / 67.9 | 7.2 / 6.1 | 12.8 / 10.9 | 82.6 / 12.5 | 17.4 / 2.6 | 608 |
| Other Other | 28.7 / 18.7 | 16.0 / 10.4 | 55.3 / 35.9 | 33.4 / 11.7 | 66.6 / 23.3 | 1495 |
| Total (100%) | 123821 | 1081 | 1182 | 31017 | 1006 | 158107 |
Table 4: Difference between children's and parents' ethnic group for all children living with married or lone parents, 1% Household SAR. The first figure in each cell (bold in the original) is the distribution within each family type; the second figure (italic in the original) is the distribution of all children living in either family type.
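Each cell of Table 4 carries two percentages with different denominators. As a rough illustration of how they relate, the sketch below recomputes the lone-parent figures for Black Other children from the counts given in Tables 5 and 6 (469 children with married parents, 555 with a lone parent, of whom 124 live with a Black Other parent). The function name is our own and the code is illustrative only.

```python
def cell_percentages(cell_count, family_type_total, group_total):
    """Return (within family type %, of all children in the ethnic group %)."""
    within_type = 100 * cell_count / family_type_total
    of_all_children = 100 * cell_count / group_total
    return round(within_type, 1), round(of_all_children, 1)


married_total = 469          # Black Other children with married parents (Table 5)
lone_total = 555             # Black Other children with a lone parent (Table 6)
group_total = married_total + lone_total   # 1024, the Black Other row total in Table 4

same_as_parent = 124                       # lone parent is Black Other (Table 6)
diff_from_parent = lone_total - same_as_parent

print(cell_percentages(same_as_parent, lone_total, group_total))
print(cell_percentages(diff_from_parent, lone_total, group_total))
```

The first figure of each pair uses the family-type total as its base, so the pairs within one family type sum to 100%; the second uses all children in the ethnic group, so the second figures sum to 100% across the whole row.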
Finally, to complete the analysis, we have selected the Black Other children and examined their parents' ethnic group (see Tables 5 and 6). Table 5, for children living with married parents, illustrates how heterogeneous this group is, as no one pattern emerges, though the largest group is children with two Black Caribbean parents, while children with a Black Caribbean father and White mother come a close second. Among children living with one parent (Table 6), almost half are living with a White parent, with the majority of the remaining children evenly divided between Black Caribbean and Black Other parents.
In conclusion, it appears that the composition of households and families among the other ethnic groups remains relatively homogeneous, especially among the main Asian groups. Although there is some evidence of the Black British effect among the Black Other group, this latter group is very heterogeneous, with almost as many Black Other children living with at least one White parent as with two Black parents. Further, there are more Black Caribbean children than Black Other children, and the majority among the former group live with Black Caribbean parents. This group, therefore, continues to represent the largest number of second generation Black Caribbeans. Finally, there are very few marriages, or children of mixed ethnic group, between the Asian and Black subgroups, especially among the Bangladeshis.
Ethnic group of mother
| Ethnic group of father | White | Black Carib | Black African | Black Other | Indian | Chinese | Other Asian | Total |
| White | 38 | 20 | 9 | 19 | 4 | 2 | 2 | 94 |
| Black Carib | 81 | 87 | - | 14 | 8 | - | - | 190 |
| Black African | 23 | 10 | 3 | 1 | - | - | - | 37 |
| Black Other | 38 | 5 | 3 | 72 | 1 | - | 1 | 120 |
| Indian | 6 | - | - | - | 3 | - | 1 | 10 |
| Chinese | 1 | - | - | - | - | - | - | 1 |
| Other Asian | - | - | - | - | - | - | 13 | 15 |
| Other Other | 1 | - | - | - | - | - | - | 1 |
| Total | 189 | 122 | 15 | 108 | 16 | 2 | 17 | 469 |
Table 5: Ethnic group of parents of Black Other children, for children living with married couples only (cells refer to number of children).
| Ethnic group of parent | Number of Black Other children % | Total number of children (100%) |
| White | 48.1 | 267 |
| Black Caribbean | 20.9 | 116 |
| Black African | 5.0 | 28 |
| Black Other | 22.3 | 124 |
| Indian | 0.5 | 3 |
| Pakistani | 0.4 | 2 |
| Chinese | 0.4 | 2 |
| Other Asian | 0.7 | 4 |
| Other Other | 1.6 | 9 |
| Total | | 555 |
Table 6: Ethnic group of parent for all Black Other children living with lone parents: 1% Household SAR.
DESIGN FACTORS IN THE 2% INDIVIDUAL SAR FOR GB
We now have additional information on design factors in the Individual SAR that supplements that given in the second edition of the SARs User Guide.
Results from a second, more detailed method of calculation have just become available. This method of calculation is similar to that used to produce design factors for the Household SAR and is based on a comparison of individuals grouped into pairs of consecutive households, within counties. Individuals omitted within communal establishments are grouped into consecutive pairs within counties.
This method allows for clustering of individuals in the same household and also for most of the stratification present in the sample design. Since there are substantially more "degrees of freedom" involved in this method of estimating one may expect this to be a more precise method of calculation.
Generally, design factors for individual-level variables are lower than for individual-level analyses of the Household SAR and most range from 0.9 to 1.10. That for ethnic group has the largest values and those for the Individual and Household SARs are given below:
| Ethnic group | Household SAR | Individual SAR |
| White | 1.84 | 1.01 |
| Black Caribbean | 1.60 | 1.05 |
| Black African | 1.83 | 1.08 |
| Black Other | 1.51 | 1.07 |
| Indian | 1.99 | 1.05 |
| Pakistani | 2.27 | 1.11 |
| Bangladeshi | 2.37 | 1.18 |
| Chinese | 1.87 | 1.11 |
| Other Asian | 1.83 | 1.10 |
| Other Other | 1.60 | 1.07 |
Which set of design factors should be used?
Generally, the method described above will give more accurate design factors than the method used earlier. One of the reasons for the discrepancy between the two methods is that the method which compares values for SAR areas with those for the 100% Census data estimates the average design factor at the SAR area level whereas the new method estimates the design factor at the national level. The latter reflects not only the aspects of the sampling scheme taken account of in the former method but also the effect of stratification between SAR areas. The former method has the further limitation that it can only provide design factors for a subset of variables which have categories which are directly comparable with the 100% Census data.
A full set of design factors for each category of all the variables in the Household and Individual SAR is available from CMU, calculated using the "paired household" method. It is not possible to provide design factors below the level of Great Britain. However, it is evident that design factors at the level of the SAR area will vary from the national figures, depending upon the proportion of the population with the characteristic in question and its distribution within the population.
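For users unsure how to apply a design factor, the sketch below shows one common use. It assumes that a design factor is the multiplier that converts a simple-random-sampling standard error into one that allows for the sample design; the SARs User Guide should be consulted for the exact definition. The function names, the sample size and the proportion are hypothetical, and no finite population correction is applied.

```python
import math


def srs_standard_error(p, n):
    """Standard error of a proportion p from a simple random sample of n cases."""
    return math.sqrt(p * (1.0 - p) / n)


def design_adjusted_standard_error(p, n, design_factor):
    """Assumed usage: inflate the simple-random-sampling error by the design factor."""
    return design_factor * srs_standard_error(p, n)


def confidence_interval(p, n, design_factor, z=1.96):
    """Approximate 95% interval (z = 1.96) using the adjusted standard error."""
    se = design_adjusted_standard_error(p, n, design_factor)
    return p - z * se, p + z * se


# Hypothetical estimate: 0.3% of a sample of 100,000 individuals in one category.
p, n = 0.003, 100_000
household_se = design_adjusted_standard_error(p, n, 2.37)   # Household SAR, Bangladeshi
individual_se = design_adjusted_standard_error(p, n, 1.18)  # Individual SAR, Bangladeshi
print(household_se, individual_se)
print(confidence_interval(p, n, 2.37))
```

With the same hypothetical sample size, the Household SAR figure for this category gives a standard error about twice that of the Individual SAR, which is the practical consequence of the clustering of individuals within households.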
UPDATE ON DERIVED VARIABLES
Family Variables
The latest variables to be added to the Household SAR are family level variables similar to those previously created at the household level. These include variables such as the number of residents in the family who are unemployed and the ages of the youngest and oldest dependent children in the family.
Corrections
We have recently discovered errors in the calculation of variables DALLADLT (All adult households) and DALLPENS (All pensioner households). In both cases, households with zero residents and zero adults (or pensioners) were counted as 'All adult (or pensioner)' households. The errors have now been corrected on the Cray at MCC and we apologise for any inconvenience this has caused. Anyone wanting a copy of the amended algorithm should contact the CMU.
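The amended algorithm itself is available from the CMU and is not reproduced here. The following Python sketch is only an illustration of how this kind of error can arise, assuming the flag was derived by comparing two counts; the function names and the comparison are hypothetical.

```python
def all_adult_flag_uncorrected(residents, adults):
    """Flags a household as 'all adult' when every resident is an adult.

    With no residents the comparison is 0 == 0, so an empty household is
    wrongly flagged.
    """
    return adults == residents


def all_adult_flag_corrected(residents, adults):
    """Same comparison, but households with zero residents are never flagged."""
    return residents > 0 and adults == residents


print(all_adult_flag_uncorrected(0, 0))  # True: the error described above
print(all_adult_flag_corrected(0, 0))    # False
print(all_adult_flag_corrected(3, 3))    # True
```

The same guard would apply to an 'all pensioner' flag, with the count of pensioners in place of the count of adults.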
Distribution of Derived Variables
The derived variables are being distributed as 'Sets', each set being available as raw data on floppy disk or accessible for network transfer from the Cray at MCC. On the Cray the files are stored in the directory:
/db/census91/sars/derived-vlbs
in subdirectories (named set 1, set 2, etc.).
Documentation about algorithms and installation is stored in the same directory as the raw data files.
There is no charge for derived variables obtained by network transfer but a charge of £20 per set is being made for each set on floppy disk to cover the cost of the media.
The 'sets' of derived variables are made up as follows:
| Set | Variables | Individual/Household SAR |
| 1. | QSUBGRP Qualifications subgroups; OCCMAJOR SOC Major groups; OCCSUBMJ SOC Sub-major groups; DCOUNTY County | Individual |
| | QSUBGRP Qualifications subgroups; OCCMAJOR SOC Major groups; OCCSUBMJ SOC Sub-major groups; OCCMINOR SOC Minor groups; INDUSDIV Industry SIC divisions | Household |
| 2. | Household level variables | Household |
| 3. | POPWGHT Population weights | Individual |
| 4. | CAMSCORE Cambridge occupational score; OPPCAM Opposite sex Cambridge score | Individual |
| | CAMSCORE Cambridge occupational score; OPPCAM Opposite sex Cambridge score; GCLASS Goldthorpe class | Household |
| 5. | SOCCODE SOC Unit Groups; WESCLASS Class schema - Women & Employment Survey | Household |
| 6. | LSTAGE Life stages; HDEPTYPE Household Dependant Type; HHDCOMP Household composition | Household |
| 7. | Family level variables | Household |
| 8. | Variable derived from New Earnings Survey/ISCO 1981 International standard classifications of occupations | Not yet available |
VERSION NUMBERS OF SAR SYSTEM FILES
The SPSS and SAS system files on the Cray machine at MCC are labelled with a version number containing a decimal point e.g. Version 3.2. The number before the decimal point reflects the version number of the raw data. The number after the decimal point is increased by one every time the system file is changed. This usually occurs when new derived variables are added to the system file but may also occur if corrections need to be made to existing derived variables.
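Users who keep their own records of which system file an analysis was run on may find it helpful to split the label into its two parts. The sketch below is a small Python illustration of the numbering scheme just described; the class and function names are our own.

```python
from typing import NamedTuple


class SystemFileVersion(NamedTuple):
    raw_data: int   # number before the decimal point: version of the raw data
    revision: int   # number after the decimal point: change count of the system file


def parse_version(label: str) -> SystemFileVersion:
    """Parse a label such as 'Version 3.2' or '3.2' into its two components."""
    number = label.split()[-1]
    raw_data, revision = number.split(".")
    return SystemFileVersion(int(raw_data), int(revision))


current = parse_version("Version 3.2")
print(current.raw_data, current.revision)
print(parse_version("3.3") > current)
```

Treating the two parts as separate integers, rather than reading the label as a decimal number, keeps comparisons correct if the part after the point ever reaches two digits: 3.10 is then later than 3.9.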
From now on up-to-date information about changes to the SPSS system files will be available by issuing the SPSS command: DISPLAY DOCUMENTS.
Here is a summary of the changes so far:
Raw Data
Version 1 Data released from OPCS August 1993 with known errors in DISTMOVE, DISTWORK.
Version 2 Data released from OPCS January 1994. Variables DISTMOVE, DISTWORK are now correct. Some changes to variables TRANWORK, WORKPLCE to remove inconsistencies. This version was found to contain errors in variables QUALEVEL and QUALSUB because information on qualifications of visitors had inadvertently been omitted. Raw data files scrambled so that records are in a different order from Version 1.
Version 3 Data released from OPCS July 1994 in the same order as Version 2. Corrections have now been made to variables:
Household SAR - QUALEVEL, QUALSUB
Individual SAR - QUALEVEL, QUALSUB
Some changes have been made to DEPCHILD, PENSINHH, EARNERS to remove small inconsistencies.
Individual 2% SAR System Files
Version 2.1 March 1994
Derived variables added: QSUBGRP, OCCMAJOR, OCCSUBMJ, INDUSDIV, DCOUNTY
Also a case id variable ID was computed to enable updates to be made more easily.
Version 3.2 July 1994
Derived variables added: POPWGHT, CAMSCORE, OPPCAM
Household 1% SAR System Files
Version 2.1 March 1994
Derived variables added: QSUBGRP, OCCMAJOR, OCCSUBMJ, OCCMINOR, INDUSDIV
Household derived variables added: DHRESID, DHDEPCH, DHOLDDC, DHYNGDC, DHADULT, DHCHILD, DHPENSR, DHLTILL, DHEMP, DHECACT, DHUNEMP, DHRETIRE, DHPSICK, DHINACT, DHOTHER, DHSTUDS, DHDEPS, DHOLDDEP, DHYNGDEP, DALLSTUD, DALLPENS, DALLADLT, DHDECPOS, DHDAGE, DHDSEX, DHDCLASS
Version 3.2 July 1994
Derived variables added: POPWGHT, CAMSCORE, OPPCAM, GCLASS, SOCCODE, WESCLASS, LSTAGE, HDEPTYPE, HHDCOMP
For the SPSS system file a person level case id variable, HID, was computed to enable updates to be made more easily.
Version 3.3 November 1994
Family derived variables added: DFRESID, DFDEPCH, DFOLDDC, DFYNGDC, DFADULT, DFCHILD, DFPENSR, DFLTILL, DFEMP, DFECACT, DFUNEMP, DFRETIRE, DFPSICK, DFINACT, DFOTHER, DFSTUDS, DFDEPS, DFOLDDEP, DFYNGDEP, DFHECPOS, DFHAGE, DFHSEX, DFHCLASS
Corrections made to DALLPENS, DALLADLT so that households with zero residents are not included.
ADDING INFORMATION FROM THE NEW EARNINGS SURVEY
A preliminary set of variables giving the mean hourly income in 1991 at the level of minor SOC has been added to the SARs. The percentage standard error and the number of cases on which the earnings information is based have also been added. Earnings are broken down by sex, full-time/part-time working, broad age-group, and whether living in the south-east or elsewhere, as well as minor SOC.
CMU are conducting some tests to establish how best to make these data available. We are happy for others in the user community to help in this process. Anyone wishing to access the data should contact the CMU for further information. It is almost certain that further refinements will be made before these data are made generally available on the SARs.
THE NORTHERN IRELAND SARS
Elizabeth Middleton
The Northern Ireland SARs are now being distributed by the CMU, once institutions and individuals have completed the necessary licences. The data are also available on the Cray Computer at MCC. This article focuses on the content of the datasets and on their comparability with the GB SARs.
In the 1991 GB census, forms for England, Scotland and Wales differ in some small respects: in Wales there was a Welsh language question, in Scotland there was a Gaelic language question and also a question about the lowest floor level of accommodation. The census forms were all processed in the same way by the Office of Population Censuses and Surveys and the General Register Office for Scotland. However, Censuses of Population in Northern Ireland are taken under separate Northern Ireland legislation and carried out by the Census Office for Northern Ireland.
The statistical specification for the Northern Ireland SARs is as similar as possible to that of the GB SARs with due allowances for differences in census content. As for Great Britain, two files have been produced.
(i) The first is a 2% sample of individuals at private addresses and residents of communal establishments. The geographical scheme identifies ten areas, chosen so that the population in each area is greater than 120,000. The SAR areas are amalgamations of district Councils with similar
(ii) The second file is a 1% hierarchical sample of households and the individuals in those households. Here the geographical identifier is just that of Northern Ireland itself.
Sampling methods are similar to those used for the GB SARs but, while the GB samples are drawn from the 10% of census records which are fully coded, in Northern Ireland the samples are drawn from the 100% fully coded data.
As in GB, confidentiality issues are important in Northern Ireland. However, since there is no fine detail geographical information in the SARs, no additional measures to reduce the risk of disclosure were considered necessary. The same broad-banding of categories is used in both the Great Britain and Northern Ireland SARs.
Some of the differences between the GB and Northern Ireland SARs are a result of differences between the two censuses and fall into the following categories:
1 Differences in Questions on Census Form
The starting point for comparison of variables in the Northern Ireland and GB SARs is, of course, the census form. Questions on both forms are, in general, worded in the same way but there are a few exceptions. The wording of the Professional and Vocational Qualifications question is quite different. In Great Britain only qualifications normally obtained after the age of 18 were asked for, but in Northern Ireland the question was designed to obtain data on all levels of educational attainment and vocational qualification. The subject of the highest level qualification is recorded on the British form but not on the Northern Ireland form.
In Great Britain the ethnic group question was introduced for the first time in the 1991 Census. It was excluded from the Northern Ireland Census, but in keeping with long established practice the latter included a question on religion for answer on a voluntary basis. The Northern Ireland Census included some additional questions: a question on the number of children born alive in marriage, which was last asked in 1961 was reinstated; under the "economic activity" question a category was included to cover "unpaid work in a family business, including a shop or farm"; the amenities question was widened to include water supply and domestic sewage disposal. Also, and for the first time in a Northern Ireland Census, a question was included on knowledge of the Irish language. The wording of this question was identical to the wording of the Gaelic language question contained in the 1991 Census for Scotland.
2 Differences in Coding
Even when the wording of questions on the census form is the same, the way in which information is coded can vary. Information can be lost when being transferred from form to computer at the coding stage.
An example of this arises in the coding of the relationship to head of household question. While Great Britain codes this using 16 categories, Northern Ireland codes the answers to the relationship question using just 10 categories. As shown in Table 1, such relationships as "Cohabitant of son or daughter" or "Boarder/lodger" would be coded separately in Great Britain but would be classed as "Other unrelated" in Northern Ireland.
3 Differences in Definitions
The relationship to head of household is used to identify families within a household and differences exist between the census definitions of the family used in Northern Ireland and Great Britain.
In both censuses, a family unit has a maximum of two generations with the younger generation never married and having no partner or children. There is no age limit for a child. However, in Northern Ireland a cohabiting couple is not regarded as a family unit as it is in Great Britain. A household consisting of one cohabiting couple would therefore be classified in the Northern Ireland Census as a non-family household. If the couple had children the household would be classified as "lone parent with children, with others".
Again, the definition of a dependent child differs in the two censuses. In the GB Census a person aged 16-18, never married, in full time education and economically inactive is regarded as a dependent child. In the Northern Ireland Census, the equivalent age banding is 16-19. However, in the interests of harmonisation, the Census Office for Northern Ireland has changed the age banding to 16-18 for both the Northern Ireland SARs and the Northern Ireland Small Area Statistics.
Table 1. Differences in Coding
Relationship to Head of Household
| G.B. | N.I. |
| 0 Head of Household | 0 Head of Household |
| 1 Spouse | 1 Spouse |
| 2 Cohabitant | 2 Cohabitant |
| 3 Son/daughter | 3 Son/daughter |
| 4 Child of Cohabitant | |
| 5 Son/daughter-in-law | 8 Son/daughter-in-law |
| 6 Cohabitant of son/daughter | |
| 7 Parent | 4 Parent |
| 8 Parent-in-law | 7 Parent-in-law |
| 9 Brother/sister | 5 Brother/sister |
| 10 Brother/sister-in-law | |
| 11 Grandchild | 6 Grandchild |
| 12 Nephew/niece | |
| 13 Other related | 9 Other related |
| 14 Boarder, lodger etc | |
| 15 Joint head | |
| 16 Other unrelated | 10 Other unrelated |
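Users combining the two sets of SARs may want to recode the GB relationship variable onto the Northern Ireland categories. The sketch below is one possible starting point in Python and is not an official recode. It uses only the pairings shown in Table 1, plus the two cases mentioned in the text (cohabitant of son/daughter and boarder/lodger, both "Other unrelated" in Northern Ireland). The remaining GB codes have no documented equivalent here and are deliberately left unmapped; the names are our own.

```python
# GB code -> NI code, taken from the rows of Table 1 that have both entries.
GB_TO_NI = {
    0: 0,    # Head of Household
    1: 1,    # Spouse
    2: 2,    # Cohabitant
    3: 3,    # Son/daughter
    5: 8,    # Son/daughter-in-law
    7: 4,    # Parent
    8: 7,    # Parent-in-law
    9: 5,    # Brother/sister
    11: 6,   # Grandchild
    13: 9,   # Other related
    16: 10,  # Other unrelated
    # Stated in the text: classed as "Other unrelated" in Northern Ireland.
    6: 10,   # Cohabitant of son/daughter
    14: 10,  # Boarder, lodger etc
}


def recode_gb_to_ni(gb_code):
    """Return the NI code for a GB code, or None where no mapping is documented."""
    if gb_code not in range(17):
        raise ValueError(f"not a GB relationship code: {gb_code}")
    return GB_TO_NI.get(gb_code)


print(recode_gb_to_ni(7))    # 4
print(recode_gb_to_ni(14))   # 10
print(recode_gb_to_ni(12))   # None: Nephew/niece has no NI entry in Table 1
```

Returning None for the unmapped codes forces an explicit decision on how to treat them, rather than silently placing them in a residual category.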
4 Differences in Processing
4.1 Clerical/Computerised Allocation of Families
In Great Britain allocation of individuals to families is done using a complex computer algorithm whereas in Northern Ireland the number of families in a household is determined manually when coding from the census form.
The GB computer algorithm identifies sixty different family types and every individual in the household is allocated to a family with its particular family number and family type. The sixty family type categories are reduced to eight categories for the SARs, as shown in Table 2.
Table 2. Great Britain SARs Family Type
| 1. | Married Couple | - no children |
| 2. | | - dependent child(ren) |
| 3. | | - non-dependent child(ren) |
| 4. | Cohabiting Couple | - no children |
| 5. | | - dependent child(ren) |
| 6. | | - non-dependent child(ren) |
| 7. | Lone parent | - dependent child(ren) |
| 8. | | - non-dependent child(ren) |
It is worth noting that the computer algorithm cannot identify families amongst a group of household members unrelated to the head of household and in such cases the census forms must be processed manually.
In Northern Ireland, the number of families in a household is determined at the coding stage by examination of the census form. The household is classified as a non-family household or as one of the household types shown in Table 3. The Northern Ireland census database does not hold information about which family an individual belongs to.
Table 3. Northern Ireland SARs Household Family Type
| 1. | One-family household | - married couple, no children, no others |
| 2. | | - married couple, no children, with others |
| 3. | | - married couple, with children, no others |
| 4. | | - married couple, with children, with others |
| 5. | | - lone parent, with children, no others |
| 6. | | - lone parent, with children, with others |
| 7. | | - lone grandparent, with grandchildren, no others |
| 8. | | - lone grandparent, with grandchildren, with others |
| 9. | Two-family household | - related, no others |
| 10. | | - related, with others |
| 11. | | - not related, no others |
| 12. | | - not related, with others |
| 13. | Three or more family household | |
In the Northern Ireland individual sample, variables such as Social Class of Family Head and Economic Position of Family Head are not available. Instead similar information is available for the Head of Household.
4.2 Calculations of distance
The Great Britain SARs hold information about the distance travelled by people to work and also the distance of move of migrants.
Calculations of distance to work are carried out in different ways for the two censuses. The Central Postcode Directory used in Great Britain was not used by the Northern Ireland Census Office but distances were calculated using the Northern Ireland grid square reference marked on the respondent's census form and the grid square reference of the employer. The employer's grid reference is known only for large employers (employing over 25 persons) and so the Northern Ireland SARs hold distance to work for people working for large employers only.
It is not possible to include a variable for the Distance of Move of Migrants in the Northern Ireland SARs and the variable describing the area of usual residence of migrants applies only to migrants from outside Northern Ireland.
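To illustrate the grid-reference approach to distance to work described above, the following is a minimal sketch of a straight-line calculation. It assumes each grid reference has been converted to an (easting, northing) pair in metres; the function name and the coordinates are hypothetical, and this is not a description of the Census Office's actual procedure.

```python
import math


def grid_distance_km(home, work):
    """Straight-line distance in km between two (easting, northing) pairs in metres."""
    dx = work[0] - home[0]
    dy = work[1] - home[1]
    # Pythagoras on the grid offsets, then metres to kilometres.
    return math.hypot(dx, dy) / 1000.0


# Hypothetical grid references for a respondent's home and employer.
home = (333000, 374000)
work = (336000, 378000)
print(grid_distance_km(home, work))
```

Because grid square references identify a square rather than a point, a distance calculated in this way is only as precise as the size of the squares used.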
4.3 Clerical/Computerized Editing and Imputation Procedures
In Great Britain a computerized editing system is used to identify inconsistencies and missing values in the data and then to impute valid, consistent answers. In Northern Ireland, only clerical processing is used. Some inconsistencies are resolved manually at the coding stage and clerical procedures give rules for handling missing answers.
The differences between these two approaches can be seen by comparing the procedures for treating missing values for the number of cars in a household. In Great Britain analysis of previous census results shows that a good indication of the number of cars in a household is given by the number of people in the household, the tenure of the household, and whether the accommodation is in a permanent or nonpermanent building. Missing values are imputed by reference to these other factors. However, in Northern Ireland the missing value would be coded as zero.
This means that when making comparisons between Northern Ireland and Great Britain it is important to understand the effects of the different imputation procedures.
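The effect of the two treatments of a missing number of cars can be sketched as follows. This is an illustration only: the household records are invented, and the GB-style function uses a simplified stand-in (the most common answer among households with the same size, tenure and building type) rather than the actual census imputation system. All names are hypothetical.

```python
from collections import Counter

# Hypothetical household records; None marks a missing answer for cars.
households = [
    {"persons": 2, "tenure": "owner", "permanent": True, "cars": 1},
    {"persons": 2, "tenure": "owner", "permanent": True, "cars": 1},
    {"persons": 2, "tenure": "owner", "permanent": True, "cars": 2},
    {"persons": 2, "tenure": "owner", "permanent": True, "cars": None},
    {"persons": 1, "tenure": "renter", "permanent": True, "cars": 0},
    {"persons": 1, "tenure": "renter", "permanent": True, "cars": None},
]


def key(h):
    """The factors used to find similar households."""
    return (h["persons"], h["tenure"], h["permanent"])


def impute_by_reference(records):
    """GB-style sketch: fill a missing value from similar households."""
    donors = {}
    for h in records:
        if h["cars"] is not None:
            donors.setdefault(key(h), Counter())[h["cars"]] += 1
    result = []
    for h in records:
        cars = h["cars"]
        if cars is None:
            counts = donors.get(key(h))
            # Assumption: fall back to zero when no similar household exists.
            cars = counts.most_common(1)[0][0] if counts else 0
        result.append(cars)
    return result


def code_missing_as_zero(records):
    """NI-style sketch: a missing value is coded as zero."""
    return [0 if h["cars"] is None else h["cars"] for h in records]


gb_style = impute_by_reference(households)
ni_style = code_missing_as_zero(households)
print("Mean cars, imputed by reference:", sum(gb_style) / len(gb_style))
print("Mean cars, missing coded as zero:", sum(ni_style) / len(ni_style))
```

In this invented example, coding missing answers as zero gives a lower average number of cars per household than imputing by reference to similar households, which is the kind of difference to bear in mind when comparing the two countries.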
HARMONISATION OF UK SARS
Census reports for the whole of the United Kingdom have not been available in the past and so a harmonized United Kingdom SAR dataset would be extremely useful. The CMU will be producing a UK-wide SAR and this dataset will be available in addition to separate SARs for GB and NI. The UK SARs datasets will of course use the lowest common denominator for each variable and will therefore lose some of the more detailed information held in the separate country files.
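As a rough illustration of what a lowest common denominator means for a single variable, the sketch below collapses a more detailed coding in one file to the coarser categories shared with the other. The variable, its categories and the mappings are all hypothetical and are not taken from the SARs codebooks.

```python
# Hypothetical example: one file holds a detailed coding of a variable,
# the other holds only broad categories. Both are mapped to the broad set.
GB_TO_COMMON = {
    "owner occupied - owned outright": "owner occupied",
    "owner occupied - buying": "owner occupied",
    "rented privately": "rented",
    "rented from local authority": "rented",
}
NI_TO_COMMON = {
    "owner occupied": "owner occupied",
    "rented": "rented",
}


def harmonise(value, mapping):
    """Recode a country-specific value to the shared category."""
    return mapping[value]


gb_record = harmonise("owner occupied - buying", GB_TO_COMMON)
ni_record = harmonise("owner occupied", NI_TO_COMMON)
print(gb_record, "|", ni_record)
```

The distinction between owning outright and buying survives only in the separate file, which is the loss of detail referred to above.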
OBTAINING SARs DOCUMENTATION USING GOPHER
Anyone with JANET access can obtain SARs documentation. Simply point your gopher browser software to the gopher server cs6400.mcc.ac.uk; choose USAR, option 5, on the final screen.
Accessing The Codebook and Glossary Files
After logging on to the CS6400, type gopher at the $ prompt.
Select the following options, in the order given, to get information about the SARs:
3. Midas datasets service
1. Datasets information
4. UK Census of Population
3. 1991 Sample of Anonymised Records
6. Codebook and glossary files - This tells you that the codebook and glossaries are located in:
/db/census91/sars/codebook
Files for both NI and GB are in this directory.
Before copying the files into your own filespace it is recommended that you create a directory to copy them into.
Entering the following commands at the $ prompt will create a directory and copy the files into it:
mkdir codebook
cd codebook
cp /db/census91/sars/codebook/*.* .
INITIAL RESULTS FROM THE NORTHERN IRELAND SARs
One of the most widely used questions in the NI SARs is likely to be that on religion, and some initial exploratory analyses using this question are available. If you wish to receive a copy of the bar charts for 'Primary Economic Activity by Religion: Men' or 'Knowledge of Irish Language by Age Group by Religion', contact Margaret.Martin@man.ac.uk
WORKSHOPS FOR LOCAL AUTHORITIES
The Census Microdata Unit is running a programme of workshops on the analysis of the SARs and other microdata, aimed at the needs of Local Authorities. A three-part programme is planned beginning with basic analysis of microdata and moving through more complex analysis issues in workshops two and three.
WORKSHOP ONE: Basic Analysis of the SARs
Manchester: Tuesday 24th January 1995, Manchester Computing Centre (MCC)
London: Friday 27th January 1995, University of London Computing Centre
£50 per person per workshop (to include full documentation and lunch).
Each workshop will be supported by full documentation. Practicals will be based on SPSS but prior knowledge will not be assumed and principles will be applicable to other software packages.
A second workshop will be concerned with bivariate, simple/multiple regression and age standardisation techniques. Workshop three will feature cluster analysis, factor analysis and the production of social indicators. Speakers from Local Authorities with experience of using the SARs will be invited to introduce these workshops.
For further information please contact Tracey Schofield on 0161 275 4735.
Workshop on the Samples of Anonymised Records
PROGRAMME
The Structure and Content of the SARs
Exercises using Individual file
Accessing the SARs
Exercises using Household and Individual files
Numbers are limited, so early booking is advisable, especially for the Manchester workshop for which there is no charge. The cost of the ULCC workshop will be £35.00.
For further details and booking form please contact CMU on 0161-275 4721.
MIDAS NEWS
As part of the ESRC/JISC/DENI 1991 Census Programme, Small Area Statistics (SAS) have been produced for Northern Ireland for the first time. The Department of Geography in the School of Geosciences at Queen's University Belfast was funded under the Census Programme to assist the Census Office for Northern Ireland (CONI) to generate SAS for Northern Ireland. The aim was to make the Northern Ireland SAS as similar as possible to the SAS for Great Britain (GB) after taking into account differences in census questions, such as religion and fertility, and coding procedures.
The Northern Ireland SAS comprise 75 pre-defined tables which cover the full range of Northern Ireland Census topics and are available in a machine readable format for a wide variety of geographical areas down to Enumeration District (ED). Compared to Great Britain, there are fewer tables (75 as opposed to 86) and because all data were coded for the entire population, all tables are 100% tables. The Northern Ireland SAS contain tables relating to religion instead of ethnicity and also to the Irish language. For example, Table 9 provides a cross-tabulation of economic position by religion.
The Northern Ireland SAS are available for the following output areas: Enumeration Districts, Electoral Wards, District Councils, Parliamentary Constituencies, Health and Social Service Boards, Education and Library Boards, Postcode Sectors, Belfast Urban Area and Northern Ireland.
The Northern Ireland SAS can now be accessed on-line via MIDAS using the Census data extraction package SASPAC. Academic users wishing to access the Northern Ireland SAS will need to complete a separate individual registration form. The 1991 LBS/SAS Registration Pack produced by the Census Dissemination Unit (CDU) has been updated to include the Northern Ireland SAS registration form. Copies of the 1991 LBS/SAS Registration Pack and documentation relating to the Northern Ireland SAS (for example, table and cell numbering layouts) can be obtained by contacting the MIDAS helpline (Tel: 0161-275-6109; Email: info@midas.ac.uk). The 1991 Census section of the CS6400 gopher (gopher://midas.ac.uk) has also been updated to include information on the Northern Ireland SAS.
USAR
The PC version of USAR is now available. It consumes between 60 and 70 Megabytes of disk space. The mode of operation is similar to the Cray and Unix versions. There is no charge for the software and the USAR manual can be accessed using the gopher, as described earlier. For further information contact the CMU.
LBS/SAS for secondary areas in Great Britain
Since the last Bulletin, the CDU has received LBS/SAS for the following secondary areas in Scotland: Localities (SAS), Electoral wards (SAS), Inhabited Islands (SAS), New Towns (SAS), Regional Electoral Divisions (SAS), Health Board Areas (LBS & SAS). It is anticipated that by the time you read this article, the SAS for English Civil Parishes and Communities in Wales will also be available via MIDAS.
Special Workplace and Migration Statistics
The 1991 Special Workplace Statistics (SWS) and Special Migration Statistics (SMS) have undergone a series of quality assurance checks before being released to users. A project team under the direction of Professor Phil Rees (School of Geography, University of Leeds) were responsible for the quality assurance of the SMS. The SMS Set 1 passed all the checks, but missing flows were detected in SMS Set 2 which required the Census Offices to re-supply the data.
A similar project team directed by Dr Martin Frost (King's College London) detected a number of problems with the SWS. The most severe problem related to persons employed at home, or with workplaces that were not fixed or not stated, being incorrectly counted in flows into wards. The corrected version of the SWS is available via MIDAS.
At the time of writing, ESRC are concluding a contract with Quantime Limited for the conversion of the three interaction datasets in the 1991 SWS/SMS into Quanvert databases. It is anticipated that Quanvert, an interactive data analysis package, will provide an easy-to-use interface to the 1991 SWS/SMS. Although Quanvert offers limited statistical functions, it is possible to extract data subsets for analysis by statistical packages, such as SPSS and SAS. It is anticipated that it will be possible to access the SWS/SMS using Quanvert during the first quarter of 1995. Documentation on how to use Quanvert to access and manipulate the SWS/SMS will be produced and will be automatically distributed to all registered users of the data.
ANALYSING THE SARs USING NSDStat
NSDStat is a Norwegian-based package designed for analysis of survey data on PCs. It includes quick and easy tabulation with graphical display and mapping. We are now able to supply the SARs in an export form to read into NSDStat. The 2% Individual SAR takes up 75 megabytes of disk space using NSDStat and a simple tabulation of the whole of GB takes about two minutes on a 486 PC.
Further information on the package can be obtained from Eric Tanenbaum at the ESRC Data Archive, University of Essex.
NEW STAFF
There have been several changes in the personnel of the CMU since the last newsletter.
Sue Heath has left her post as Research Officer to become a Lecturer in the Department of Sociology at the University of Manchester. She maintains close links with the unit as an honorary associate.
Our new Research Officer is Clare Holdsworth, who has joined us from The University of Liverpool where she is completing a Ph.D. in the Department of Geography. Clare's research interests are in the area of occupational health, women's work and the family. In addition to providing SARs support to our users, she is working on an ESRC funded project investigating the impact of child bearing on women's occupational attainment among different ethnic groups.
Rachel Tye has moved to the Unit from the Department of Geography and is working with Steve Simpson on the Estimating with Confidence project. Her particular research interest is in the use of population information systems by local government.
Also new to the Unit is David Kynaston. David is with us for 12 months as a placement from his B.Sc. (Sociology) course at the University of Central England. In addition to providing general assistance, he will be working on projects concerned with social grade, the New Earnings Survey and converting digitised boundaries to SAR areas.
As part of the Mapping European Family and Household Patterns project funded by the European Unit, we are pleased to have Pau Miret working with us for six months until 30th April, 1995. Before joining the Unit, Pau was a research fellow at the Centre d'Estudis Demografics in Barcelona and he has recently completed a master's thesis entitled 'Recent Changes in the Process of Constitution of the Family in Spain, 1975-1990'.