Samples of Anonymised Records from the 1991 Census
CCSR Faculty of Social Sciences and Law University of Manchester Manchester M13 9PL
 
UPDATE ON 2001 SARS
ONS have not yet reached a decision over our request for SARs from the 2001 Census. We expect to hear by the end of March. Meanwhile, we are working on the assumption that our request will be accepted and that 2001 SARs will provide additional value to researchers and policy analysts. In particular, we expect that the Individual SAR will be able to support much more focussed research to meet the needs of central and local government.
New web-site for SARs
When the 1991 SARs arrived in summer 1993 the web was in its infancy and all our documentation on the SARs was designed as hard copy and then imported to the web as it developed. With the 2001 SARs we have the opportunity to ensure that our dissemination and user support takes full advantage of the web.
In November 2001 Sam Smith joined the SARs team and, with Jo Wathan, Ed Fieldhouse and Mark Brown, has been working hard on implementing a new design.
Key features of the design are:
The web pages will also link to teaching and learning materials for the SARs.
Registration for the SARs
Registration for the 1991 SARs was acknowledged by all concerned to be tedious and off-putting. Everyone is committed to providing a much more streamlined service for 2001 and, retrospectively, for 1991 SARs. The Data Archive has been contracted to provide a 'one-stop shop' for registration of all census products and this should be operational by September 2002. Users should be able to register on-line without the need for forms to be counter-signed by site reps or course tutors. Teachers have also made clear the importance of on-line registration for students. To help inform developments the Data Archive have a web-based questionnaire. It is designed to be completed by university representatives of Census data services, including the SARs. We should like to encourage all SARs representatives to visit the site, as well as all teachers who envisage using census data. Your comments will be most welcome.
The registration system being developed by the Data Archive will be based on Athens authentication, which is widely used across the higher education sector for access to many databases.
For non-academic users we envisage that this will be adapted to allow registered users to download SARs from the CCSR website. We will also be able to supply SARs on CDROM where this is preferred.
Costs for non-academic users
The final decisions over costs of the SARs have not yet been made. They are consequent upon further discussions between ONS and ESRC and final agreement over the specification of the 2001 SARs.
We recognise that this causes difficulties for organisations that need to bid for money to purchase the SARs. Our advice is that the 2001 SARs will not cost more than:
£500 per file for central and local government
£1000 per file for the business sector
Organisations that purchase the data will be provided with full supporting documentation, all derived variables and a free place on a SARs training course.
What have we learned from the SARs?
On 16 November 2001, we held a conference at the University of Manchester at which we heard about some of the most innovative uses of the SARs. One of the papers, by Malcolm Macourt, is reproduced in full. Below, Yaojun Li has compiled a summary of the other presentations. All papers based on the SARs are listed in SAR publications.
The conference was designed to showcase some of the most important and innovative research findings from the SARs. Researchers from Manchester and many other universities took part in the conference to exchange ideas and research findings.
Mike Coombes and Simon Raybould from the University of Newcastle gave a talk on 'Measuring disadvantage and difference: The role of housing in disadvantage and ill-health'. They examined the ways in which housing interacts with other socio-economic factors to create distinct forms of multiple deprivation in Britain today. They showed that the SARs can be used to develop new approaches to measuring multiple deprivation. The results provide insights which cannot be obtained from other Census datasets or from survey data in terms of understanding the links between ill-health and housing.
Paul Boyle's talk (School of Geography and Geosciences, University of St Andrews) was entitled 'Which influences health most - country of birth or country of residence? a British analysis using individual-level data'. The rates of limiting long-term illness, as determined from the 1991 Census, are higher in Wales and Scotland than in England. The question is whether these differences are consistent when the socio-economic and demographic characteristics of the residents in each of these countries are controlled for, hence whether country of birth is more important in explaining limiting long-term illness than country of residence, once individual characteristics have been held constant. He then compares the rates of limiting long-term illness between those who are born in one country but reside in another with those who were born in, and who reside in, the same country, again controlling for individual circumstances. A number of health-related issues are raised by work of this type, including the roles of genetics, environment, and cultural attitudes to illness.
Edward Fieldhouse and Mark Tranmer's paper on 'Labour market and neighbourhood variations in male unemployment risk using census microdata' used multilevel modelling techniques with the area classification on the Individual SAR to investigate geographical differences in unemployment. Previous research has indicated that locality affects people's risk of unemployment. However, the understanding of the relationships is confounded by the reciprocal nature of the relationship between unemployment, housing and geographical location. The authors examined the relative importance of individual characteristics, neighbourhood types and the local labour market in explaining variations in unemployment risk. They also examined the role of housing tenure at individual and contextual levels in mediating this relationship. Two competing hypotheses were evaluated. The first is that local concentrations of unemployment are the result of the process of neighbourhood selection. The second suggests that there are contextual effects on unemployment risk, which may include access to job opportunities and 'concentration effects'. The study concluded that most neighbourhood level variation in unemployment is due to housing market effects, particularly neighbourhood selection. As well as offering insights into the relationship between unemployment and geographical location, the research demonstrated methodological innovations in the analysis of census microdata. In particular it showed how area classifications can be used in conjunction with microdata in a multilevel modelling framework, to get a better understanding of the role of individual and contextual factors in social processes.
Mark Tranmer also gave a short introduction to an ESRC-funded project on 'Combining aggregate and micro-data to extend census tables for local area use'. Even though aggregate census tabular output provides population level information for identified geographical areas ranging from local authority districts to ward-level and even down to output area level, the very fact that the data represent population counts for very small areas means that there are constraints on the amount of information that can be released in any given table. If tables contain more than three or four dimensions, there is a danger that small cell sizes may lead to confidentiality risks. Tranmer outlined some methods whereby information from the samples of microdata can be combined with the aggregate statistics to extend standard tables from the census. The methodology can also improve the accuracy and precision of estimates made at the local level. Both logistic and multilevel-logistic models were presented in theoretical terms, and some preliminary empirical results given.
Paul Williamson from the University of Liverpool gave a paper on 'Extending the outputs from the 2001 Census: Taking account of place when imputing income in the SARs'. He started by noting the conspicuous lack of income information in the SARs. Attempts to overcome this problem, involving either proxy variables or imputation based on occupation, have paid little or no attention to the spatial variability of income below a regional level. He has conducted an analysis of Census Rehearsal data, collected in 1999, and found that place does have a role to play in determining individual and household incomes. Looking forwards to the 2001 Census, Williamson offered some guidelines on the best strategies for imputing income at a (near) ward level, anticipating the release of a third SAR: the spatially detailed SAM.
Dimitris Ballas and colleagues from the University of Leeds gave a paper on 'Spatial microsimulation approaches to combining the SARs with other Census outputs and surveys microdata.' They showed how the SARs can be combined with other datasets in an object-oriented spatial microsimulation context to produce spatially disaggregated population microdata. In particular, they described different static microsimulation approaches to combining the SARs with data from the small area statistics (SAS). These include Iterative Proportional Fitting (IPF) based static microsimulation techniques which aim at spatially disaggregating the SARs at the small area level, a Simulated Annealing (SA) based approach to reweighting the SARs in order to estimate small area microdata, and a new dynamic microsimulation framework which aims at combining data from the SARs, the SAS and the British Household Panel Survey (BHPS) in order to dynamically simulate urban and regional populations. They outlined the aims and objectives of a dynamic spatial microsimulation model for the city of York, based on the new framework to simulate dynamically the population of York under different policy scenarios. This model involved the generation of a spatially disaggregated micro-database for York at the enumeration district and postcode level. The database will be further enriched and projected into the future with the use of BHPS data. Thus, microsimulated 1991 SAR households will be linked to their closest BHPS counterparts and will be projected into 2001 and beyond. They also considered other methodologies (e.g. probabilistic event modelling, agent-based approaches etc.) to update the simulated micro-population. Finally, the difficulties in calibrating and validating this kind of modelling exercise were highlighted and ways to tackle them explored.
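For readers unfamiliar with Iterative Proportional Fitting, the following is a minimal sketch of the general technique in Python, not the authors' implementation. It assumes a two-way cross-tabulation taken from sample microdata (the seed) and small-area margins taken from aggregate statistics; the function name `ipf` and all the figures are invented for illustration.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Fit a 2-D table to given row and column totals by IPF.

    seed        -- cross-tabulation from sample microdata (all cells > 0)
    row_targets -- known small-area totals for the row variable
    col_targets -- known small-area totals for the column variable
    """
    table = [[float(v) for v in row] for row in seed]
    for _ in range(max_iter):
        # Scale each row so that it sums to its target.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            table[i] = [v * target / total for v in table[i]]
        # Scale each column so that it sums to its target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            for row in table:
                row[j] *= target / total
        # Columns now match exactly; stop once the rows also match.
        worst = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if worst < tol:
            break
    return table


# Invented example: a 2 x 2 sample table adjusted to one area's margins.
seed = [[40, 10], [20, 30]]
row_targets = [60, 40]   # hypothetical area totals for the row variable
col_targets = [70, 30]   # hypothetical area totals for the column variable
fitted = ipf(seed, row_targets, col_targets)
```

The fitted table reproduces both sets of area totals while keeping the association between the two variables (the cross-product ratio) found in the sample. This is why a sample cross-tabulation can be borrowed to fill in detail that the published small-area tables do not provide.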
Finally, Seraphim Alvanides and his colleagues from the Universities of Newcastle and Leeds gave a talk on 'Modelling the geographical location of synthetic households'. Spatial microsimulation produces estimates of populations and household microdata at a very fine scale (such as ED level) using a variety of methodologies and data sets (e.g. from the SARs or surveys). Although microsimulated households can be located in attribute space and within some approximation to geographical space, it would be interesting to be able to populate the small areas at a finer scale. It is possible to obtain the number of households with specific microsimulated characteristics. Ballas et al. (2000) discuss the possibilities and the value of combining remotely sensed data with spatial microsimulation techniques for the generation of population microdata. In this paper they explore alternative methodologies for achieving this aim, using widely available geographical data at very fine scales. In particular, they explore the possibility of linking microsimulated households to address and postcode data, which are extremely accurate for identifying geographical locations. The limitation of very detailed datasets such as address-point data is their lack of descriptive characteristics. Further, they discuss the potential of combining a microsimulated spatial database with Ordnance Survey's Land-Line dataset. The latter has the advantages of address-point data, but provides additional descriptive information related to the type of the house. Finally, they investigated the potential of combining spatial microsimulation data with postcode data, at a coarser level of locational information, combined with geodemographic classifications to provide household attributes. In the context of all the suggested data combinations, there is a challenge to formulate a strategy for populating the properties (sub-ED level) with households derived from the microsimulation process (aggregated at the ED level).
Although desirable, it might not be necessary to go down to actual addresses or delivery points. Moreover, it might be sufficient for the purpose of this research to simply create likelihood surfaces of household characteristics using the residential properties (from Land-Line or Address-point) as a starting point to create a raster representation of populated areas. This may employ Martin's (1989) Population Surface methodology using multiple points (from address-point or Land-Line) as opposed to a single polygon centroid (originating from the ED). In addition to the multiple point origins, the surface can be much more detailed using 50m, instead of 200m grid squares or even hexagons. Once the surface is created it will be possible to superimpose actual locations, in the form of addresses or full postcodes to obtain more detailed georeferencing.
The Escapers: The 'Nones' in Northern Ireland Religion
Malcolm P A Macourt, University of Northumbria
Combating illegal discrimination on grounds of gender, religion, race or ethnic background requires baseline data. That data is only useful if as many individuals as possible identify with one category or another. What happens when a sizeable proportion of the population refuses to provide relevant information, or even refuses to accept the relevance of the categories? This issue haunts those concerned with questions of ethnicity in Great Britain.
In Northern Ireland this issue has already had to be faced in relation to 'religion' (Macourt 1978, 1995; Southworth 1998). Whereas as recently as the 1951 census, all but 0.4% gave a 'Catholic' or 'Protestant' answer, by 1991 11% of the population had 'escaped' the community divide by answering 'NONE' or by not answering the religion question. Using data both from the 2% Individual and 1% Household Sample of Anonymised Records (SARs) and from the 3,729 enumeration districts identified in the Small Area Statistics (SAS), this research seeks to get close to those escapers. [Of those who gave a 'religious' answer, 43.15% were 'catholic' and 56.85% 'protestant'.]
In 1991 features of the NONEs included:
The 2001 Census has sought to alleviate the problem of non-response by introducing a question on community background ('What Religion… were you brought up in?') for those who claimed to have no religion. This research shows how the 1991 data can be explored to make comparison with the 2001 data easier, by benefiting from the social, economic, linguistic and cultural geography of Northern Ireland.
'Religious' residential segregation, particularly in urban working class areas, is a key feature of the socio-political geography of Northern Ireland (Adair 2000, Anderson 1998, Doherty 1997): over one third of all Enumeration Districts were more than 95% 'catholic' or more than 95% 'protestant'. Of these segregated EDs, catholic EDs had a median NONEs of 0.59% whereas the median NONEs for protestant EDs was 5.39%.
The SAS was used to 'allocate' NONEs to 'catholic' or 'protestant' within each enumeration district (median population 423) using only the problematic assumption of equal probabilities of recorded 'religion' (McCall 1999). The outcome for NONEs was 24.22% 'Catholic' and 75.78% 'protestant', rather different from 43.15%/56.85%; furthermore it appears that 8.76% of 'catholics' did not record religion, compared with 12.78% of 'protestants'.
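One reading of this allocation step can be sketched in a few lines of Python. The sketch assumes that, within each enumeration district, NONEs are shared out in proportion to the 'catholic' and 'protestant' counts recorded there. This is an interpretation of the 'equal probabilities' assumption rather than the author's documented procedure, and the function name and the three toy districts are invented.

```python
def allocate_nones(eds):
    """Share out each ED's NONEs in proportion to its recorded answers.

    eds -- list of dicts with counts under 'catholic', 'protestant', 'none'
    Returns the allocated (catholic, protestant) totals across all EDs.
    """
    alloc_catholic = 0.0
    alloc_protestant = 0.0
    for ed in eds:
        stated = ed["catholic"] + ed["protestant"]
        if stated == 0:
            continue  # no recorded answers to base an allocation on
        share_catholic = ed["catholic"] / stated
        alloc_catholic += ed["none"] * share_catholic
        alloc_protestant += ed["none"] * (1 - share_catholic)
    return alloc_catholic, alloc_protestant


# Invented districts: two fully segregated, one evenly mixed.
toy_eds = [
    {"catholic": 400, "protestant": 0, "none": 4},
    {"catholic": 0, "protestant": 380, "none": 20},
    {"catholic": 200, "protestant": 200, "none": 40},
]
cath, prot = allocate_nones(toy_eds)
catholic_share_of_nones = cath / (cath + prot)
```

In this toy data the allocated 'catholic' share of NONEs (37.5%) is lower than the 'catholic' share of recorded answers (about 51%), because the NONEs are concentrated in the 'protestant' and mixed districts. The same mechanism would produce the kind of gap reported above.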
Reported knowledge of the Irish Language relates to an important feature of the cultural and educational history of Northern Ireland. Using the 2% individual SAR, the 'language question' was compared with the 'religion question' with striking results: 21.75% of 'catholics' (aged 18+) reported some knowledge of the Irish language compared with 0.72% of 'protestants'. Comparing only those born in Northern Ireland, the figures were just over 20% of catholics and less than ½% of protestants. This reflects the almost universal presence of Irish in the syllabus of schools controlled by the Roman Catholic church, and its absence from the syllabus of state schools. The small 'integrated schools' sector is of recent origin.
There was a noticeable difference by occupation in knowledge of the Irish language reported by 'catholics' (O'Reilly 1999). Using the Registrar General's Social Class, of the employed population aged 16 - 64 with adequately described current or recent civilian occupations, 35.3% of 'catholics' in Class I and II knew Irish, but only 15.7% in Class IV and V. This difference is far less marked among NONEs, 6.9% and 5.2%.
Taking account of occupation and the pattern of knowledge of Irish, a detailed investigation was undertaken of the 1233 Enumeration Districts which comprise the continuous urban area around Belfast. Comparisons were drawn between the segregated EDs (159 with more than 95% 'catholic' and 432 with more than 95% 'protestant') and 177 'mixed' EDs where neither identity comprised more than 75% of those who gave a 'religious' response. This investigation demonstrated that:
The general conclusion was that almost all of those from a catholic background who answered NONE did so living outside the exclusively catholic areas - or, in the terms of the title of this paper, they had 'escaped' before they answered 'NONE'. On the other hand, very many from a protestant background who answered NONE did so from within exclusively protestant areas - the 'escape' factor was far less marked.
The age-old exchange:
'Are you a 'catholic' or a 'protestant'?', 'Neither, I am a Buddhist',
'Yes, but are you a 'protestant' Buddhist or a 'catholic' Buddhist?'
may not yet have become redundant, but this research has demonstrated that there have been ways of distinguishing 'catholic' NONEs from 'protestant' NONEs.
Post-graduate study at CCSR
CCSR invites applications for its post-graduate training programme:
The University of Manchester is offering 5 one-year scholarships consisting of a fee waiver and a maintenance payment of £8,300 for post-graduate study in inter-disciplinary topics using large-scale datasets. Candidates must also be applicants to ESRC for 1+3 funding and, if unsuccessful, undertake to apply again for 3-year funding.
For further details and who to contact, please check http://www.ccsr.ac.uk/postgrad.htm.
Short courses
We run a programme of short courses (from 1-day to 3-days) throughout the academic year. These are available to students at the University of Manchester and also to external delegates. A booklet containing a full list of short courses for 2002 - 2003 is included with this Newsletter. For a list of the remaining courses for 2002 and a full list for 2002/3, see our web site: http://www.ccsr.ac.uk/courses.htm.
Courses on the SARs
To provide information on the 2001 SARs and their research potential we will be running a series of courses and workshops, starting in June 2003. An outline of our plans is shown below, but we would be very pleased to hear your views on what would be most helpful.
Our current programme of short courses already provides tools for analysing the 1991 SARs:
We would be interested to hear from you at ccsr@man.ac.uk about additional courses or types of training that would be helpful.
All workshops and training sessions have to be run on a self-financing basis as our ESRC/JISC grant does not cover them. However, we will endeavour to keep costs for introductory SARs courses as low as possible and will be providing free places to non-academic users who purchase the data.
Derived variables for 2001 SARs
We plan to produce a similar set of derived variables for 2001 SARs as for 1991. Offers from some users to generate derived variables from the 1991 SARs have already been received and we are very grateful to them.
In addition, new variables in the 2001 Census provide some new opportunities, outlined below.
South Asian ethno-religious identity
An important motivation for the inclusion of a religion question in the 2001 Census was the recognition that, for those of South Asian origin in particular, it represented a crucial aspect of individual and group identity not captured in the ethnic group classification. This was most notably the case with an Indian category that combined Hindus, Sikhs, Muslims and other religious minorities with highly contrasting cultural backgrounds. Analysis of survey data (notably the 1994 National Survey of Ethnic Minorities) has shown that as well as representing an important marker of self and group identity, the differentiation of the Indian population by religion reveals some striking differences in economic and social profiles (e.g. Brown, M. (2000) "Religion and economic activity in the South Asian population," Ethnic and Racial Studies 23, No. 6). For example in the area of economic activity, there are important differences between Hindus and Sikhs, notably in job type and levels of unemployment. While lower reported levels of female participation in the labour market among Indian Muslims are in line with expectations, there are many examples where Indian Muslims (many of East African origin) differ strongly from Pakistani and Bangladeshi Muslims. A SAR variable of ethno-religious identity will facilitate a much better understanding of the extent and nature of diversity in the South Asian population, which is poorly captured by either ethnic group or religion alone. It will be a useful variable for academic researchers and for those in policy research.
| Indian Hindu | Pakistani Muslim |
| Indian Sikh | Pakistani Other |
| Indian Muslim | Bangladeshi Muslim |
| Indian Christian | Bangladeshi Other |
| Indian Other | |
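As an illustration only, a derived variable of this kind could be built from the ethnic group and religion variables along the following lines. The Python sketch below uses hypothetical category labels and a hypothetical function name; the actual coding of the 2001 SARs has not been finalised, and a residual 'Other' category for Bangladeshis is assumed here by analogy with the Pakistani one.

```python
# Religions that get their own category within each South Asian ethnic group.
# Labels are illustrative; the 'Other' category for Bangladeshis is an assumption.
NAMED_RELIGIONS = {
    "Indian": {"Hindu", "Sikh", "Muslim", "Christian"},
    "Pakistani": {"Muslim"},
    "Bangladeshi": {"Muslim"},
}


def ethno_religious(ethnic_group, religion):
    """Combine ethnic group and religion into one ethno-religious category.

    Returns None for ethnic groups outside the South Asian classification.
    """
    named = NAMED_RELIGIONS.get(ethnic_group)
    if named is None:
        return None
    if religion in named:
        return f"{ethnic_group} {religion}"
    # Any other answer, including no religion, falls into the residual group.
    return f"{ethnic_group} Other"


examples = [
    ethno_religious("Indian", "Sikh"),
    ethno_religious("Pakistani", "Hindu"),
    ethno_religious("White", "Christian"),
]
```

How non-response to the religion question should be treated (grouped with 'Other', as here, or kept as a separate category) is a design choice on which we would welcome views.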
Measures of social class
The 2001 SARs will contain comparable detail on occupation, coded to SOC2000, as for 1991. They will also have the new National Statistics Socio-economic Classification (NS-SEC). In addition we plan to derive a set of alternative measures of social class. These will include RGClass and SEG to provide backwards compatibility with 1991, and international measures such as ISCO and ISEI.
Suggestions for additional variables should be made to Yaojun Li.
Area classification for the 2001 SARs
Area classification information was added to the 1991 SARs to enhance geographical research using the data. Users are provided with a descriptor of the local area in which a sampled individual lives, in addition to the much broader area of residence identifier. The 1991 Individual SAR has 'ED type' information based on 'GB Profiles' whilst the 1991 Household SAR includes the ONS (now National Statistics) ward classification. For the 2001 SARs we hope to provide a similar classification but with a few differences. These differences reflect improved availability of small area data outside of the Census (which can be combined with SAR data) together with our experience of user requirements and the constraints imposed by confidentiality issues. The proposed format of the classification is described below and we welcome comments and feedback on these proposals.
It is envisaged that the area classification will be made available in a second edition of the SARs. To comment on these proposals please e-mail Ed Fieldhouse or fill in the consultation form on our website.
SARs Hotline
Since Autumn 2001, SARs queries have been logged and monitored on a helpdesk system. Email queries should be sent to sars-helpdesk@man.ac.uk to ensure the fastest response. The SARs helpdesk can also be reached by telephone on +44 (0) 161 275 4735.
Workshops on the SARs and other 2001 census outputs
Two workshops are planned for the following dates and locations. A generic format and content is given below. There will, however, be differences in the content and speakers for each workshop. Please check our web site for latest information. Attendance is FREE but booking is necessary. Please use the on-line form at www.ccsr.ac.uk.
| Manchester | 12 June 2002 | 1.00 pm - 4.00 pm |
| London | 14 June 2002 | 11.00 am - 3.00 pm |
Speakers include Chris Denham, Office for National Statistics, Angela Dale, CCSR, Keith Cole, MIMAS, John Stillwell, University of Leeds
Teaching and learning materials for the SARs
Report on a Consultation Workshop (CHCC project)
31 January 2002 at Manchester Computing, University of Manchester
CCSR is a partner in a project funded by the Joint Information Systems Committee (JISC) to deliver census-based learning and teaching to the UK Higher Education sector. We are developing teaching and learning materials specifically for the SARs as well as contributing to a set of inter-disciplinary units that will draw on the full spectrum of census data.
The development of SAR teaching materials is now well under way and we have a number of pilot units ready for testing. Their design and content has been informed by active consultation with potential users of the resource. This began with a major workshop in London last year, which gave the clear message that materials needed to be highly flexible in content and format to meet the different needs of teachers and learners.
Exactly one year on, the project has run a second workshop in Manchester to determine whether we are on the right track. Are the materials we are developing actually what people will want to use?
The day was built around two practical sessions in which participants were given hands-on experience of the pilot units developed for SARs (developed by CCSR) and those based on the census area statistics (developed by the University of Leeds).
The SARs team offered participants two methodological units (one on using graphics, one on tables) and two units of a more substantive nature (one on limiting long-term illness, one on women and employment). These adopt a flexible format that makes them suitable for classroom teaching (materials can be easily downloaded in a range of formats and customised by teachers) and self-study (materials can be worked through on-line with an interactive interface to a presentation, detailed background notes and exercises drawing on real SAR data). The sessions proved extremely useful. Many commented positively on the value of the exemplars and exercises to teachers working in both FE and HE. There was strong approval of NESTAR (software developed by the Data Archive which allows on-line data exploration of the SARs and other survey data) as a user-friendly alternative to SPSS for accessing SAR data. Again the message was that, to be useful, the materials had to be flexible and easily customised according to users' particular needs. We are now looking to incorporate all the feedback from the workshop to inform further development of the materials.
A chance to get involved
The success of the project depends on the delivery of materials that meet the real needs of teachers and students. It is therefore essential that we expose these prototype units as widely as possible to the scrutiny of potential users. If you are involved in teaching in the FE or HE sector we would greatly value your feedback. Specifically we would be grateful for volunteers who are willing to pilot and evaluate one or more of our units in their own teaching. We would also welcome expressions of interest from people able to contribute to the authoring of the subject specific units.
We will continue to revise materials in the light of all feedback we receive so this is a chance to actively inform the development of a major new teaching and learning resource.
For further information visit the project website
or contact Mark Brown, 0161 275 4780
CCSR project team: Dr Mark Brown; Dr Jo Wathan; Professor Angela Dale
Other project partners are: MIMAS and the Census Dissemination Unit, the University of Manchester; The School of Geography, University of Leeds; The History Data Service, The Data Archive, University of Essex; The LTSN Centre for History, Archaeology and Classical Studies and University of Glasgow.
CCSR SEMINARS
Tuesday 19 March
Mobility & friendship patterns over 3 decades in UK
Yaojun Li, CCSR, University of Manchester
Tuesday 16 April
Patterns of Class Inequality and Social Mobility amongst Jews and Arabs in Israel.
Nabil Khattab, CCSR, University of Manchester
Tuesday 30 April
Voter engagement at the 2001 General Election: a study of young people and ethnic minorities
Ed Fieldhouse & Kingsley Purdam, CCSR, Andrew Russell, Department of Government
Virinder Kalra, Department of Sociology, University of Manchester
Tuesday 14 May
Racialised Territories in Northern Towns
Debbie Phillips, Geography, University of Leeds
Tuesday 21 May
Privacy and Anonymised Data
Cate Heeney, CCSR, University of Manchester
Tuesday 28 May
Women in Maths and Science: evidence from the National Child Development Study
Shu-Li Cheng, CCSR, University of Manchester
Tuesday 11 June
Networks, recruitment and equal opportunities in the UK television industry
Val Antcliff, CCSR, University of Manchester
INTERDISCIPLINARY PERSPECTIVES ON ANALYSING THE LIFE COURSE
A series of six ESRC-funded seminars focusing on the theory, methods and practice which bring together perspectives across the range of relevant disciplines.
Seminar 5
Friday 28 June 2002, King's College, Cambridge
Comparative Perspectives
Speakers will include:
Professor Karl Ulrich Mayer, Max Planck Institute for Human Development, Berlin
Professor Jonathan Gershuny, Institute for Social and Economic Research, University of Essex
Professor Richard Smith, University of Cambridge
To reserve a place on this free seminar, please complete the on-line booking form.