SARS NEWSLETTER

NO. 16 - SEPTEMBER, 2001

Samples of Anonymised Records from the 1991 Census

CCSR Faculty of Social Sciences and Law University of Manchester Manchester M13 9PL

 

CCSR to provide dissemination and support for Samples of Microdata from the 1991 and 2001 Census

CCSR has been awarded ESRC/JISC funding to support and disseminate the 1991 and 2001 Samples of Anonymised Records. This newsletter sets out our plans, beginning with the timetable for the 2001 SARs.

Timetable for delivery of the 2001 SARs

October 2001: By the end of October, we expect to have received ONS's response to our request to expand the Individual SAR to a 3% sample and reduce the population threshold to about 70K. ONS will also respond to proposals for variations for particular variables.

November 2001: Further feedback and consultation with the academic community will be launched at the SARs Conference on 16 November in Manchester - see page 6.

Autumn 2002: By this date, we should have a final specification of SARs agreed with ONS and production of the SARs 2001 will begin.

Summer 2003: ONS are working to a timetable that delivers the SARs by summer 2003. Upon delivery, a rapid programme of checking and validation will be conducted by CCSR. The SAR files will be available to users within two months of delivery by ONS of an acceptable version. Dissemination of the SARs will be largely web-based - see page 2.

SARs Output Working Group: A meeting of the SARs Output Working Group took place in May. Papers for the meeting and the minutes from the meeting are on the CCSR web site. Further details of plans for 2001 SARs and issues which remain to be resolved are also on the SARs 2001 web pages.

Introducing the SARs 2001 team

Angela Dale, Ed Fieldhouse and Mark Brown will take responsibility for the overall direction of the SARs programme. The day-to-day work to support and develop the SARs will be undertaken by Jo Wathan and Yaojun Li, who will both work half-time on the SARs and half-time on other projects. They have a great deal of experience of large scale datasets, including the GHS, LFS, and the SARs.

Jo was awarded a PhD from The University of Manchester in May 2000 and since then has worked with Clare Holdsworth and Rachel Leeser on an ESRC-funded project to develop Alternative Household Classifications for the 2001 Census. She is currently dividing her time between the SARs and the development of teaching materials for the CHCC - see page 7. Jo will also be running some short courses in data analysis as well as contributing generally to the SARs training programme.

Yaojun Li joined the SARs programme last year from a research post at Edinburgh University. Alongside his half-time appointment to the SARs, he has just been awarded an ESRC grant (two days a week for two years) to develop a measure of social capital from the British Household Panel Survey. He is also developing teaching materials for STATA and will be running some short courses in STATA as well as contributing to the SARs training programme.

Sam Smith has just been appointed to the SARs team as a web interface developer. He is a crucial member of the team and will also be responsible for the CHCC web development. Sam has a degree in computer science and a great deal of experience of developing web sites and user interfaces.

Ruth Durrell and Margaret Martin will continue to provide front-line advice and support. They will also be organising seminars and workshops and ensuring that you get the help you need - as well as chasing you for information on your publications!

Help to users

From October, SARs queries will be logged and monitored on a helpdesk system. Email queries should now be sent to sars-helpdesk@man.ac.uk to ensure the fastest possible response. A dedicated telephone helpline is available on 0161 275 4735 from 9.00 - 5.00, Monday to Friday.

Registration

It is recognised by everyone that registration for census products must be much simpler than was the case for 1991. Registration for the SARs - and any other census products - will be the responsibility of the UK Data Archive. We expect that a single registration will apply to all census outputs that require registration. It will be implemented via an Athens-compliant user name and password and will apply to both 1991 and 2001 SARs.

Dissemination of the SARs

The SARs will continue to be free at the point of use for academics. However, as with the 1991 SARs, data will need to be purchased for non-academic use (e.g. for use by government, local authorities, health authorities, commercial organisations). It is important to emphasise that academics who use the SARs for research funded by any body other than the Research Councils or HEFCE (for example DfES or a local authority) have to pay for them. The SARs do not fall within the Census Access Project.

The primary dissemination route for academic users will be via the web, although the SARs will still be obtainable on CD-ROM and also on-line from MIMAS.

The resource discovery route to the SARs

There are various ways in which users will be able to access the SARs on the web. One important route will be via the census portal being developed by the Data Archive as part of the Contemporary and Historical Census Collection (CHCC) of which CCSR is a partner (see page 7 for details). Users will be able to move from this portal to an interactive, exploratory interface before the requirement for registration.

This resource discovery interface will allow the user to explore and understand the different types of census output available. It will emphasise the inter-relationship of different census output products and will alert the user to the strengths and benefits of each type of output. This will allow a user to identify the topics of interest in the census and then to decide whether to use microdata or aggregate statistics.

This interface will be designed to capture the interest of students and first-time users who may, as yet, be unaware of the ways in which the census can be used in their research. User-friendly and accessible documentation on the SARs will also be available on the web.

The survey route to the SARs

We plan to use NESSTAR as an additional and alternative method of locating SARs for 1991 and 2001. NESSTAR is a tool for locating, exploring and extracting data that has been developed by the Data Archive and the Norwegian Statistics Office. Users who search for microdata using NESSTAR will be readily able to locate the SARs and find out more about them. All metadata for the SARs for 1991 and 2001 will be read into NESSTAR. Data for mini-SARs will also be available for exploratory analysis in NESSTAR before licensing. Exploration tools - either NESSTAR Explorer or NESSTAR Light - will be downloadable from the web, free of charge for academic use.

Web-based access to data on-line and extraction of subsets of data

Once a user has registered for the SARs, they will be able to access the files directly from the CCSR website. They will be offered a range of options:

Browsing, exploring on-line and extracting subsets and downloading to PC
We plan to implement two software solutions, both of which allow on-line browsing, exploration and downloading of selected subsets. Each has different strengths and together they will provide a way of ensuring that we meet the needs of the widest possible range of users.

a) Beyond20/20TM
Beyond20/20TM will provide exploratory data analysis on the web and will also allow the extraction of selected subsets of data to the user's PC in SPSS or SAS format, or as a comma-delimited file for reading into Excel or any other package. Extracts can be based on a selection of variables and/or cases. The extract can be written to files in SPSS or SAS with the labelling preserved or, for users who do not have ready access to a standard software package, exported to the Beyond20/20TM Professional Browser on their PC. The Professional Browser can be downloaded free of charge from the same web-site.
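As an illustration only, the following Python sketch shows what an extract "based on a selection of variables and/or cases" amounts to once a comma-delimited file has reached the user's PC. It is not how Beyond20/20TM itself works; the function name `extract_subset` and the variable names AGE, SEX and REGION are hypothetical and the sample records are invented.

```python
import csv
import io


def extract_subset(source, variables, keep_case=lambda row: True):
    """Return the chosen variables for every case that passes keep_case.

    source    : an open text file (or file-like object) in comma-delimited form
    variables : list of column names to keep (selection of variables)
    keep_case : function taking a row dict and returning True to keep it
                (selection of cases)
    """
    reader = csv.DictReader(source)
    fieldnames = reader.fieldnames or []
    missing = [v for v in variables if v not in fieldnames]
    if missing:
        raise KeyError(f"variables not in file: {missing}")
    return [{v: row[v] for v in variables} for row in reader if keep_case(row)]


# Invented example records; the column names are hypothetical.
SAMPLE = "AGE,SEX,REGION\n34,F,North West\n52,M,Wales\n19,F,North West\n"

subset = extract_subset(
    io.StringIO(SAMPLE),
    variables=["AGE", "SEX"],
    keep_case=lambda row: row["REGION"] == "North West",
)
print(subset)
```

Values are returned as text, exactly as they appear in the file; converting them to numbers is left to the analysis package.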

We expect Beyond20/20TM to be of great value in opening up the SARs to casual users or those with no access to standard packages. The Professional Browser supports exploratory PC-based data analysis including frequencies, recodes, tabulations, charts produced from tables, selection of population and display of metadata. It also has a mapping facility using boundary data held as MapInfo files.

b) NESSTAR
NESSTAR is an important tool for locating, exploring and accessing microdata held not just at the Data Archive but at other sites, including the Norwegian Data Archive. As with Beyond20/20TM, data and metadata are held in a Publisher on an NT Server. Users who go into NESSTAR from the Data Archive and are already registered for the Census outputs will be able to access the SARs in a seamless process, without needing to know where the data are held.

Users conduct their search for data through NESSTAR Explorer. This tool is downloadable from the web to the user's PC and provides a powerful search mechanism across data holdings in the UK and overseas which are linked via NESSTAR Publisher. Explorer provides extensive exploratory analysis facilities - including, for example, cross-tabulation and graphical display - and also statistical procedures including simple regression analysis. However, it does not have the ability to recode variables and, therefore, we will be making available a large number of standard recodes (e.g. age recoded in four or five different ways: as 5-year categories; 10-year categories; labour-market relevant categories, etc.) that can be selected by users. These variables will also be available to download as subsets of data.
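To make the idea of a standard recode concrete, here is a minimal Python sketch that turns single year of age into 5-year bands, 10-year bands and a set of labour-market bands. The function names, the open-ended top band at 90 and the labour-market cut points are all assumptions made for illustration; they are not the recodes that will be supplied with the SARs.

```python
from bisect import bisect_right


def band_age(age, width, top=90):
    """Return a fixed-width age band label such as '35-39'.

    Ages at or above `top` share one open-ended band (an assumed top-code).
    """
    if age < 0:
        raise ValueError("age must be non-negative")
    if age >= top:
        return f"{top}+"
    lower = (age // width) * width
    upper = min(lower + width - 1, top - 1)
    return f"{lower}-{upper}"


# Hypothetical labour-market cut points: the lower bound of each band.
LABOUR_MARKET_CUTS = [0, 16, 25, 45, 65]
LABOUR_MARKET_LABELS = ["0-15", "16-24", "25-44", "45-64", "65+"]


def labour_market_band(age):
    """Return the (hypothetical) labour-market band containing `age`."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return LABOUR_MARKET_LABELS[bisect_right(LABOUR_MARKET_CUTS, age) - 1]


print(band_age(37, 5))         # 35-39
print(band_age(37, 10))        # 30-39
print(labour_market_band(37))  # 25-44
```

Supplying such variables ready-made means that a user of a tool without a recode facility can still tabulate age at whichever level of detail suits the analysis.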

NESSTAR Light provides an alternative which can be used with standard internet browsers (Netscape or Explorer) and runs directly from the fileserver. NESSTAR Explorer and Light are also able to allow users to extract subsets of variables direct to their PC. Files can be selected on the basis of variables or cases and extracted as SPSS, STATA and NSDstat datasets with all the variable labels etc. included.

We will be setting up these systems with 1991 data and, once they are working satisfactorily, holding meetings to demonstrate the new methods of dissemination and to obtain user feedback (see page 5).

Web-based extraction of entire datasets and documentation for 1991 and 2001
We plan to make fully-labelled versions of the SARs available in SPSS, SAS and STATA, and also as comma-delimited files for reading into Excel or other packages. These files will be held on the MIMAS Unix server at Manchester Computing and can be readily accessed over the web as portable files in the above packages. No specific software interface is required. We plan to implement this with 1991 SARs using existing SAR registration procedures as a first step.

Dissemination by CD-ROM will also be available. All microdata files for 1991 and 2001 and associated documentation and boundary data can be supplied on CD-ROM. For users who want a built-in exploratory analysis package, we will be able to supply the SARs together with the Beyond20/20TM Professional Browser. For other users, the data can be supplied in SPSS, SAS and STATA format. This is envisaged as the main method of dissemination to the non-academic sector.

Value added by computation of derived variables

Once the 2001 SARs are released, an immediate task will be to provide a comprehensive set of derived variables. For example, over 70 variables for the Household SAR will provide summary variables for the household and family, enhanced for 2001 through the household matrix in the census schedule. A range of social classification variables and derived income variables, which proved very valuable in 1991, will be repeated in 2001. The addition of an area-level classification to the 1991 SARs was another important innovation and we are already discussing this with ONS. Additional variables, developed through the ESRC-funded Alternative Household Classifications project, will further enhance the SARs for 2001.
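As a rough illustration of what a household summary variable is, the Python sketch below groups person records by household and derives a few household-level values. The record layout (`hh_id`, `age`), the function name and the age limit used to count children are hypothetical; the actual derived variables for the Household SAR will follow their own definitions.

```python
from collections import defaultdict


def household_summaries(persons, child_age_limit=16):
    """Derive household-level summary variables from person-level records.

    persons         : iterable of dicts with (hypothetical) keys 'hh_id' and 'age'
    child_age_limit : ages below this count as children (an assumed definition)
    """
    households = defaultdict(list)
    for person in persons:
        households[person["hh_id"]].append(person)

    summaries = {}
    for hh_id, members in households.items():
        ages = [m["age"] for m in members]
        summaries[hh_id] = {
            "size": len(members),
            "n_children": sum(a < child_age_limit for a in ages),
            "oldest_age": max(ages),
        }
    return summaries


# Invented example records.
people = [
    {"hh_id": 1, "age": 41},
    {"hh_id": 1, "age": 39},
    {"hh_id": 1, "age": 7},
    {"hh_id": 2, "age": 68},
]
print(household_summaries(people))
```

Each summary value can then be attached back to every person in the household, so that individual-level analyses can take household circumstances into account.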

In addition, a large set of standard recodes will be derived to facilitate quick and easy web-based exploratory analysis. These will include variables to give comparability with the classifications used in the Standard Tables and Census Area Statistics.

Documentation will be available on the web to guide users through the stages of downloading datafiles or extracting subsets of information where this is not part of a software package.

Training in the use of census microdata

From September 2001 we are re-launching our short courses designed to provide the skills needed to analyse census microdata files. For details, see http://www.ccsr.ac.uk/courses and page 6 which lists the course titles and dates.

From June 2002 we shall be running a series of workshops aimed at providing specific information and training in the census for 2001 and changes since 1991. We shall also use this as a way of obtaining feedback on new methods of dissemination.

The outline programme of the first workshop, which will contain contributions from other census support units, is:

What's new in the 2001 census?
  • The range of output products available, including Census Area Statistics and migration data
  • Accessing the SARs via the web:
    • Beyond20/20TM
    • NESSTAR
  • Accessing other census products via the web
  • The analysis potential of the 2001 SARs
  • Analysing change between 1991 and 2001
  • Packages for analysing SARs
There will be a small charge for attendance, which includes lunch.

Please let us have your views on this format, and any suggestions for changes or offers to host workshops.

Nearer the time of the release of the 2001 SARs, we shall also be running more workshops designed for specific interests. For example, in spring 2003 we are planning a joint workshop with Chris Gardiner from Sheffield Hallam University, which will be designed specifically around the interests of local authorities. Again, suggestions from users for other events are very welcome.

Pubtrawl

A new trawl of publications is now taking place. The list of publications based on the SARs forms an important resource for users as well as demonstrating to the ESRC the value of the data. The list is available from the CCSR web site and there is an on-line form that allows you to add any publications which are not listed.

Make sure your publications are counted!

CCSR Seminar Series

The autumn programme for the CCSR Seminar Series is included with this newsletter.

Short-course programme

Research Design and Data Analysis
Our new, enhanced Short Course Programme provides a range of courses in research design and analysis, all with a practical emphasis and applied focus. The programme is structured so that participants may either select an individual course which meets their needs, or build up their expertise through a portfolio of courses. Places on each course are limited to a maximum of 20.

The courses on data analysis are all PC-based and provide participants with the opportunity to complete detailed practical exercises on their own PC. Staff from CCSR or MIMAS are available to provide help and advice throughout the practical sessions. Each course will be supported by full documentation.

Level 1

SPSS for Social Scientists: Jan 2002; Mar 2002
Surveys & Sampling: Oct 2001; Jan 2002
Exploratory Data Analysis: Oct 2001; Jan 2002
Basic Data Analysis: Nov 2001; Feb 2002
Questionnaire Design: Oct 2001; Jan 2002
STATA for the SARs: Dec 2001
Introduction to STATA: Feb 2002
Demographic Forecasting with POPGROUP: Oct/Nov 2001; Jan 2002

Level 2

Analysing Hierarchical Surveys: Nov 2001
Multiple Regression: Jan 2002; Jun 2002
Logistic Regression: Feb 2002; Jul 2002
Conceptualising Longitudinal Data Analysis: Mar 2002
Introduction to Longitudinal Data Analysis: Mar 2002

Level 3

Factor Analysis: Mar 2002
Cluster Analysis: Apr 2002
Multilevel Modelling: May 2002
Design & Analysis of Complex Surveys: Jan 2002
Longitudinal Data Analysis: Apr 2002

For full details of courses, including dates, on-line booking and cost, please refer to our Short Course Programme web page - http://www.ccsr.ac.uk/courses/shorthome.htm or contact CCSR on 0161 275 4721.

Conference on the SARs
Manchester, Friday 16 November 2001

A conference to showcase some of the most innovative uses of the SARs will be held on 16 November. A booking form is included - please book soon to ensure a place. The day will start with a presentation on the 2001 SARs when we will be pleased to welcome Helen Hughes and Chris Lodge from the ONS who will be responsible for the extraction of the 2001 SARs. They will be able to provide the most up-to-date news on progress and answer your questions.

The Contemporary and Historical Census Collection

Teaching and learning materials for the SARs

CCSR is a partner in an £800k project funded by the Joint Information Systems Committee (JISC) to deliver census-based learning and teaching to the UK Higher Education sector.

The project is funded by the JISC as part of its initiative to develop the Contemporary and Historical Census Collection (CHCC) into a major learning and teaching resource. The CHCC includes: the contemporary Census Area Statistics; Individual Level Data (the Samples of Anonymised Records (SARs)); the Historical Censuses Collection (the 1881 Census, the 1851 Census Sample and the Great Britain Historical Database). The main aims of the project are to increase use of the CHCC by:

CCSR is developing teaching and learning materials specifically for the SARs as well as contributing to a set of inter-disciplinary units that will draw on the full spectrum of census data.

The design and content of materials is being informed by consultation with potential users, which began with a workshop in London in January. This gave the clear message that materials need to be highly flexible in content and format to meet the different needs of teachers and learners. In response, the project is committed to producing a number of short, self-contained units drawing on census data to cover methodological and substantive topics across a wide range of disciplines and levels of difficulty. They will contain materials suitable for classroom teaching and self-study and be easily customised according to the depth of coverage required by teachers and students. The aim is for units that can be easily slotted into existing courses or combined to make new courses.

We have now drawn up plans for units based on SAR data. These include a target list of around 30 units based on both methodological and substantive topics. We are adopting a common format for units, comprising an overview, more detailed explanatory notes and practical exercises.

A chance to get involved

Draft versions of the first units have now been completed and are available for viewing and downloading on our project website: http://www.ccsr.ac.uk/rschproj/chcc/home.htm.

The success of the project depends on the delivery of materials that meet the real needs of teachers and students. It is therefore essential that we expose these prototype units as widely as possible to the scrutiny of potential users.

If you are involved in teaching in the FE and HE sector we would greatly value your feedback. Specifically we would be grateful for:

  1. your comments on the list of proposed topics, including any suggestions for modifications or additions
  2. your comments on the format and content of the first draft units

For those seeking greater involvement we are looking for:

  1. volunteers who are willing to pilot and evaluate one or more of our units in their own teaching
  2. expressions of interest from people able to contribute to the authoring of the subject-specific units.

We will revise materials in the light of feedback so this is your chance to make sure we produce a new teaching and learning resource that can be of benefit to you.

For further information visit the project website or contact Mark Brown: mark.brown@man.ac.uk 0161 275 4780

CCSR project team: Mark Brown, Jo Wathan, Angela Dale and Mark Elliot.
Other project partners: MIMAS and the Census Dissemination Unit, University of Manchester; School of Geography, University of Leeds; History Data Service, The Data Archive, University of Essex; The LTSN Centre for History, Archaeology and Classical Studies and University of Glasgow.

New Projects at CCSR

Combining aggregate and micro-data to extend census tables for local areas
Award Holders: Mark Tranmer, Andrew Pickles
Funder: ESRC (£40K)
Dates: October 2001-September 2002

Aim:
To produce a robust method of extending published census tables for sub-national areas through synthetic estimation.

Objectives:

Summary:
For a given population of interest, data are available in aggregate form for a specific set of tables. In some circumstances, at the local area level, these tables do not give sufficient detail for the census data user. In particular, the variables that are cross-tabulated will have crude categories.

For the population of interest, microdata are also available for a sample of individuals. Using the microdata, it is often possible to specify a table with more detailed variable categories than the aggregate data provide, that is, to extend the table. However, because the microdata are a sample, the estimates obtained from these data for the counts in each cell of the extended table are often inaccurate.

Since the aggregate data and the microdata are available for the same population, it is of interest to develop methods that combine the two datasets, assuming a particular statistical model, and to use this model to estimate the cell counts for the extended table. Several such models are possible. In this project we investigate this idea for a specific set of census tables. We then compare the different models, and assess whether the methods we have developed for the specific census tables may be used more generally.
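The summary does not say which statistical models the project will compare. Purely as an illustration of the general idea, the Python sketch below uses iterative proportional fitting (IPF), a well-known technique that is swapped in here as an assumption: a detailed table estimated from the microdata sample is rescaled until its row and column totals agree with totals taken from the aggregate data. The function name and all counts are invented.

```python
def ipf(seed, row_totals, col_totals, tol=1e-9, max_iter=1000):
    """Rescale a seed table so its margins match the given totals.

    seed       : 2-D list of sample counts from the microdata (the detailed table)
    row_totals : target row totals, taken from the aggregate data
    col_totals : target column totals, taken from the aggregate data
    """
    if abs(sum(row_totals) - sum(col_totals)) > 1e-6:
        raise ValueError("row and column totals must sum to the same population")

    table = [[float(x) for x in row] for row in seed]
    for _ in range(max_iter):
        # Step 1: scale each row to its target total.
        for i, target in enumerate(row_totals):
            current = sum(table[i])
            if current > 0:
                table[i] = [x * target / current for x in table[i]]
        # Step 2: scale each column to its target total.
        for j, target in enumerate(col_totals):
            current = sum(row[j] for row in table)
            if current > 0:
                for row in table:
                    row[j] *= target / current
        # Columns now match exactly; stop once the rows match too.
        gap = max(abs(sum(table[i]) - t) for i, t in enumerate(row_totals))
        if gap < tol:
            break
    return table


# Invented example: a 2 x 2 sample table and population margins.
sample_counts = [[3, 1], [2, 4]]
fitted = ipf(sample_counts, row_totals=[40, 60], col_totals=[50, 50])
for row in fitted:
    print([round(x, 2) for x in row])
```

The fitted table keeps the pattern of association seen in the sample while agreeing with the published totals, which is one simple way of borrowing strength from both sources.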

Social capital: developing a measure and assessing its value in social research
Award Holders: Yaojun Li, Andrew Pickles and Mike Savage.
Funder: ESRC (£40K)
Dates: August 2001-July 2003

Aim:
This project will construct a theoretically informed and methodologically rigorous measure of social capital from the BHPS, validate it against a range of relevant dependent variables, and consider its potential explanatory significance in mediating people's socio-economic conditions and their quality of life. The measures developed will also provide the potential for group-level changes in social cohesion to be tracked over time.

Objectives:

  1. to develop theoretically-informed dimensions of social capital and measure the direction and strength of association between them,
  2. to assess the relative significance of the measures of social capital thus derived against expected outcomes in different aspects of quality of life,
  3. to establish the relationship between social capital and people's socio-economic conditions while taking into account ecological factors (neighbourhood-level characteristics), and
  4. to establish the nature by which and the extent to which social capital mediates the effects of socio-economic characteristics on the life experiences of the respondents.

Summary:
In the past twenty years, social scientists have been increasingly interested in the role of social capital in affecting people's lives, in contrast to other forms of resources such as economic or cultural capital. Nevertheless, none of the existing studies has used large-scale individual level data sets to conduct a theoretically-informed and methodologically rigorous analysis of social capital. This study uses the British Household Panel Survey (BHPS) to develop such a measure in its different dimensions and test its validity against a range of outcome variables. As most of the variables involved in the construction of the measure are categorical and as most of the other explanatory and response variables are also categorical or ordinal, item response models are used in developing the measure, and binomial, multinomial and ordinal regression models are used in testing its effects. The study focuses on the role social capital plays in mediating people's socio-economic conditions and the major aspects of their quality of life. The relationship between social capital, social trust and social exclusion is also investigated. Finally, the measures of social capital developed from the study will enable group-level changes in social cohesion to be tracked over time.
