Progress Report, October 2006
Following a meeting of the NAPP collaborators in Montreal in April 2005, we continued working on a final release of harmonized data. This was somewhat delayed by the project co-ordinator being on fellowship leave for an academic year.
We converted the metadata for NAPP to XML, a process that required a significant behind-the-scenes commitment by project staff at the Minnesota Population Center. The new XML metadata structure substantially reduces the time involved in editing and creating metadata and data, and will be crucial to adding a significant number of new datasets and variables for the next phase of NAPP.
The first phase of NAPP was funded to harmonize and distribute the complete-count census data from Canada, Great Britain, Iceland, Norway and the United States. The second phase of NAPP, which we call NAPP-II, will (1) incorporate all the samples of census data for these five countries between 1850 and 1930, and add new complete-count censuses from Iceland (1703, 1835, and 1845) and Norway (1801); (2) add complete-count censuses from Sweden (1890 and 1900); and (3) create samples of individuals and couples linked between the samples and the complete-count censuses. (See the proposal for NAPP-II.)
In October 2006 we released four new datasets (Canada 1871 and 1901, Norway 1865, Scotland 1881) and added a significant number of new variables to the existing datasets. Further information is available on the revision history page. The datasets were released onto a new extract system that is more stable and attractive, and that provides features previous releases lacked, such as the ability to revise old extracts.
The three censuses of Iceland that we plan to release have required somewhat more work than we anticipated. Some censuses had been transcribed for all people but were missing variables; others had all variables transcribed for only a subset of the population. Students from the University of Iceland have been entering the missing cases and variables during the summers. We expect to release this data in winter 2007. In total, the three censuses of Iceland that we will release include 270,000 people. Because the datasets are small, we expect processing to be relatively quick once data entry is complete.
The final NAPP data release was completed in fall 2006. In the winter of 2006/7 we are working on the following enhancements to the data:
- Adding new constructed variables describing farm residence, group quarters status, and migration status.
- Expanding the documentation in the following areas:
  - Cross-country differences in enumeration procedures
  - Procedural history of the creation and processing of the datasets
  - Occupational classification and coding
  - Constructed family and household relationship variables
- Enhancing the variable availability for Canada 1901, and adding a second Canada 1901 sample that increases the sample density in Montreal, Toronto, Vancouver, Halifax and Winnipeg to 15-20 per cent of the population.
- Preparing the Swedish 1890 census for release.
- Preparing for a beta release of individuals linked between the U.S. census samples and the 1880 complete-count dataset.