Public Use Microdata (1 and 5 percent)

DESCRIPTION: The Public-Use Microdata Samples (PUMS) are samples of housing units taken from the decennial censuses with information on the characteristics of each unit and each person in it. While preserving confidentiality (by removing identifiers), these microdata files permit users with special data needs to prepare virtually any tabulation.

There are two sub-collections of PUMS files: a one percent sample and a five percent sample. The 1% sample files have approximately one record for every 100 persons/households in an area, while the 5% files have approximately one record for every 20. The universe from which the PUMS records are drawn is restricted to households/persons who received the long form; i.e., the data are based strictly on the long form questionnaire. The smallest unit of geography on the 1% PUMS files is called a "super-PUMA"; these geographic entities have a minimum population of 400,000.

The 5% PUMS file records contain a different set of PUMA geographic areas, each of which has a minimum population of 100,000. These PUMAs are primarily based on counties, and may be whole counties, groups of counties, or places. When these entities have more than 200,000 persons, PUMAs can represent parts of counties, places, etc. None of the PUMAs on the 5% sample crosses state lines.
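As a rough illustration of what the sampling fractions imply, the sketch below computes the approximate number of person records for an area at the minimum population thresholds quoted above. The function name expected_records is hypothetical, and the result is only an order-of-magnitude guide; actual record counts vary by area.

```python
def expected_records(population: int, one_in: int) -> int:
    """Approximate record count when roughly one in `one_in` persons is sampled."""
    return population // one_in


# 1% sample: about one record per 100 persons; super-PUMA minimum is 400,000.
super_puma_records = expected_records(400_000, 100)

# 5% sample: about one record per 20 persons; PUMA minimum is 100,000.
puma_records = expected_records(100_000, 20)

print(super_puma_records, puma_records)
```

At these minimums, the smallest areas on the two files yield person-record counts of the same order of magnitude, even though the 5% PUMAs are geographically much finer.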

The 1% sample was based primarily on metropolitan/nonmetropolitan areas, and contains PUMAs which were made from whole central cities, whole Metropolitan Statistical Areas (MSAs) or Primary Metropolitan Statistical Areas (PMSAs), MSAs or PMSAs outside the central city, groups of MSAs or PMSAs, and groups of areas outside MSAs or PMSAs. When the areas have more than 200,000 persons, 1% PUMAs can represent parts of central cities, MSAs/PMSAs, and so forth. 1% PUMAs may cross state lines, and in that case state codes are not shown.

USING THE DATA: The PUMS files are hierarchical files consisting of household records and person records. Household records, identified by an "H" in the first column, contain information pertaining to an entire household or group quarters residence. Each household record is followed by a series of person records, identified by a "P" in the first column, which contain information about each sampled individual in the unit. Most analyses of PUMS data require manipulation of both household and person records. Each record contains a serial number that links the persons in a housing unit to the appropriate household record. Many statistical software packages, such as SAS, SPSS, Stata, and BMDP, handle the PUMS datasets easily. They offer various methods for making the hierarchical file structure rectangular by attaching household information to each person record.
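For users working outside those packages, the rectangularizing step can be sketched in Python as follows. This is a minimal illustration, not the layout of the actual files: it assumes the record type is in the first column and, hypothetically, that the serial number occupies columns 2-8. The real column positions must be taken from the codebook, and the sample records are invented.

```python
from typing import Dict, Iterable, List

# Hypothetical layout: record type in column 1, serial number in columns 2-8.
# Replace this slice with the positions given in the PUMS codebook.
SERIAL = slice(1, 8)


def rectangularize(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Return one row per person record, with its household record attached."""
    rows: List[Dict[str, str]] = []
    household = None  # most recent "H" record seen
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank lines
        record_type = line[0]
        if record_type == "H":
            household = line
        elif record_type == "P":
            # A person record must follow a household with the same serial number.
            if household is None or household[SERIAL] != line[SERIAL]:
                raise ValueError(f"person record without matching household: {line!r}")
            rows.append(
                {"serial": line[SERIAL], "household": household, "person": line}
            )
        else:
            raise ValueError(f"unknown record type: {line!r}")
    return rows


# Invented sample records in the assumed layout.
sample = [
    "H0000001 OWNED",
    "P0000001 01 AGE34",
    "P0000001 02 AGE36",
    "H0000002 RENTED",
    "P0000002 01 AGE71",
]

rows = rectangularize(sample)
for row in rows:
    print(row["serial"], "|", row["household"], "|", row["person"])
```

Each output row repeats the household record for every person in the unit, so household-level variables should not be summed across rows without first reducing to one row per serial number.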

CODEBOOKS: The codebooks for the 1990 and 2000 PUMS are available in PDF format on the local area network (R:\CFDR\Public\Data\IPUMS international).

DATA: The local area network directory contains separate folders for the 1% PUMS 2000, 5% PUMS 2000, and 1% PUMS 1990. Each folder holds compressed raw data files and the SAS input programs. The most recent PUMS are also available for download through the Census Bureau website (www.census.gov), and historical datasets may be obtained through the Integrated Public Use Microdata Series (IPUMS) website (www.ipums.umn.edu/usa/index.html). Users may also contact the CFDR for assistance in obtaining the data.

Updated: 12/01/2017 10:42PM