The CenSoc team is pleased to announce the release of four new and updated World War II Era Army Enlistment datasets for 2023.
First, we have released the CenSoc WWII Army Enlistment Dataset (Version 1.0, N = 9.0 million), a cleaned and harmonized version of the National Archives and Records Administration’s Electronic Army Serial Number Merged File. This standalone file contains data on American Army service members, including members of the Army Air Corps, Women’s Army Auxiliary Corps, and Enlisted Reserve Corps, who enlisted ca. 1938-1946. It is a rich source of data on enlistee sociodemographic characteristics, military service, and anthropometry. Available variables include birthplace, educational attainment, marital status, race, and Army branch. This file is a uniquely large source of body height and weight data, with these measurements available for approximately 5 million individuals. A full list of variables appears in the codebook for this file, available here.
In addition, we have published three datasets that link men in the CenSoc WWII Army Enlistment Dataset to Social Security Administration mortality records and/or 1940 Census records. All links were established using a conservative variant of the ABE matching algorithm developed by Abramitzky, Boustan, and Eriksson (pp. 871-872). These datasets are as follows:
- The CenSoc Enlistment-Census-1940 file (Version 1.0, N = 2.6 million) links enlisted men to the 1940 Census. Researchers can merge this file with 1940 Census data from IPUMS-USA using the HISTID unique identifier variable. This dataset is suited to researchers who want to attach contemporaneous census information, such as wage income, to enlistment records. It is not linked to death records. [Codebook]
- The CenSoc Enlistment-Numident file (Version 2.0, N = 1.7 million) links enlisted men to the Berkeley Unified Numident Mortality Database (BUNMD), a cleaned version of Social Security Numident records. This file consists of men who died 1988-2005 and contains year of birth and death, age at death, and information from Social Security application and claims records, in addition to variables from the CenSoc WWII Army Enlistment Dataset. [Codebook]
- The CenSoc Enlistment-DMF file (Version 2.0, N = 1.9 million) links enlisted men to the Social Security Death Master File (DMF). This file consists of men who died 1975-2005 and contains year of birth and death and age at death, in addition to variables from the CenSoc WWII Army Enlistment Dataset. [Codebook]
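The HISTID merge mentioned for the Enlistment-Census-1940 file can be sketched as a simple inner join. This is a minimal illustration under stated assumptions, not the CenSoc team's own workflow: the rows are made-up toy data, `height_in` and `weight_lb` are hypothetical column names (consult the codebook for the real ones), and INCWAGE is the IPUMS-USA wage and salary income variable. Only HISTID is taken from the announcement itself.

```python
import csv
import io

# Toy stand-ins for the two files. All values are invented, and every
# column name except HISTID and INCWAGE is a hypothetical placeholder.
ENLISTMENT_CSV = """HISTID,height_in,weight_lb
A001,68,150
A002,70,165
"""

CENSUS_CSV = """HISTID,INCWAGE
A001,900
A003,1200
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per record."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_on_histid(enlistment_rows, census_rows):
    """Inner join on HISTID: keep only records present in both inputs."""
    census_by_id = {row["HISTID"]: row for row in census_rows}
    merged = []
    for row in enlistment_rows:
        match = census_by_id.get(row["HISTID"])
        if match is not None:
            # Census fields first, then enlistment fields on top.
            merged.append({**match, **row})
    return merged


linked = merge_on_histid(read_rows(ENLISTMENT_CSV), read_rows(CENSUS_CSV))
for record in linked:
    print(record)
```

With real files, the same join would read each CSV from disk instead of from the inline strings. A data-frame library would typically do this in a single merge call keyed on HISTID.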
Both the CenSoc Enlistment-Numident file and CenSoc Enlistment-DMF file are mortality datasets that link enlisted men to Social Security death records, and also contain links to the 1940 Census where possible. To assist researchers in choosing a dataset that best suits their needs, a comparison of the files is presented in the table below:
| | Enlistment-Numident | Enlistment-DMF |
| --- | --- | --- |
| Size | 1.7 million | 1.9 million |
| Mortality information for deceased service members | ✓ | ✓ |
| Larger window of mortality data | | ✓ |
| More mortality records per year | ✓ | |
| More variables from Social Security records | ✓ | |
| Links to 1940 Census available | ✓ (762 thousand) | ✓ (779 thousand) |
Links to download all datasets and codebooks can be found on the CenSoc data page. For more information on the creation of CenSoc WWII Army Enlistment datasets, please refer to our technical documentation.
Story by Maria Osborne and Anna Wikle. For questions, please email censoc@berkeley.edu.