The CenSoc team is excited to announce the release of version 4.0 of the CenSoc-DMF dataset. This dataset links men in the 1940 Census to the Social Security Death Master File (DMF). It previously covered deaths from 1975-2005 and has now been extended to cover 1975-2023. Data for all years are contained in a single file.

As shown on the Lexis surface below, this extension increases mortality coverage primarily for cohorts born after about 1905. It reduces truncation (which can lead to attenuation bias) for many cohorts and makes it possible to observe mortality at older ages for more recent cohorts. We recommend using cohorts in the range of 1900-1930 for analyses, as these are well-captured at ages 65+ in the CenSoc-DMF.

Figure: the 1940 Census and DMF high-coverage area represented on a Lexis surface, with the cohorts of 1915 and 1930 highlighted in blue.
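
To make the effect of the extension concrete, the short sketch below computes the approximate range of ages at death that a birth cohort can contribute within a given window of death years. The function name is hypothetical, and the calculation uses calendar years only, so the ages can be off by up to a year depending on birth and death month.

```python
def observable_age_range(birth_year, first_death_year, last_death_year):
    """Approximate youngest and oldest age at death observable for a cohort.

    Uses calendar-year differences only (an assumption for illustration),
    so results can be off by up to one year.
    """
    return first_death_year - birth_year, last_death_year - birth_year


# Compare the previous window (1975-2005) with the extended one (1975-2023)
# for the two cohorts highlighted in the figure.
for cohort in (1915, 1930):
    old_window = observable_age_range(cohort, 1975, 2005)
    new_window = observable_age_range(cohort, 1975, 2023)
    print(cohort, old_window, new_window)
# 1915 (60, 90) (60, 108)
# 1930 (45, 75) (45, 93)
```

For the 1930 cohort, the upper bound moves from roughly age 75 to roughly age 93, which is why later cohorts gain the most from the extension.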

To create this dataset, we had to account for the decreasing completeness of the DMF after 2005, which resulted from policy changes limiting which DMF records can be released to the public. Between 2005 and 2016, DMF death coverage for ages 65+ fell from over 95% to under 15%. Moreover, this decline happened at different times in different states. We weight the linked CenSoc-DMF by state of birth, race, and age at death to account for these patterns in the DMF over time, as well as other selection that may be present in the CenSoc-DMF.
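
As a rough illustration of what weighting by cell means, the sketch below computes inverse-coverage weights: for each combination of state of birth, race, and age at death, the weight is the reference death count divided by the number of linked records. This is one common approach and not necessarily the procedure used for the CenSoc-DMF, which is described in the technical documentation. The field names (`bpl`, `race`, `death_age`), the `reference_deaths` input, and all numbers are hypothetical.

```python
from collections import Counter


def cell_weights(linked_records, reference_deaths):
    """Return a weight per (state of birth, race, age at death) cell.

    linked_records: iterable of dicts with hypothetical keys
        'bpl', 'race', 'death_age'.
    reference_deaths: dict mapping the same cell tuple to a total death
        count from some external source (assumed to be available).
    """
    sample_counts = Counter(
        (r["bpl"], r["race"], r["death_age"]) for r in linked_records
    )
    weights = {}
    for cell, n_linked in sample_counts.items():
        total = reference_deaths.get(cell)
        if total is None:
            continue  # no reference count, so leave the cell unweighted
        weights[cell] = total / n_linked
    return weights


# Made-up toy data, for illustration only.
linked = [
    {"bpl": "Alabama", "race": "White", "death_age": 80},
    {"bpl": "Alabama", "race": "White", "death_age": 80},
    {"bpl": "Minnesota", "race": "White", "death_age": 80},
]
reference = {
    ("Alabama", "White", 80): 10,
    ("Minnesota", "White", 80): 20,
}
weights = cell_weights(linked, reference)
```

In this toy example the Minnesota cell receives a larger weight because a smaller share of its reference deaths appears among the linked records. A fuller version would likely also define cells by year of death, since coverage changes over time.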

Below, for example, we show that this varying state-level timing in decline can give rise to unexpected relationships between longevity and state of birth for later cohorts, who are most acutely affected by incompleteness in the DMF after 2005. Weights help to correct for this issue.

Figure: OLS estimates of longevity for men born in Alabama relative to those born in Minnesota, by cohort. For cohorts 1925 onward, unweighted estimates (in blue) appear to show that Alabama-born men live over a year longer than Minnesota-born men, a reversal of the relationship for less recent cohorts. These unexpected results for more recent cohorts are driven by issues with state-specific death coverage in the DMF. Weighted estimates, however, consistently show shorter longevity for Alabama-born men.
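
The mechanics behind the weighted and unweighted comparison can be sketched with a difference in mean age at death between two groups, which is what an OLS regression on a single state indicator reduces to when no other covariates are included. The models behind the figure may include additional terms. The field names and all numbers below are made up to show how weights can change the sign of a gap; they are not CenSoc results.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def group_gap(records, group_a, group_b, weighted=True):
    """Mean age at death in group_a minus mean age at death in group_b.

    records: list of dicts with hypothetical keys 'bpl', 'death_age', 'weight'.
    With weighted=False every record counts equally.
    """
    def mean_for(group):
        rows = [r for r in records if r["bpl"] == group]
        w = [r["weight"] if weighted else 1.0 for r in rows]
        return weighted_mean([r["death_age"] for r in rows], w)

    return mean_for(group_a) - mean_for(group_b)


# Made-up toy data: later deaths in one group are under-covered, so they
# carry a larger weight.
toy = [
    {"bpl": "Alabama", "death_age": 75, "weight": 1.0},
    {"bpl": "Alabama", "death_age": 79, "weight": 1.0},
    {"bpl": "Minnesota", "death_age": 72, "weight": 1.0},
    {"bpl": "Minnesota", "death_age": 80, "weight": 3.0},
]
unweighted_gap = group_gap(toy, "Alabama", "Minnesota", weighted=False)
weighted_gap = group_gap(toy, "Alabama", "Minnesota", weighted=True)
print(unweighted_gap, weighted_gap)  # 1.0 -1.0
```

When deaths at older ages are missing disproportionately from one group, the unweighted mean for that group is pulled downward, and up-weighting the records that remain restores their share.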

Users should be aware that weights are provided only for deaths in 1975-2020, at ages 65-100, and for cohorts before 1939. Data outside these ranges are available to use at researchers’ discretion.
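
A minimal sketch of restricting an analysis to the weighted range is shown below. The field names `byear`, `dyear`, and `death_age` are assumptions, so check the codebook for the actual variable names. The sketch also reads "cohorts before 1939" as birth years strictly earlier than 1939, which is an interpretation rather than something stated here.

```python
def in_weighted_range(record):
    """True if a record falls inside the ranges where weights are provided.

    Field names are hypothetical; 'before 1939' is read as byear < 1939.
    """
    return (
        1975 <= record["dyear"] <= 2020
        and 65 <= record["death_age"] <= 100
        and record["byear"] < 1939
    )


# Made-up toy records, for illustration only.
records = [
    {"byear": 1920, "dyear": 1995, "death_age": 75},  # inside all ranges
    {"byear": 1920, "dyear": 2022, "death_age": 102},  # year and age outside
    {"byear": 1930, "dyear": 1990, "death_age": 60},  # age below 65
]
weighted_sample = [r for r in records if in_weighted_range(r)]
```

Analyses that follow the recommended cohorts of 1900-1930 could add a further condition on `byear` to the same filter.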

For a detailed description of the Death Master File and the weighting of the CenSoc-DMF, refer to the technical documentation on the CenSoc documentation page. Visit the CenSoc data page for download links and codebooks for the CenSoc-DMF.

Story by Maria Osborne. For inquiries, contact censoc@berkeley.edu
