Guidance on how to merge IPUMS MEPS data to MEPS-HC data from AHRQ

The current version of the IPUMS MEPS site offers person and round variables from the 1996-forward annual Full-Year Consolidated (FYC) MEPS Household Component (MEPS-HC) files. We are always adding more variables and samples to the IPUMS MEPS database, but have not yet added all of the thousands of variables from the original MEPS FYC files or variables from MEPS-HC files other than the FYC file. Below we offer guidance for users who wish to combine their IPUMS MEPS data extracts with additional data from the original MEPS-HC files downloadable from the Agency for Healthcare Research and Quality (AHRQ). We include explanations and sample code for three types of linkages that users might want to make between IPUMS MEPS data and original AHRQ files:
  1. IPUMS MEPS to other person-level MEPS-HC files (e.g., FYC, Longitudinal),
  2. IPUMS MEPS to MEPS-HC event-level files (e.g., inpatient hospital stays, prescribed medicines), and
  3. IPUMS MEPS to MEPS-HC medical conditions files.

NOTE: Change in Linking Keys Over Time

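In general, each person in MEPS can be uniquely identified by combining the panel identifier (PANEL), the dwelling unit identifier (DUID), and the person identifier (PID). Through 2017, DUPERSID combines DUID and PID into a single variable; from 2018 forward, it also includes PANEL.

Changes over time directly affect how linking keys are constructed in the original MEPS-HC data:
  1. 1996: the sample code for the event and conditions files uses the literal value "1" in place of PANEL, so the key is "1" + DUID + PID.
  2. 1997-2017: the key is PANEL + DUID + PID (equivalently, PANEL + DUPERSID).
  3. 2018-forward: DUPERSID already includes PANEL, DUID, and PID, so DUPERSID alone serves as the key.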

Merging IPUMS MEPS data to person-level MEPS-HC files from AHRQ

The FYC, Point-in-Time (PIT), and Longitudinal files are all person-level files; in other words, the unit of analysis in these files is the person. Other MEPS-HC files contain records at levels other than the person, such as health care event, medical condition, or job. Each person-level record can be uniquely identified by combining the variables PANEL and DUPERSID. In 1996-2017, DUPERSID is equivalent to concatenating DUID and PID. Starting in 2018, DUPERSID includes PANEL, DUID, and PID. IPUMS MEPS offers a single variable, MEPSID, that combines all three of these variables in the same order: PANEL + DUID + PID.
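The key construction described above can be tried on a few made-up records before applying it to a real file. The sketch below assumes the pre-2018 layout, with a numeric PANEL and a string DUPERSID; the values are invented for illustration and are not real MEPS identifiers.

```stata
* Toy person records in the pre-2018 layout (all values are made up)
clear
input byte PANEL str8 DUPERSID
18 "10001101"
18 "10001102"
19 "10002101"
end

* PANEL is numeric here, so convert it to a string before concatenating
tostring PANEL, replace
gen mepsid = PANEL + DUPERSID

* isid exits with an error if mepsid does not uniquely identify records
isid mepsid
list PANEL DUPERSID mepsid
```

Running isid on the key before a merge is a cheap way to confirm that the file really has one record per person.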

Below we have included sample code to link an extract of the 2014-2018 data from IPUMS MEPS to additional variables from the original 2014-2018 MEPS-HC FYC files. The later sections include sample code for linking to the event and conditions files for only a single year, but the multi-year approach in this section can be adapted for merging with multiple years of those files. Users can also adapt this code to link IPUMS MEPS with the MEPS-HC PIT and Longitudinal files by replacing the name of the FYC file with the name of the PIT or Longitudinal file. Refer to the note above about changes in the construction of linking keys over time. Note that not all persons found in the FYC files will also be in the PIT or Longitudinal files.


*Rename the original FYC Stata files (adjust the source names if your downloaded files are named differently, e.g., h171.dta):
_renamefile hc171.dta fyc2014.dta
_renamefile hc181.dta fyc2015.dta
_renamefile hc192.dta fyc2016.dta
_renamefile hc201.dta fyc2017.dta
_renamefile hc209.dta fyc2018.dta

#delimit ;
*Create year and mepsid for linking, and select only desired variables
from original FYC files;

local years 2014 2015 2016 2017 2018;

foreach x in `years' {;
	use fyc`x'.dta, clear;
	gen year = `x';
	if `x' < 2018 {;
		tostring PANEL, replace force;
		gen mepsid = PANEL + DUPERSID;
	};
	if `x' >= 2018 {;
		rename DUPERSID mepsid;
	};

	*update varlist below with names of variables you wish to retain;
	keep mepsid year [varlist];

	save fyc`x'_short.dta, replace;
};

*Append shortened files together to create a single file for the merge
to IPUMS;

use fyc2014_short.dta, clear;
foreach x in 2015 2016 2017 2018 {;
	append using fyc`x'_short.dta;
};

*Perform a 1-to-1 merge between the original 2014-2018 FYC data and the
2014-2018 IPUMS data;

*update line below with number of IPUMS extract;
merge 1:1 year mepsid using meps_[IPUMS extract number].dta;

*clean up temp files;
foreach x in `years' {;
	erase fyc`x'_short.dta;
};
#delimit cr

The above code is for the Stata stats package. Need help translating it into a different stats package? Let us know: ipums@umn.edu
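Because not every person appears in every file, it can help to inspect the result of the merge before analysis. The following is a minimal sketch on made-up data, not part of the original sample code; the variables age and totexp and all identifier values are hypothetical stand-ins.

```stata
* Toy "IPUMS extract" (made-up values), saved to a temporary file
clear
input int year str10 mepsid float age
2017 "2110001101" 34
2017 "2110001102" 36
2018 "2210002101" 51
end
tempfile ipums
save `ipums'

* Toy person-level AHRQ file that lacks one person found in the extract
clear
input int year str10 mepsid float totexp
2017 "2110001101" 1200
2018 "2210002101" 0
end

merge 1:1 year mepsid using `ipums'

* _merge: 1 = AHRQ file only, 2 = IPUMS extract only, 3 = matched in both
tab _merge

* Keep only the records found in both files
keep if _merge == 3
drop _merge
```

Whether to keep unmatched records depends on the analysis; the keep() and nogenerate options of merge can do the same filtering in one step.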

Merging IPUMS MEPS data to event-level MEPS-HC files from AHRQ

The set of MEPS-HC data files includes several with health care event-level records, with the unit of analysis defined as a visit for most event files, as a stay for the Inpatient Hospitalizations file, or as a purchased prescribed medicine for the Prescribed Medicines file. Note that, in the case of the Prescribed Medicines file, each refill of a prescription medication purchased during the calendar year has its own record. In each event file, there can be zero to many event records associated with each person in the FYC file for the same year. The MEPS-HC event files are:
  1. Prescribed Medicines
  2. Dental Visits
  3. Other Medical Expenses
  4. Inpatient Hospitalizations
  5. Emergency Room Visits
  6. Outpatient Visits
  7. Office-Based Medical Provider Visits
  8. Home Health

Below we have included sample code to link an extract of the 2017 data from IPUMS MEPS to additional information from the original 2017 Inpatient Hospitalizations file. This code can be easily adapted to link IPUMS MEPS data and other MEPS-HC event files by replacing the hospitalization file name and list of variables to be retained with another event file's name and variable list. Note that not all persons found in the FYC files will also have records in all, or even any, of the event files.


*Read in the original 2017 inpatient hospitalizations file
use h197d.dta, clear

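*Generate mepsid to link with the IPUMS MEPS extract.
*Uncomment the one option that matches your data year.
*The "+" below concatenates strings, so PANEL is converted first; if DUID
*and PID are stored as numeric in your 1996-2017 file, use DUPERSID in
*place of DUID + PID.

*For 1996 (uses the literal panel value "1" instead of PANEL)
*gen mepsid = "1" + DUID + PID
*For 1997-2017
tostring PANEL, replace force
gen mepsid = PANEL + DUID + PID
*For 2018-forward (DUPERSID already includes PANEL)
*gen mepsid = DUPERSID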

*Retain only linking key and desired additional variables
keep mepsid [varlist]

*merge inpatient hospitalizations file to IPUMS MEPS extract
*using a "many-to-one" merge
merge m:1 mepsid using meps_[extract number].dta
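An alternative to the many-to-one merge above is to summarize the event file to one record per person first, which keeps the combined file at the person level. The sketch below uses made-up data; the variable nights and the identifier values are hypothetical, and it assumes a single-year, person-level extract.

```stata
* Toy person-level "extract" (made-up values)
clear
input str10 mepsid
"2110001101"
"2110001102"
"2110002101"
end
tempfile persons
save `persons'

* Toy event-level file: two stays for one person, one for another
clear
input str10 mepsid int nights
"2110001101" 3
"2110001101" 1
"2110002101" 5
end

* Collapse to one record per person: number of stays and total nights
gen n_stays = 1
collapse (sum) n_stays nights, by(mepsid)

* Both files are now person-level, so a 1:1 merge applies
merge 1:1 mepsid using `persons'

* Persons found only in the extract had no events, so set counts to 0
replace n_stays = 0 if _merge == 2
replace nights = 0 if _merge == 2
drop _merge
list
```

This approach makes the zero-event persons explicit instead of leaving them as records with missing event variables.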

Merging IPUMS MEPS data to the MEPS-HC medical conditions file from AHRQ

The MEPS-HC interview collects information about household-reported medical conditions that either 1) have an associated health care event during the current calendar year or 2) are defined by AHRQ as "priority conditions," conditions designated as a priority due to their prevalence, expense, or policy relevance. These data are delivered on a file where the unit of analysis is a medical condition. Below we provide sample code to link an IPUMS MEPS extract and the 2017 medical conditions file.


*read in the original 2017 medical conditions file
use h199.dta, clear

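*create linking key; uncomment the one option that matches your data year.
*The "+" below concatenates strings, so PANEL is converted first; if DUID
*and PID are stored as numeric in your 1996-2017 file, use DUPERSID in
*place of DUID + PID.

*For 1996 (uses the literal panel value "1" instead of PANEL)
*gen mepsid = "1" + DUID + PID
*For 1997-2017
tostring PANEL, replace force
gen mepsid = PANEL + DUID + PID
*For 2018-forward (DUPERSID already includes PANEL)
*gen mepsid = DUPERSID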

*retain linking key and variables of interest
keep mepsid [varlist]

*many-to-one merge on mepsid
merge m:1 mepsid using meps_[extract number].dta
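After a many-to-one merge like the one above, the combined file is no longer person-level: it holds one record per condition, plus one record for each person in the extract who has no condition records. The sketch below illustrates that structure on made-up data; cond_code is a hypothetical stand-in for a condition variable, not a real MEPS variable name.

```stata
* Toy person-level "extract" (made-up values)
clear
input str10 mepsid byte age
"2110001101" 34
"2110001102" 36
end
tempfile persons
save `persons'

* Toy conditions file; cond_code is a hypothetical stand-in variable
clear
input str10 mepsid str3 cond_code
"2110001101" "A01"
"2110001101" "B02"
end

merge m:1 mepsid using `persons'

* Flag records that carry a condition, and count conditions per person
gen byte has_cond = (_merge == 3)
egen n_cond = total(_merge == 3), by(mepsid)
list mepsid age cond_code has_cond n_cond
```

Person-level variables such as age are repeated on every condition record, so person-level estimates should not be computed from this file without first reducing it to one record per person.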
