Changes in Data Collection, Weighting, and Sample Design Introduced for the COVID Pandemic

The COVID-19 pandemic disrupted the collection of MEPS data in 2020. Among other disruptions, it forced a transition from face-to-face to telephone interviewing, and response rates were lower than usual. Further, MEPS sample design and sample weighting procedures depend on two other federal surveys, the Current Population Survey (CPS) and National Health Interview Survey (NHIS), which were also impacted by the national shutdown, resulting in nonresponse bias that had repercussions for MEPS Full Year 2020 (FY20) data quality. As a result, the Agency for Healthcare Research and Quality (AHRQ), the federal statistical agency responsible for the collection of MEPS data, implemented changes to MEPS sample design and survey weight methodology. In particular, AHRQ: 1. temporarily changed the MEPS panel design to incorporate three overlapping panels (rather than the typical two), and 2. adapted its survey weight methodology to adjust for underreporting in NHIS.

1. MEPS sample design changes: temporary introduction of additional sequential panels

Users should be aware of some key sample design changes that AHRQ implemented to accommodate lower response rates during the pandemic. MEPS is a longitudinal survey consisting of five interview rounds across a two-year time period. Panels are designed in a sequential, overlapping structure. In a typical year, two panels are interviewed in parallel: one panel in its second year of data collection (Rounds 3, 4, and 5) and one in its first year of data collection (Rounds 1, 2, and 3) (see Figure 1). Because fewer respondents were expected to enter the survey in 2020 (Panel 25), AHRQ decided to extend Panel 23 for at least another year to increase the number of respondents in FY20 data (see Figure 2). For more information about panel design, see our IPUMS MEPS Panel Design User Note.
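The overlapping structure described above can be written as a small lookup. The sketch below is illustrative only: the function names `first_year` and `rounds_in_year` are hypothetical and not part of any MEPS or IPUMS tool. It assumes one new panel enters per year, with Panel 25 entering in 2020, and it encodes only the typical two-year design plus an optional third year covered by Rounds 6 and 7, as described for Panel 23 in 2020.

```python
# Illustrative sketch only; names are hypothetical, not part of any MEPS or IPUMS API.

def first_year(panel: int) -> int:
    """First data year of a panel, assuming one new panel per year and Panel 25 = 2020."""
    return 2020 - (25 - panel)


def rounds_in_year(panel: int, year: int, extended: bool = False) -> list[int]:
    """Interview rounds of `panel` that cover calendar year `year`.

    Typical design: Rounds 1-3 in the first year, Rounds 3-5 in the second.
    If `extended` is True, a third year covered by Rounds 6-7 is added,
    as the text describes for Panel 23 in 2020.
    """
    offset = year - first_year(panel)
    if offset == 0:
        return [1, 2, 3]
    if offset == 1:
        return [3, 4, 5]
    if offset == 2 and extended:
        return [6, 7]
    return []


# Panels contributing to calendar year 2020 under the extended design
for panel in (23, 24, 25):
    print(panel, rounds_in_year(panel, 2020, extended=(panel == 23)))
```

Calling `rounds_in_year(23, 2020)` without `extended=True` returns an empty list, which corresponds to the typical design in which a panel leaves the survey after its second year.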

Figure 1: Typical MEPS sample design (post-2018)

Figure 2: Actual MEPS sample design (post-2018)

Adjustments to round-level variables

As described above, MEPS collects data about individual and family health status and conditions, sociodemographic characteristics, and health care events that take place over a two-year period through five in-person interviews ("rounds," for short). In each round, the interviewer asks the family respondent to recall details related to health, health care access and utilization, and demographic changes that have occurred since the beginning of the calendar year (for the Round 1 interview) or since the previous interview. Because MEPS data collection takes place continuously over the course of each calendar year, the length of the time period about which respondents are asked to remember and share information (the "recall period") varies from respondent to respondent. It also varies systematically by interview round (see Figure 4).
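The recall-period rule in the preceding paragraph amounts to simple date arithmetic. The following is a minimal sketch under stated assumptions: the function name `recall_period_months` is hypothetical, interview dates are assumed to be available as calendar dates, and months are approximated by an average month length of 365.25 / 12 days. It is not how AHRQ or IPUMS computes reference periods.

```python
from datetime import date
from typing import Optional

# Average month length in days; an assumption made for illustration only.
DAYS_PER_MONTH = 365.25 / 12


def recall_period_months(interview: date, previous_interview: Optional[date] = None) -> float:
    """Approximate length of the recall period in months.

    With no previous interview (Round 1), the period starts on January 1 of
    the interview year; otherwise it starts at the previous interview date.
    """
    start = previous_interview or date(interview.year, 1, 1)
    return (interview - start).days / DAYS_PER_MONTH


# Round 1 interview on April 1: recall period runs from January 1
print(round(recall_period_months(date(2019, 4, 1)), 2))

# Later round: recall period runs from the previous interview
print(round(recall_period_months(date(2019, 10, 1), date(2019, 4, 1)), 2))
```

For a round whose reference period is reset to the start of the calendar year rather than the previous interview, the start date can be passed explicitly, for example `recall_period_months(interview_date, date(2020, 1, 1))`.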

AHRQ's COVID-related introduction of additional interviews for households in Panels 23 and 24 necessitated adjustments to fit the expanded panel data into the traditional three-round-per-calendar-year MEPS data structure. In the original 2020 data produced by AHRQ, for example, the variables reserved for information collected during interview Rounds 3 and 5 (for the panel in its first year of data collection and the panel in its second year of data collection, respectively) also include information collected during Round 7 for members of Panel 23 households, even though the variable names on the data file remain unchanged from previous years of MEPS full year consolidated data files. In the IPUMS MEPS version of the data, the value of ROUNDRD ranges from 1 to 7 for the 2020 MEPS sample, with values of 6 and 7 indexing round-level records containing information collected during the Round 6 and 7 interviews with Panel 23 respondents. The COVID-related alterations to MEPS data collection in 2020 also produced a longer recall period for the sixth interview round than is typical (refer to Figure 4): an average of 7.5 months, compared with 6 or fewer months for the usual interview round.

Figure 3: Employment status by round (EMPSTATRD), 2019 and 2020

2. Changes to the development of Full Year person-level weights

Data quality issues

The addition of Round 6 to Panel 23 introduced potential bias due to its long recall period; interviews covered all events from January 1, 2020 to the date of the interview. The average recall period for respondents in Round 6 of Panel 23 was about 7.5 months, roughly 2 months longer than the recall period typical of the longer rounds (Rounds 2, 3, and 4) (see Figure 4). This longer recall period in Round 6 resulted in underreporting of less salient events such as dental and office-based physician visits, and sample weights for Panel 23 data were adjusted to address the resulting potential bias in FY20 data. Under the typical sample design, only two panels must be combined to produce annual health care estimates for the calendar year. With the extension of Panel 23, three panels of data needed to be pooled to produce estimates for calendar year 2020 (Panel 25 Rounds 1 through 3, Panel 24 Rounds 3 through 5, and Panel 23 Rounds 6 and 7).

Figure 4: Average Reference Period in Months by Panel and Round

Adjustments to sampling weight methodology

AHRQ staff developed 2020 full-year person-level weights by creating initial person-level weights for Panels 23, 24, and 25 separately. The weighting process involved two steps: first, AHRQ adjusted the initial person-level weights to account for nonresponse over time; second, it raked the nonresponse-adjusted weights to calibrate each panel to CPS population estimates based on age, sex, race, education, census region, and MSA status.

AHRQ developed the initial Panel 23 person-level weight by assigning the individual 2019 full-year weights to 2019 survey participants present in 2020. In-scope survey participants (i.e., members of the civilian, noninstitutionalized US population) who joined the survey sometime in 2020 after being out-of-scope in 2019 were initially assigned the 2019 family weight. Next, AHRQ adjusted weights for person-level nonresponse during Rounds 6 and 7 and then raked to population control totals derived from the March 2021 CPS to reflect estimated population totals as of December 31, 2020. As mentioned above, education was one of the six variables used by AHRQ in the raking process. However, there is evidence that the onset of the COVID-19 pandemic affected estimates of income and education in 2020 and 2021 CPS data. Without reliable education estimates from the CPS data, AHRQ adopted a regression approach to derive education control totals for weight raking. The same process was used to develop person-level weights for Panels 24 and 25, with one important difference: the 2020 MEPS Round 1 person-level weight was used as the “base” weight for Panel 25, instead of the 2019 full-year weight.

AHRQ developed final person-level weights for 2020 by first combining weights from the three panels using factors representing the “effective sample size” of each panel (i.e., the proportion of the total sample size made up of each respective panel): 0.29 for Panel 23, 0.36 for Panel 24, and 0.35 for Panel 25. Finally, AHRQ staff raked the combined person-level weights using the same six variables that were used to rake individual panels (age, sex, race, education, census region, and MSA status). For more information about how sampling weights are constructed in MEPS, please see our MEPS Sample Weights User Note.
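The two final steps described above, combining panel weights with fixed factors and then raking, can be illustrated with a toy example. This is a sketch under stated assumptions, not AHRQ's production procedure: it assumes that combining means multiplying each panel's weight by its factor, it uses iterative proportional fitting as a generic raking algorithm, and the function names, the two raking variables, and the control totals are all hypothetical. The actual process rakes on six variables to CPS-based control totals.

```python
# Illustrative sketch only: toy data and hypothetical names, not AHRQ's procedure.

# Panel compositing factors quoted in the text
PANEL_FACTORS = {23: 0.29, 24: 0.36, 25: 0.35}


def composite_weight(panel: int, panel_weight: float) -> float:
    """Scale a panel-specific weight by that panel's compositing factor."""
    return PANEL_FACTORS[panel] * panel_weight


def rake(records, weights, controls, max_iter=100, tol=1e-8):
    """Rake weights to control totals by iterative proportional fitting.

    records  : list of dicts, e.g. {"sex": "F", "region": "A"}
    weights  : list of starting weights, one per record
    controls : {variable: {category: target_total}}
    Assumes every category with a control total has at least one record
    with a positive weight, and that the control totals are mutually consistent.
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in controls.items():
            # Current weighted total in each category of this variable
            totals = {cat: 0.0 for cat in targets}
            for rec, wi in zip(records, w):
                totals[rec[var]] += wi
            # Scale each record so this margin matches its control total
            for i, rec in enumerate(records):
                factor = targets[rec[var]] / totals[rec[var]]
                w[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return w


# Toy sample: four people, one per sex-by-region cell, equal starting weights
people = [
    {"sex": "F", "region": "A"},
    {"sex": "F", "region": "B"},
    {"sex": "M", "region": "A"},
    {"sex": "M", "region": "B"},
]
# Made-up control totals (both margins sum to 100)
controls = {"sex": {"F": 60.0, "M": 40.0}, "region": {"A": 50.0, "B": 50.0}}

raked = rake(people, [1.0] * 4, controls)
print([round(x, 4) for x in raked])
```

Because the three factors sum to 1, a weighted total computed from the pooled panels estimates the population once rather than three times, provided each panel's own weights sum to the population total.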