The Variance Is the Story

CDC WONDER suppresses your county data. CMS gives you 328,000 rows with cryptic headers. You have four days. Here's what happens when you put both datasets in one place.

Riley Hilliard
Director of High-Fives · Apr 18, 2026 · 9 min

Say you get a tip: a state legislator’s office says overdose deaths jumped forty percent last year, even though opioid prescribing has been falling for a decade. Either the number is wrong, or the thing everyone did to fix the crisis didn’t fix it. You need to check both claims against federal data by Friday.

You already know where the data lives. CDC WONDER for mortality, CMS for Medicare Part D prescribing. You also know the problems. WONDER suppresses any county cell with fewer than ten deaths and caps most queries at 75,000 rows. The CMS file is 328,890 rows with headers like tot_opioid_prscrbrs and la_opioid_prscrbng_rate_5y_chg. The two agencies use different geographic identifiers. Before you can say anything about the tip, you have to spend a morning downloading, reconciling, and cleaning files that were never designed to work together.

OpenData puts both datasets in one place: same API, same filtering syntax, same column names that actually make sense. The cross-agency join that eats your morning is a URL parameter.

What CDC WONDER wants from you

Open WONDER’s multiple cause-of-death interface and you will click through a dozen grouping menus before you can submit a single query. Ask for county-level opioid deaths and the interface returns rows like this:

"Notes"   "County"           "Year"  "Deaths"     "Population"  "Crude Rate"
          "Autauga, AL"      "2021"  "Suppressed" "58,805"      "Suppressed"
          "Baldwin, AL"      "2021"  "Suppressed" "231,767"     "Suppressed"
          "Jefferson, AL"    "2021"  98           "674,721"     14.5
"Suppressed counts are not shown to protect confidentiality."

Cells with fewer than ten deaths are withheld for privacy. Most queries are capped at 75,000 rows. County-level drug deaths come back sparse for half the map, and the export is a tab-delimited file with hierarchical group columns stacked ahead of the data. NICAR tipsheets are explicit about the workaround: widen the geography, or call the state health department. That advice is correct. It is also not a four-day plan.
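If you do stay with the raw export, the first chore is turning "Suppressed" into an explicit missing value instead of letting it break a numeric column. Here is a minimal sketch of that step using only the Python standard library. It assumes the export is shaped like the sample rows above; a real WONDER file carries a longer notes footer, so treat the skip rule as a starting point rather than a complete parser.

```python
import csv
import io

# Sample shaped like the rows shown above (tab-delimited, as in a WONDER export).
SAMPLE = (
    '"Notes"\t"County"\t"Year"\t"Deaths"\t"Population"\t"Crude Rate"\n'
    '\t"Autauga, AL"\t"2021"\t"Suppressed"\t"58,805"\t"Suppressed"\n'
    '\t"Baldwin, AL"\t"2021"\t"Suppressed"\t"231,767"\t"Suppressed"\n'
    '\t"Jefferson, AL"\t"2021"\t98\t"674,721"\t14.5\n'
    '"Suppressed counts are not shown to protect confidentiality."\n'
)


def to_number(text):
    """Return a float, or None for suppressed or empty cells."""
    cleaned = text.replace(",", "").strip()
    if cleaned == "" or cleaned == "Suppressed":
        return None
    return float(cleaned)


def parse_wonder(text):
    """Parse a WONDER-style export into a list of dicts, one per county row."""
    rows = []
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    for row in reader:
        # Footer note lines fill only the first column, so County is missing.
        if not row.get("County"):
            continue
        rows.append({
            "county": row["County"],
            "year": int(row["Year"]),
            "deaths": to_number(row["Deaths"]),
            "population": to_number(row["Population"]),
            "crude_rate": to_number(row["Crude Rate"]),
        })
    return rows


rows = parse_wonder(SAMPLE)
suppressed = [r["county"] for r in rows if r["deaths"] is None]
print(f"{len(suppressed)} of {len(rows)} counties suppressed: {suppressed}")
```

Keeping suppressed cells as None rather than zero matters: a suppressed county had between one and nine deaths, not none.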

The CMS file is a different problem. Medicare Part D opioid prescribing by geography arrives as a 328,890-row CSV with headers like tot_opioid_prscrbrs and opioid_prscrbng_rate_5y_chg. Every row has three breakout dimensions (overall, rural, urban) stacked in a single table. Pulling the state-level overall rate requires filtering on two of them before anything else lines up.

This is the work the opioid crisis has buried reporters under for fifteen years. Data is Plural flagged the CMS file the week it dropped. Getting it into shape was still its own afternoon.

The curve rose eightfold in eight years, then pulled back. The 2024 decline is real but provisional. Every number in this chart came from one query.

That national shape is the context. But the story you got a tip about is a single state. For that, you need to see what happened in Oregon specifically, and for that you need data from two agencies that don’t talk to each other.

Same data, before and after

Here’s what you actually work with when you pull prescribing data from CMS, followed by the same data on OpenData.

That is the state-level overall opioid prescribing rate for Oregon, buried inside a table with three overlapping breakout dimensions and column names that require the CMS data dictionary to decode. prscrbr_geo_cd is the FIPS code. la_opioid_prscrbng_rate is the long-acting opioid rate. tot_opioid_clms is total opioid claims.

Two agencies, two completely different source formats, one consistent experience. The CMS file needed three filters to isolate the state-level overall rate. The CDC file needed a specific period filter because mortality data is published as overlapping 12-month rolling windows. On OpenData, both return clean results with the same filtering approach.
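For a sense of what those filters amount to when you work from the raw files, here is a rough sketch in plain Python. The breakout values ('Totals', 'Overall') and the CDC period values come from the query later in this article. The geography-level column name prscrbr_geo_lvl and the 'Rural/Urban' label are assumptions for illustration, and the sample rows carry no real rates.

```python
# Illustrative rows only. Rates are left as None on purpose; prscrbr_geo_lvl
# and the 'Rural/Urban' label are assumed names, not confirmed CMS values.
cms_rows = [
    {"prscrbr_geo_lvl": "State", "prscrbr_geo_desc": "Oregon", "year": 2023,
     "breakout_type": "Totals", "breakout": "Overall", "opioid_prscrbng_rate": None},
    {"prscrbr_geo_lvl": "State", "prscrbr_geo_desc": "Oregon", "year": 2023,
     "breakout_type": "Rural/Urban", "breakout": "Rural", "opioid_prscrbng_rate": None},
    {"prscrbr_geo_lvl": "County", "prscrbr_geo_desc": "Oregon", "year": 2023,
     "breakout_type": "Totals", "breakout": "Overall", "opioid_prscrbng_rate": None},
]

# The December row uses the 1,454 figure cited in this article; the June row
# is a placeholder with no count.
cdc_rows = [
    {"state_name": "Oregon", "year": 2023, "month": "December",
     "period": "12 month-ending", "drug_category": "All Opioids", "death_count": 1454},
    {"state_name": "Oregon", "year": 2023, "month": "June",
     "period": "12 month-ending", "drug_category": "All Opioids", "death_count": None},
]


def state_overall_rows(rows, state, year):
    """CMS: keep state-level rows with the overall (non-rural/urban) breakout."""
    return [
        r for r in rows
        if r["prscrbr_geo_lvl"] == "State"        # filter 1: geography level
        and r["breakout_type"] == "Totals"        # filter 2: breakout type
        and r["breakout"] == "Overall"            # filter 3: breakout value
        and r["prscrbr_geo_desc"].lower() == state.lower()
        and r["year"] == year
    ]


def annual_death_rows(rows, state, year, category="All Opioids"):
    """CDC: the 12-month window ending in December stands in for the calendar year."""
    return [
        r for r in rows
        if r["period"] == "12 month-ending"
        and r["month"] == "December"
        and r["drug_category"] == category
        and r["state_name"].lower() == state.lower()
        and r["year"] == year
    ]


cms_match = state_overall_rows(cms_rows, "Oregon", 2023)
cdc_match = annual_death_rows(cdc_rows, "Oregon", 2023)
print(len(cms_match), cdc_match[0]["death_count"])
```

Skipping the December filter is the easy mistake here: overlapping rolling windows would count the same deaths several times.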

What the clean data shows

Once the formatting headaches are out of the way, the actual analysis takes minutes, not a morning.

Prescribing fell everywhere

The supply-side intervention that followed the 2016 CDC Prescribing Guideline did what it was supposed to do in every state. New Hampshire cut the hardest at 44%. Even states that started with lower rates dropped by at least a third. This is the chart a legislative aide would show to argue that prescribing reform worked. It did work. The next chart shows what happened anyway.

New Hampshire cut prescribing the hardest. Every state on this chart reduced prescribing by at least 30%. The policy worked as designed.

Deaths rose almost everywhere anyway

Every state on the previous chart cut prescribing by at least a third. Look at what happened to deaths over the same period. Oregon’s deaths rose 343% while its prescribing fell 34%. New Hampshire cut prescribing the hardest (44%) and deaths barely moved. The prescribing bars are nearly uniform. The death bars are not. That gap is the story.

Five states, same intervention, wildly different outcomes. Prescribing fell uniformly. Deaths did not.
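A quick sanity check on what a figure like "rose 343%" means. A 343% increase puts the end value at 4.43 times the baseline, not 3.43 times. The sketch below uses the 2023 Oregon count of 1,454 cited later in this article to back out the implied 2015 baseline. The helper names are just for illustration.

```python
def percent_change(start, end):
    """Percentage change from start to end."""
    if start == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return (end - start) * 100.0 / start


def implied_baseline(end, change_pct):
    """Back out the starting value from an end value and a percent change."""
    return end / (1 + change_pct / 100.0)


# Oregon: deaths rose 343% to reach 1,454 in 2023.
baseline = implied_baseline(1454, 343)
print(f"implied 2015 baseline: about {baseline:.0f}")

# Round trip: a baseline of 328 reproduces the published change.
print(f"check: {percent_change(328, 1454):.1f}%")
```

The implied baseline lands near 328, inside the 306 to 346 range reported for Oregon's pre-2020 years, so the percentage and the counts agree with each other.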

Oregon: the 2020 hinge

Year by year, Oregon’s death count was essentially flat from 2015 to 2019, ranging between 306 and 346. Then it doubled in a single year, doubled again, and hit 1,454 in 2023. That inflection is what illicit fentanyl arriving at scale looks like in the data. Oregon’s fentanyl-specific deaths went from 33 in 2015 to 1,338 in 2023, roughly a 40x increase over eight years. Heroin deaths collapsed from 158 in 2018 to 20 in 2024 as users switched to a cheaper, stronger supply.

Deaths were essentially flat for five years. Then fentanyl arrived. Oregon's fentanyl-specific deaths went from 33 in 2015 to 1,338 in 2023, a 40x increase.

The supply data confirms what the death data implies. UNODC seizure reports show fentanyl was barely a blip in 2015, just 95 kg seized nationwide. By 2022, US law enforcement was pulling 16,123 kg off the market. Over the same period, heroin seizures fell from 9,485 kg to 560 kg. The crossover happened in 2020, the same year Oregon’s death curve broke upward.

Fentanyl replaced heroin in the supply chain. The crossover happened in 2020, the same year deaths broke upward in Oregon and most other states.

The join that eats the morning

To answer “did prescribing reform work in Oregon?” you need CMS prescribing rates (keyed on prscrbr_geo_desc with breakout dimensions) and CDC death counts (keyed on state_name with period filters). Normally you download both files, reconcile the geographic identifiers, align the time grains (CMS is annual, CDC is monthly rolling windows), and merge. On OpenData, both datasets live in the same system.

SELECT
  d.state_name AS state,
  d.death_count AS deaths_2023,
  p.opioid_prscrbng_rate AS rx_rate_2023,
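  -- Percent change in deaths, 2015 to 2023; NULLIF guards against a zero baseline
  ROUND(
    (d.death_count - d0.death_count) * 100.0
    / NULLIF(d0.death_count, 0), 1
  ) AS death_change_pct,
  -- Percent change in the prescribing rate over the same span
  ROUND(
    (p.opioid_prscrbng_rate - p0.opioid_prscrbng_rate) * 100.0
    / NULLIF(p0.opioid_prscrbng_rate, 0), 1
  ) AS rx_change_pct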
FROM "cdc/drug-overdose-deaths" d
JOIN "cdc/drug-overdose-deaths" d0
  ON d.state_name = d0.state_name
JOIN "cms/medicare-opioid-prescribing" p
  ON LOWER(d.state_name) = LOWER(p.prscrbr_geo_desc)
JOIN "cms/medicare-opioid-prescribing" p0
  ON p.prscrbr_geo_desc = p0.prscrbr_geo_desc
WHERE d.year = 2023
  AND d.drug_category = 'All Opioids'
  AND d.period = '12 month-ending'
  AND d.month = 'December'
  AND d0.year = 2015
  AND d0.drug_category = 'All Opioids'
  AND d0.period = '12 month-ending'
  AND d0.month = 'December'
  AND p.year = 2023
  AND p.breakout_type = 'Totals'
  AND p.breakout = 'Overall'
  AND p0.year = 2015
  AND p0.breakout_type = 'Totals'
  AND p0.breakout = 'Overall'
ORDER BY death_change_pct DESC

You don’t need to write SQL to use the platform (the earlier examples in this article use simple URL filters). But the option is there when you need to combine datasets in ways that a filter can’t handle. Here’s what that join looks like visually:

Both series indexed to 2015 = 100. One line goes down. The other goes up. This finding requires combining two datasets from two agencies.

You can’t see this divergence in any single government dataset. It only shows up when you combine mortality data with prescribing data.
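Indexing is what lets a rate and a count share one axis: divide every value by the base-year value and multiply by 100. Here is a minimal sketch of that calculation. The three-point series is a placeholder to show the mechanics, not real prescribing data. The endpoint check uses the Oregon changes quoted earlier in this article.

```python
def index_to_base(series, base_year):
    """Rescale a {year: value} series so the base year equals 100."""
    base = series[base_year]
    if base == 0:
        raise ValueError("cannot index to a zero base value")
    return {year: value * 100.0 / base for year, value in series.items()}


def index_from_change(change_pct):
    """End-of-period index implied by a percent change from a base of 100."""
    return 100.0 + change_pct


# Placeholder series to show the mechanics (not real data).
example = index_to_base({2015: 50.0, 2019: 40.0, 2023: 33.0}, base_year=2015)
print(example)

# Oregon endpoints implied by the changes quoted above.
rx_index_2023 = index_from_change(-34)      # prescribing fell 34%
death_index_2023 = index_from_change(343)   # deaths rose 343%
print(rx_index_2023, death_index_2023)
```

On an indexed chart, Oregon's prescribing line ends at 66 and its death line ends at 443, which is why the two series pull apart so visibly.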

Finding what you don’t know exists

The join is the hard part. But before you can join anything, you have to find it. Knowing that the CDC publishes overdose deaths is table stakes. Knowing that SAMHSA tracks the insurance status of people entering opioid treatment, or that UNODC reports fentanyl seizures by country, requires either institutional knowledge or hours of browsing agency websites.

On OpenData, a search for “opioid” returns results across every provider. Every dataset page shows related datasets that share geographic, temporal, or thematic connections. The CDC overdose deaths page surfaces CMS prescribing data because they share the same state-level geography and overlapping time ranges. The platform’s dataset graph makes these relationships visible:

Each node is a dataset, colored by category. Edges represent detected similarity. The opioid datasets from this article cluster together because they share joinable geographic and temporal dimensions.

OpenData also ships with agent skills for Claude, ChatGPT, and other AI tools. Load the skill, point an agent at a research question, and it hits the platform’s discovery endpoint, finds relevant datasets, queries them, and comes back with findings and source attribution. Tell an agent to investigate the opioid supply chain and it will find UNODC seizure data, SAMHSA treatment admissions, CMS prescribing rates, and CDC mortality data on its own. It will notice that SAMHSA’s treatment data includes insurance status, which gives you a health-coverage angle you might not have considered. The agent doesn’t replace the journalism. It replaces the hours you’d spend navigating agency sites and hoping you stumble across the right dataset.

The datasets

Every dataset referenced in this article is on the platform. Here’s what’s available.

Mortality and overdose: CDC provisional drug overdose death counts (67,242 rows, monthly by state, 2015-present, 11 drug categories including fentanyl and heroin breakouts), NCHS county-level drug poisoning mortality (59,584 rows, 2003-2021, model-based rates that fill the gaps WONDER suppresses)

Prescribing: CMS Medicare Part D opioid prescribing rates (328,890 rows, 2013-2023, national/state/county/ZIP levels, with rural/urban breakouts and year-over-year change columns)

Treatment: SAMHSA TEDS treatment episodes (1,625,833 rows, 2023, individual-level treatment admissions with substance, demographics, insurance status, and injection drug use flags)

Supply side: UNODC World Drug Report (19,322 rows, international drug seizures by country, drug type, and year from WDR 2021 and 2025 editions, includes US fentanyl and heroin seizure quantities)

NCHS drug poisoning mortality is worth calling out on its own. The model-based estimates use statistical smoothing to produce reliable county-level death rates even where raw counts would be suppressed by WONDER. It’s the county-level dataset that WONDER can’t give you.

Show your work

Before you file, the fact-checker wants to know where the number came from. “1,454 opioid deaths in Oregon in 2023” needs a source link, not a screenshot of a spreadsheet. Every row in an OpenData response carries a system column called _source_url that points back to the authoritative file the row was parsed from. Pass include_sources=true on any query and it appears alongside the data.

That URL points at the CDC file OpenData pulled the row from. Your editor clicks the link, sees the primary source, and moves on. If CDC revises a number, the response reflects it the next time it syncs. Nothing is scraped into an opaque blob.
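As a rough sketch of how a request with sources turned on might be assembled: the include_sources parameter and the _source_url column are described above, but the base URL, the path layout, and the filter parameter names below are hypothetical placeholders. Check the API docs for the real syntax before relying on any of it.

```python
from urllib.parse import urlencode

# Hypothetical placeholder, NOT the real OpenData endpoint.
BASE_URL = "https://api.example.invalid/datasets"


def build_query_url(dataset, filters, include_sources=True):
    """Build a query URL; the filter names are assumed for illustration."""
    params = dict(filters)
    if include_sources:
        params["include_sources"] = "true"
    return f"{BASE_URL}/{dataset}?{urlencode(params)}"


def source_urls(rows):
    """Collect the distinct _source_url values from a list of response rows."""
    return sorted({r["_source_url"] for r in rows if r.get("_source_url")})


url = build_query_url(
    "cdc/drug-overdose-deaths",
    {"state_name": "Oregon", "year": 2023},
)
print(url)

# Placeholder response rows; a real response would carry the agency's file URL.
sample_rows = [
    {"death_count": 1454, "_source_url": "https://example.invalid/cdc-file.csv"},
    {"death_count": None, "_source_url": "https://example.invalid/cdc-file.csv"},
]
print(source_urls(sample_rows))
```

Collecting the distinct source URLs gives you a short list to paste into the methods note or hand to the fact-checker.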

When an AI assistant pulls an overdose count from OpenData, it’s the actual CDC number, not something the model guessed from training data, and the source URL is there to prove it. The API docs have the full specification.

Hadley Wickham’s line still holds: your default position should be skepticism. Provenance is what makes skepticism possible.

Friday morning

Friday morning, you open the draft. The legislator’s claim was about one state. The question under it was structural. The charts answered it: prescribing reform worked in every state, and deaths rose anyway in almost every state, because the thing killing people stopped being the prescription pad years ago.

Almost every state. New Hampshire cut prescribing the hardest, 44%, and deaths barely moved. By 2024 its provisional count had fallen to 246. That outlier is its own assignment. The national chart told you where to look. The variance is the story.

A week’s worth of reporting compressed into a few queries. The cells are not suppressed. The file has a URL your editor can click.

Every dataset in this article is free to query at tryopendata.ai with no signup required.

Datasets used:

  • cdc/drug-overdose-deaths — CDC provisional drug overdose death counts, monthly by state, 2015-present. 67,242 rows, 11 drug categories including synthetic opioids (fentanyl) and heroin breakouts. Filtered to period = '12 month-ending', month = 'December' for annualized counts. Source: CDC National Center for Health Statistics, Vital Statistics Rapid Release.
  • cms/medicare-opioid-prescribing — Medicare Part D opioid prescribing rates by geography, 2013-2023. 328,890 rows with national/state/county/ZIP levels and rural/urban breakouts. Filtered to breakout_type = 'Totals', breakout = 'Overall' for state-level overall rates. Source: CMS.gov.
  • cdc/nchs-drug-poisoning-mortality — NCHS model-based county-level drug poisoning mortality estimates, 2003-2021. 59,584 rows. Uses statistical smoothing to produce reliable rates where raw counts would be suppressed by WONDER. Source: NCHS.
  • samhsa/teds-treatment-episodes — SAMHSA Treatment Episode Data Set, 2023. 1,625,833 individual-level treatment admissions with substance, demographics, insurance status, and injection drug use flags. Source: SAMHSA.
  • unodc/world-drug-report — International drug seizures by country, drug type, and year. 19,322 rows compiled from WDR 2021 (2015-2019) and WDR 2025 (2019-2023) editions. US fentanyl and heroin seizure quantities in kilograms. Source: United Nations Office on Drugs and Crime.

Calculations & transformations:

  • Prescribing rate change computed as percentage change between 2015 and 2023 values for each state’s overall Medicare Part D opioid prescribing rate.
  • Death count change computed as percentage change between 2015 and 2023 twelve-month-ending December counts for “All Opioids” category by state.
  • Oregon indexed chart: both series indexed to 2015 = 100. Prescribing index uses the state’s annual overall rate; death index uses the twelve-month-ending December “All Opioids” count.
  • National fentanyl death chart uses the “Synthetic Opioids” drug category from CDC VSRR, twelve-month-ending December. The 2024 value (48,661) is provisional and subject to revision.
  • UNODC seizure data filtered to United States, fentanyl and heroin drug groups. Quantities reported in kilograms as published.

Limitations:

  • CDC provisional death counts are subject to revision and may undercount recent months due to reporting lag. The 2024 decline shown in the national chart is real but provisional.
  • Medicare Part D prescribing data covers Medicare beneficiaries only (primarily age 65+), not the general population. Prescribing trends in younger populations may differ.
  • UNODC seizure data has a multi-year reporting lag. The most recent WDR edition (2025) covers through 2023. Seizure quantities reflect law enforcement activity and do not directly measure supply volumes.
  • CDC WONDER suppresses county cells with fewer than ten deaths. The NCHS model-based estimates fill these gaps with statistical smoothing but are modeled values, not raw counts.
  • The cross-agency comparison (CMS prescribing vs. CDC deaths) covers different populations. Prescribing rates reflect Medicare enrollees; death counts reflect all residents regardless of insurance status. The directional finding (prescribing down, deaths up) holds, but the populations are not identical.
  • State-level analysis masks sub-state variation. County-level patterns within Oregon, for example, may diverge significantly from the state trend.
  • SAMHSA TEDS data is referenced in the text but not charted. Treatment admissions are voluntary reports from state agencies and do not represent all treatment episodes nationally.

Data accessed on 2026-04-18 via the OpenData API.


OpenData is in active development. The datasets in this article live at tryopendata.ai.

Datasets used in this article

All datasets are queryable via API. Filter, sort, and download as CSV, JSON, or Parquet.

Riley Hilliard

Director of High-Fives

At 13, I secretly drilled holes in my parents' wood floor to route a 56k modem line to my bedroom for late-night Age of Empires marathons. That same scrappy curiosity carried through 3 acquisitions, 9 years as a LinkedIn Staff Engineer building infrastructure for 1B+ users, and now fuels my side projects, like OpenData.

