Welcome to the OpenData Blog

Introducing the OpenData blog. We'll be sharing project updates, deep dives into open data infrastructure, and lessons learned building a platform for public datasets.

Creator of OpenData·Feb 25, 2026·3 min

We finally have a blog. It took longer than it should have, but here we are.

OpenData started as a side project born out of frustration: public data is everywhere, but actually using it is a mess. You find a CSV on some government site, download it, fight with encoding issues, realize half the columns are undocumented, and repeat. We wanted a place where open datasets are discoverable, consistently formatted, and queryable without the usual headaches.

What we’ll write about

This blog is where we’ll share what we’re learning as we build this thing out. Expect posts on:

Infrastructure deep dives covering how we handle ingestion, storage, and querying at scale
Data source spotlights walking through interesting public datasets and what you can do with them
Project updates on new features, providers, and API changes
Lessons learned from the trenches of building data infrastructure

Quick taste of the API

If you haven’t poked around yet, here’s what querying a dataset looks like:

curl "https://opnhub.ai/v1/datasets/bls/unemployment-rate/query?limit=5"

{
  "data": [
    { "year": 2025, "month": 12, "rate": 4.1 },
    { "year": 2025, "month": 11, "rate": 4.2 }
  ],
  "total_rows": 1842,
  "columns": ["year", "month", "rate"]
}

Every dataset gets a stable API endpoint backed by Parquet files and DuckDB. No API keys needed for public data.
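For anyone who would rather call the endpoint from a script than from curl, here is a minimal Python sketch of the same request. It uses only the standard library. The URL pattern and response shape are copied from the example above. Reading the two path segments as a provider and a dataset slug is our assumption, and the helper names (`build_query_url`, `parse_rows`, `fetch_rows`) are hypothetical, not part of any official client.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Host and path prefix copied from the curl example; nothing else is assumed about the API.
BASE_URL = "https://opnhub.ai/v1"


def build_query_url(provider: str, dataset: str, limit: int = 5) -> str:
    """Build a query URL following the pattern shown in the curl example.

    Treating the two path segments as "provider" and "dataset" is an assumption.
    """
    path = f"/datasets/{quote(provider, safe='')}/{quote(dataset, safe='')}/query"
    return f"{BASE_URL}{path}?{urlencode({'limit': limit})}"


def parse_rows(body: str) -> list[dict]:
    """Extract the row objects from a response shaped like the sample above."""
    payload = json.loads(body)
    return payload["data"]


def fetch_rows(provider: str, dataset: str, limit: int = 5) -> list[dict]:
    """Fetch and parse rows over HTTP (hypothetical helper; performs a network call)."""
    with urlopen(build_query_url(provider, dataset, limit), timeout=10) as resp:
        return parse_rows(resp.read().decode("utf-8"))


# Offline demonstration using the sample response from this post.
SAMPLE_BODY = """
{
  "data": [
    { "year": 2025, "month": 12, "rate": 4.1 },
    { "year": 2025, "month": 11, "rate": 4.2 }
  ],
  "total_rows": 1842,
  "columns": ["year", "month", "rate"]
}
"""

for row in parse_rows(SAMPLE_BODY):
    print(f"{row['year']}-{row['month']:02d}: {row['rate']}%")
```

To hit the live endpoint instead of the embedded sample, call `fetch_rows("bls", "unemployment-rate")`. Keeping URL construction and parsing separate from the network call makes each piece easy to check on its own.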

Following along

The whole platform is open source under Apache 2.0. If you’re into open data, data engineering, or just want to see how the sausage gets made, stick around. We’ll try to keep these posts practical and worth your time.

More from OpenData

Why Your Charts Don't Get Shared (And Chartr's Do)

Chartr grew to 500K+ subscribers by making data visualization shareable. What they figured out about headline-first framing, minimal chrome, and social optimization applies to anyone making charts.

Riley Hilliard·Mar 26, 2026

Store Flat, Transform on Read

Why we store all data in long format and apply transforms at query time instead of pre-computing views. A technical deep dive into DuckDB, Parquet, and the architecture behind OpenData's query engine.

Riley Hilliard·Mar 19, 2026

70% of AI Training Datasets Have the Wrong License

A large-scale audit found that over 70% of popular AI datasets have missing or wrong license metadata. With the EU AI Act now enforcing training data transparency, this isn't just sloppy. It's a liability.

Riley Hilliard·Mar 12, 2026

Public Data Has a Discovery Problem

Government data is technically public but practically inaccessible. Here's what that actually costs researchers, journalists, and anyone trying to answer a question with data.

Riley Hilliard·Mar 5, 2026

The Hidden Mess Inside 'Clean' Government Data

Government data has a reputation for being clean and reliable. Anyone who's tried to ingest it programmatically knows that's not the full story. Here are the real encoding quirks, format traps, and silent failures hiding in data from FRED, BLS, Census, the World Bank, and the EPA.

Riley Hilliard·Feb 19, 2026

The State of Open Data Infrastructure in 2026

A survey of the open data landscape: what data.gov, Socrata, FRED, Kaggle, Hugging Face, and Datasette do well, what's still broken, and where the connective tissue between data sources is finally being built.

Riley Hilliard·Feb 12, 2026

Building a Headless Visualization Engine

How we separated chart computation from rendering by building a spec-driven visualization engine. The architecture behind @opendata/viz: four packages, a compilation pipeline, and zero DOM dependencies in the math layer.

Riley Hilliard·Feb 5, 2026

Bootstrapping a Data Platform on Two Mac Minis

OpenData runs in production on two Mac Minis at $0/month infrastructure cost. Here's the architecture, the tradeoffs, and the specific triggers that would move us to cloud.

Riley Hilliard·Jan 29, 2026

What Happens When All the World's Open Data Lives in One Place

Open data has a discovery problem, not an access problem. When you centralize datasets from hundreds of portals, entirely new capabilities emerge: knowledge graphs that reveal hidden connections, bridge datasets that make cross-agency joins possible, and a compounding network where every new dataset makes every existing one more useful.

Riley Hilliard·Jan 22, 2026
