April 5, 2026 Data Workshops

5 Public Datasets to Practice With

Datasets, Practice, Resources

One of the best ways to learn data analysis is to work with data you find genuinely interesting. Here are five public datasets that are great for practice.

1. NYC Open Data

The City of New York publishes thousands of datasets covering everything from 311 complaints to restaurant inspection grades. It’s local, it’s relevant, and it’s massive.

Good for: SQL practice, exploratory analysis, geospatial data

2. Kaggle Datasets

Kaggle hosts thousands of community-uploaded datasets on every topic imaginable, from Spotify listening history to global climate data. Many come with starter notebooks.

Good for: Guided exploration, machine learning practice

3. US Census Bureau

Demographics, economics, housing — the Census Bureau is one of the richest data sources available. The American Community Survey alone has hundreds of variables.

Good for: Statistical analysis, demographic research, data joins

4. FiveThirtyEight Data

The journalism site publishes the data behind its articles on GitHub. Sports, politics, economics: all cleaned and ready to analyze.

Good for: Reproducible analysis, learning how journalists use data

5. World Bank Open Data

Global development indicators across every country: GDP, literacy rates, life expectancy, and more.

Good for: Time-series analysis, visualization, international comparisons


Pick one that interests you, download it, and start asking questions. That’s how the learning happens. And if you want to do it with other people, join us at a workshop.
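
As a rough sketch of what "start asking questions" can look like once you have a CSV file, here is a first question in Python: which values show up most often in a column? It uses only the standard library. The sample rows and column names below are made-up placeholders, not taken from any of the datasets above, and `count_by_column` is a hypothetical helper written for this illustration.

```python
import csv
import io
from collections import Counter

# Made-up placeholder rows standing in for a downloaded CSV file.
# The column names are hypothetical, not from any real dataset.
SAMPLE_CSV = """complaint_type,borough
Noise,Brooklyn
Noise,Queens
Heat,Brooklyn
Noise,Brooklyn
"""


def count_by_column(csv_file, column):
    """Count how often each value appears in one column of a CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


# For a real download, swap io.StringIO(...) for something like
# open("your_file.csv", newline="", encoding="utf-8").
counts = count_by_column(io.StringIO(SAMPLE_CSV), "complaint_type")
for value, n in counts.most_common():
    print(value, n)
```

From there, each answer tends to suggest the next question: try a different column, filter the rows first, or compare two groups.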