Nov 10: Data Scavenger Hunt
Learning Objectives
After today's class, you should be able to:
- Demonstrate how to read/write lines from/to plain text files.
- Read data files written in CSV and JSON format into memory.
- Write short Python programs to answer questions from data.
Reminders
Nested Data
[20 min]
- CORGIS
- Collection of Really Great, Interesting, Situated Datasets
- Example: County Demographics
- CSV format (nested lists)
- JSON format (nested dictionaries)
- Python documentation
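To make the two shapes concrete, here is a minimal sketch that builds the same tiny table both ways. The county names, column names, and numbers are made up for illustration; they are not taken from the CORGIS files.

```python
# Hypothetical sample values, for illustration only (not real CORGIS data).

# CSV shape: a list of rows, where each row is a list of strings.
csv_style = [
    ["County", "State", "Population"],  # header row
    ["Example County", "VA", "1000"],
    ["Sample city", "VA", "2500"],
]

# JSON shape: a list of dictionaries, which may contain other dictionaries.
json_style = [
    {"County": "Example County", "State": "VA", "Population": {"2020": 1000}},
    {"County": "Sample city", "State": "VA", "Population": {"2020": 2500}},
]

# Nested lists are indexed by position: row 1, column 2.
first_population = int(csv_style[1][2])

# Nested dictionaries are indexed by key, one level at a time.
second_population = json_style[1]["Population"]["2020"]

print(first_population, second_population)
```

Note that the CSV-style values are all strings, so numbers need an explicit conversion such as int() before any arithmetic.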
Tip
Install the Rainbow CSV extension in VS Code to enable syntax highlighting for CSV files.
CSV example
import csv
from pprint import pprint

with open("county_demographics.csv", newline="") as file:
    reader = csv.reader(file)
    names = next(reader)  # Read the first line (the column names)
    data = [row for row in reader]  # Remaining rows, each a list of strings

pprint(data[0])  # Show the first data row
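As an alternative to indexing columns by position, the standard library's csv.DictReader pairs each row with the header names. The sketch below reads from an in-memory string instead of a file so that it runs on its own; the column names and values are hypothetical.

```python
import csv
import io

# Hypothetical CSV text standing in for a real file (made-up values).
sample = io.StringIO(
    "County,State,Population\n"
    "Example County,VA,1000\n"
    "Sample city,VA,2500\n"
)

# DictReader uses the first line as the keys for every later row.
reader = csv.DictReader(sample)
rows = list(reader)

# Every value is read as a string, so convert before doing arithmetic.
total = sum(int(row["Population"]) for row in rows)

print(rows[0]["County"])
print(total)
```

Looking up row["Population"] is usually easier to read than row[2], at the cost of building one dictionary per row.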
JSON example
import json
from pprint import pprint

with open("county_demographics.json") as file:
    data = json.load(file)  # A list of dictionaries, one per county

# Print the full record for one county
for item in data:
    if item["County"] == "Harrisonburg city":
        pprint(item)
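Once a record is found, values inside it are reached one key at a time. The following sketch assumes a made-up structure with a nested "Stats" dictionary; the real file's keys may differ, so print one record first to see its layout.

```python
import json

# Hypothetical JSON text standing in for a real file (made-up keys and values).
text = """
[
  {"County": "Example County", "Stats": {"Population": 1000}},
  {"County": "Sample city", "Stats": {"Population": 2500}}
]
"""

# json.loads() parses a string; json.load() reads from an open file.
data = json.loads(text)

# Find the matching record, then reach into its nested dictionary.
population = None
for item in data:
    if item["County"] == "Sample city":
        population = item["Stats"]["Population"]

# .get() returns a default instead of raising KeyError for a missing key.
area = data[0]["Stats"].get("Area")

print(population, area)
```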
Scavenger Hunt
[30 min]
Instructions
Download and extract scavenger.zip.
This archive contains starter code and the data files linked below.
Write code that answers the following questions.
Solutions for the first question of each data set are provided in the starter code.
At a minimum, complete ex2_csv() and ex2_json() for both data files.
Airlines Data
Airlines CSV File and Airlines JSON File
- How many unique airport codes are in the data?
- What are the unique names of all the carriers?
- Which airport had the most flights in Dec 2015?
- Which had the most security delays in one month?
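Questions like these usually come down to two patterns: collecting unique values in a set, and finding the largest value with max() or a running total. The sketch below shows both on a few made-up records; the field names "code", "month", and "flights" are assumptions for illustration and do not match the actual data files.

```python
from collections import Counter

# Hypothetical records; the field names and values are made up.
records = [
    {"code": "AAA", "month": "2015/12", "flights": 120},
    {"code": "BBB", "month": "2015/12", "flights": 300},
    {"code": "AAA", "month": "2016/01", "flights": 90},
]

# A set keeps only one copy of each value.
unique_codes = {record["code"] for record in records}

# max() with a key function returns the record with the largest value.
busiest = max(records, key=lambda record: record["flights"])

# A Counter accumulates totals per key across many records.
totals = Counter()
for record in records:
    totals[record["code"]] += record["flights"]

print(len(unique_codes))
print(busiest["code"])
print(totals.most_common(1))
```

To restrict a question to one month, filter the records first, for example with a list comprehension, and then apply the same patterns.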
Tate (Art) Data
Tate CSV File and Tate JSON File
- What is the oldest work of art in the collection?
- On average, how many words are in a work's title?
- Which artist has the most works of art in the data?
- How many unique artists in the data are still alive?
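Two more patterns that may help here are min() with a key function and counting words with str.split(). This is a sketch on made-up records; the keys "title" and "year" are hypothetical, and real data may have missing values that need to be skipped first.

```python
# Hypothetical works; the keys and values are made up for illustration.
works = [
    {"title": "A Quiet Harbour", "year": 1850},
    {"title": "Untitled", "year": 1790},
    {"title": "Study of a Tree at Dusk", "year": 1920},
]

# min() with a key function returns the record with the smallest year.
oldest = min(works, key=lambda work: work["year"])

# split() with no arguments breaks a string on runs of whitespace.
word_counts = [len(work["title"].split()) for work in works]
average = sum(word_counts) / len(word_counts)

print(oldest["title"])
print(word_counts)
print(round(average, 2))
```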