Materials

Week 1

Readings:

Monday: Course introduction [slides]

Lab sections: Orientation to Jupyter notebooks [html]

Wednesday: Data science lifecycle [slides]

Week 2

Readings:

Assignments:

  • HW1, BRFSS case study, due Monday, April 24 [html]

Monday: Tidy data [slides]

Lab sections: Pandas [html]

Wednesday: Dataframe transformations [slides]

Week 3

Readings:

Assignments:

  • Mini project 1, due Monday, May 1 [html]

Monday: Sampling, bias, and missingness [slides]

Lab sections: Exploring sampling bias through simulation [html]

Wednesday: Voter fraud case study [slides] [activity html]

Week 4

Readings:

  • Wilke, Fundamentals of Data Visualization Ch. 2-5
  • LDS11.1 Choosing scale to reveal structure
  • (Recommended) Cook, D., Lee, E. K., & Majumder, M. (2016). Data visualization and statistical graphics in big data analysis. Annual Review of Statistics and Its Application, 3, 133-159. [link to paper]
  • (Recommended) Gelman, A., & Unwin, A. (2013). Infovis and statistical graphics: different goals, different looks. Journal of Computational and Graphical Statistics, 22(1), 2-28. [link to paper]
  • (Recommended) Iliinsky, N. (2010). On beauty. Beautiful visualization: Looking at data through the eyes of experts, 1-13. [link to chapter]

Assignments:

  • HW2, SEDA case study, due Monday, May 8 [html]

Monday: Statistical graphics [slides]

Lab sections: Data visualization [html]

Wednesday: Principles of figure design [slides]

Week 5

Readings:

Monday: Exploratory analysis and density estimation [slides]

Lab sections: Smoothing [html]

Wednesday: Multivariate KDE, mixture models, and scatterplot smoothing [slides] [activity html]

Week 6

Readings:

Assignments:

  • HW3, Diatom paleoclimatology case study, due Monday, May 22 [html]

Monday: Covariance, correlation, and spectral decomposition [slides]

No lab sections this week

Wednesday: Principal components [slides]

Week 7

Readings:

Assignments:

  • Mini project 2, due Tuesday, May 30 [html]

Monday: Modeling concepts; least squares [slides]

Lab sections: Principal components [html]

Wednesday: The simple linear regression model [slides]

Week 8

Readings:

Assignments:

  • HW4, Discrimination in disability benefit allocation, due Wednesday, June 7 [html]

Monday: Prediction [slides]

Lab sections: Fitting regression models [html]

Wednesday: Multiple regression [slides]

Week 9

Readings:

Assignments:

  • Course project, due Friday, June 16 [html]

No class or lab sections Monday

Wednesday: Classification [slides]

Week 10

No readings or new assignments

No class Monday

Lab sections: Logistic regression (submission is optional) [html]

Wednesday: Clustering [slides]