History of statistics

Last updated: 2024-04-05

This course may be taken as a reading course in DoSS or Information. Please get in touch if interested.

Overview

Statistics and the data sciences have a long and robust history. Understanding that history provides students with a better appreciation for the methods that they are applying today.

Students are often taught, say, linear regression in such a way that they come to believe that statisticians simply stumbled upon it one day. In fact, the idea of combining different observations in this way took decades, even centuries, to come to terms with. Understanding the history of statistics, and of the data sciences more generally, provides a more solid foundation for applying those skills today. We are interested in why certain methods were developed and became popular, and in the circumstances under which this occurred, because that gives us a nuanced knowledge of when we should apply them ourselves.

We study history because we want to understand how our predecessors solved their problems. That means understanding not just what they did, but the circumstances in which they did it and the choices they faced. That knowledge allows us to better solve our own problems. At the very least, it helps us to avoid repeating mistakes; at best, it can even allow us to improve our own approaches.

The history of statistics and the data sciences is one of greatness, and we will cover that extensively. But it is also one in which that greatness was sometimes developed for abhorrent purposes, and in which many contributors, actual or potential, were overlooked. We will cover these aspects too.

The hope is that, having taken this course, you will see what you have been studying in statistics and the data sciences with fresh eyes, and carry this deeper appreciation with you throughout the rest of your career.

Learning objectives

The purpose of the course is to develop an appreciation of the history of statistics and the data sciences that provides a firmer foundation for your conduct of applied statistics and data science. By the end of the course, you should be able to:

  1. Engage critically with ideas and readings in the history of statistics and data sciences.
  2. Conduct research in the history of data science and statistics.
  3. Write and present your research.
  4. Understand why methods and approaches developed when they did, and the circumstances under which they developed.
  5. Appreciate that much of the statistical machinery that we use today was developed in connection with eugenics.
  6. Respectfully identify strengths and weaknesses in the work of others.
  7. Reflect effectively on your own learning and professional development.

Content

Week 1 "Overview"

Early astronomical and gambling underpinnings. Least squares, combining observations, and uncertainty. Legendre, Laplace, Bernoulli, De Moivre, Simpson.

Week 2 "The 1700s"

Inverse probability. Gauss, Laplace, Central Limit Theorem.

Week 3 "Early 1800s"

Adoption by the social sciences. Quetelet, Poisson, Cournot, Lexis, binomials, and the Law of Large Numbers.

Week 4 "Late 1800s"

Adoption by eugenics. Galton, Edgeworth, and Pearson. Regression and correlation.

Week 5 "Early 1900s"

Edgeworth, Pearson, and Yule. Regression and correlation.

Week 6 "Early 1900s (cont.)"

Week 7 "Data visualization"

Week 8 "Bayesian methods"

Week 9 "Causal inference"

Week 10 "Whither statistics? The rise of data science"

Week 11 "Overlooked contributors"

Week 12 "Reckoning with the past and thinking about the future: Statistics and society"

Assessment

Tutorial papers

Final paper