7 June 2021: I am on leave through to September.
I am an assistant professor at the University of Toronto in Information and Statistical Sciences. I am also a faculty affiliate at the Schwartz Reisman Institute for Technology and Society. I hold a PhD in Economics from the Australian National University where I was supervised by John Tang (chair), Martine Mariotti, Tim Hatton, and Zach Ward.
I am interested in using statistical models to try to understand the world, and particularly in how we get the data that go into those models; whose data are systematically missing; how we clean, prepare, and tidy data before they are modelled; the effects of all this on the implications of our models; and how we can reproducibly share the totality of this process. This research interest has a few different applications. One of those is Natural Language Processing (NLP), where I am interested in understanding the effects of bringing together large, biased datasets and enormous models, and how this can be improved. Another is Multilevel Regression with Post-stratification (MRP), where I examine the effects of trying to establish a correspondence between two datasets.
My book on foundational skills in data science, tentatively titled Telling Stories With Data, was accepted for publication by CRC Press in May 2021. And I am co-editor (alongside Lauren Kennedy and Andrew Gelman) of a book about MRP tentatively titled Multilevel Regression and Poststratification: A Practical Guide and New Developments, which was accepted for publication by Cambridge University Press in July 2021.
Students in my research group not only develop skills in using statistical methods in reproducible ways across various disciplines, but also learn to appreciate the limitations of those methods and to think deeply about the broader context of their work. Some recent papers include: ‘heapsofpapers’, ‘Detecting Hate Speech with GPT-3’, and ‘On consistency scores in text data with an implementation in R’.
I enjoy teaching and aim to help students from a wide range of backgrounds learn how to use data to tell convincing stories. In the Faculty of Information, I have taught ‘Experimental Design’ and led reading courses in ‘Ethics and Data Science’, ‘Information Management in Interdisciplinary Research’, and ‘Reproducible Data Science’. In Statistical Sciences, I have taught ‘Surveys, Sampling, and Observational Data’ and led a reading course in ‘Natural Language Processing’. I am an RStudio Certified Tidyverse Trainer.
I am married to Monica Alexander, and we have two young children. I probably spend too much money on books, and certainly too much time at libraries (in a pre-COVID world). You can see some of the books that I recommend here. If you have any book recommendations of your own, then I’d love to hear them.