Reading Course: Ethics and Data Science

This is a reading course focused on the intersection of ethics and data science.

Preamble

Overview

The purpose of this reading course is to develop students who can:

Each week students will read relevant papers and books, engage with them through discussion with each other and the instructor, learn related technical skills, and bring this together through ongoing assessment. All students are expected to prepare for each week’s discussion by completing the readings and technical requirements. A specific student will act as the lead for each week.

The course outline is available here.

FAQ

Acknowledgements

Thanks to the following who helped develop this course: A Mahfouz, Assel Kushkeyeva, Irene Duah-Kessie, Ke-li Chiu, Paul Hodgetts, and Thomas Rosenthal.

Content

Week 1 - General

Ethical

Core:

Additional (pick two):

Technical

Ethical

Core:

Additional (pick two):

Technical

Week 3 - Women and gender

Ethical

Core:

Additional (pick two):

Technical

Review the essentials of Bayesian models by going through McElreath, 2020, Statistical Rethinking, 2nd Edition (at least chapters 1, 2, 4, 7, 9, 11, 12, and 13) to address any gaps in your understanding.
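As a quick self-check on those essentials, the following is a minimal sketch of grid approximation for a binomial proportion, in the spirit of the early chapters of Statistical Rethinking. The choice of Python, the function name grid_posterior, the flat prior, and the example counts (6 successes in 9 trials) are assumptions made for illustration; they are not requirements of this course.

```python
from math import comb


def grid_posterior(successes, trials, grid_size=101):
    """Grid-approximate the posterior of a binomial proportion.

    Assumes a flat prior over the proportion (an illustrative choice).
    Returns the grid of candidate proportions and their posterior weights.
    """
    # Evenly spaced candidate values for the proportion, from 0 to 1.
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Flat prior: every candidate value is equally plausible beforehand.
    prior = [1.0] * grid_size
    # Binomial likelihood of the observed counts at each candidate value.
    likelihood = [
        comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)
        for p in grid
    ]
    # Posterior is proportional to likelihood times prior; normalise to sum to 1.
    unnormalised = [lik * pr for lik, pr in zip(likelihood, prior)]
    total = sum(unnormalised)
    posterior = [u / total for u in unnormalised]
    return grid, posterior


# Hypothetical data: 6 successes in 9 trials.
grid, posterior = grid_posterior(successes=6, trials=9)
# The grid value with the highest posterior weight.
mode = max(zip(posterior, grid))[1]
print(f"Posterior mode is near p = {mode:.2f}")
```

If the role of the prior, the likelihood, and the normalising step in this sketch is unclear, that is a sign the early chapters deserve a closer read.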

Week 4 - Race

Tom Davidson, Assistant Professor, Sociology, Rutgers University: https://youtu.be/YDmxMn2Doq0.

Ethical

Core:

Additional (pick two):

Technical

Week 5 - Natural Language Processing

Ethical

Core:

Additional (pick two):

Technical

Week 6 - AI Ethics

Shion Guha, Assistant Professor, University of Toronto, will join the discussion briefly this week.

Ethical

Core:

Additional (pick two):

Technical

Use RASA (https://rasa.com/) to build a chatbot, or OpenAI’s GPT-2 or GPT-3 to generate text.

Week 7 - Privacy

Jonathan A. Obar, Assistant Professor, Department of Communication Studies, York University, will be invited to join the discussion briefly this week.

Ethical

Core:

Additional (pick two):

Technical

Week 8 - Images/video with particular reference to facial recognition

Jeffrey Knockel, Research Associate, Citizen Lab, University of Toronto, will be invited to join the discussion briefly this week.

Ethical

Core:

Additional (pick two):

Technical

Week 9 - Corporate Surveillance

Ethical

Core:

Additional (pick two):

Technical

Week 10 - Privacy and surveillance in Canada and other countries

Lisa Austin, Professor, Law, University of Toronto, will be invited to join the discussion briefly this week.

Ethical

Core:

Additional (pick two):

Technical

Week 11 - Algorithmic decision-making

Jamie Duncan, Junior Policy Analyst, Artificial Intelligence Hub, Innovation, Science and Economic Development Canada, will be invited to join the discussion briefly this week.

Ethical

Core:

Additional (pick two):

Technical

Week 12 - History of ethical concerns broadly, and domain-specific ethical practices

Ethical (Please pick two areas.)

Medicine:

Engineering:

Statistics:

Law:

Finance:

Education:

General non-computational:

Technical

Assessment

Four ethical and technical blog posts (30 per cent)

Over the course of the term, you are expected to submit four blog posts that each comprise two aspects: 1) ethical and 2) technical. These two aspects should be related to each other. You must submit all four, but only your best three blog posts will count. That is, each of those three blog posts will account for 10 per cent of your overall mark.

For the first aspect (ethical), you are expected to write a moderate-length discussion (think a paper of about two to three pages) of a reading, or set of readings, that we have covered over the past two weeks. Strong submissions will not limit themselves to reviewing a reading but will draw in larger issues and detail their own point of view.

For the second aspect (technical), you are expected to implement some small related technical aspect of what we have covered in the past two weeks. For instance, if we covered natural language processing, then you may critically review a paper and put together a chatbot.

To be clear, these two aspects should be related, tied together, and presented in a single blog post.

You should submit your blog post by emailing me a link to the relevant blog post on your website.

The proposed specific list of deadlines is:

In Week 1 we will discuss how these dates fit in with your other commitments and finalise them at that point.

The instructor will make the marking guide available at least a week before the submission deadline.
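The best-three-of-four rule described above can be sketched as a short calculation. This is an illustration only: the function name blog_component, the assumption that each blog post is marked out of 100, and the example marks are all hypothetical, and the marking guide remains the authority on how marks are assigned.

```python
def blog_component(marks):
    """Return the blog-post contribution to the overall mark, out of 30.

    Assumes each of the four blog posts is marked out of 100
    (an assumption for illustration; see the marking guide).
    """
    if len(marks) != 4:
        raise ValueError("All four blog posts must be submitted.")
    # Only the best three marks count toward the overall mark.
    best_three = sorted(marks, reverse=True)[:3]
    # Each counted post is worth 10 per cent of the overall mark.
    return sum(best_three) / 10


# Hypothetical marks for four blog posts; the lowest (60) is dropped.
print(blog_component([80, 70, 90, 60]))
```

With these hypothetical marks, the three counted posts are 90, 80, and 70, so the lowest mark has no effect on the result.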

Paper 1 (30 per cent)

Task

Please gather and clean data on UofT salaries from the Sunshine List. Then conduct a Bayesian statistical analysis of your dataset to discuss the extent to which gender has an effect on salary. Finally, please prepare a paper of around 10 pages that discusses your analysis. (Hint: gender is not explicitly part of the Sunshine List, so you will need to grapple with how to handle this.)

Background

You should make appropriate use of appendices for additional and supporting material, and thoroughly reference your paper, but neither the appendices nor the reference list count toward your page limit. Your paper should have an appropriate title, author, date, abstract, and introduction. It should document and provide an overview of your dataset. It should clearly specify your model, and then discuss the results of your analysis and any weaknesses. Your analysis should be fully reproducible, with code and data hosted on a public GitHub repo.

Additionally, you should include a thorough discussion of ethical considerations relevant to your analysis. This would likely take at least three pages, but you are welcome to write as much as is needed to make the points you would like to make. Likely the best way to do this is to include a brief overview of the ethical points that you would like to discuss in the body of the paper, and then include the rest of the discussion in an appendix.

I understand that Bayesian analysis may be new to you. I will assist you with putting together the model, but it is up to you to understand and interpret the output.

Submission

To submit your paper you should email me a link to a public GitHub repo. That repo should contain your paper in PDF format and all supporting code and data. Please send this email by midnight, Sunday, 14 February, 2021. Please do not make any changes to the repo after this. I will make the marking guide available at least a week before the submission deadline.

Paper 2 (40 per cent)

Task

In consultation with me, please identify an appropriate research question and data source that, like the requirement for Paper 1, combines both ethical and technical aspects. Please prepare a paper that represents your best attempt to answer this question and shows off your ability to engage in thoughtful, ethical critique. The paper should be as long as necessary, although all extraneous material should be included in appendices. The expectation is that this paper should make an original contribution that could be published in an academic journal.

Background

Please see the background provided for Paper 1, as this applies to Paper 2 as well.

Submission

You must send the email with the GitHub link to me by midnight, Sunday, 23 April, 2020. Please do not make any changes to the repo after this. I will make the marking guide available at least a week before the submission deadline. No extensions are possible because of deadlines for instructors to submit grades.