Online Courses | August 19, 2020

University of San Francisco Data Institute: Applied Data Ethics

Free online course from the USF Data Institute, taught by Rachel Thomas, PhD. This course contains essential knowledge for anyone working in data science or impacted by technology. The course focuses on topics that are both urgent and practical, and that are causing real harm right now. In keeping with its teaching philosophy, it begins with two active, real-world areas (disinformation and bias) to provide context and motivation, before stepping back in Lesson 3 to dig into the foundations of data ethics and practical tools. From there it moves on to additional subject areas: privacy & surveillance, the role of the Silicon Valley ecosystem (including metrics, venture growth, & hypergrowth), and algorithmic colonialism.

There are no prerequisites for the course. It is not intended to be exhaustive, but hopefully will provide useful context about how data misuse is impacting society, as well as practice in critical thinking skills and questions to ask. This class was originally taught in person at the University of San Francisco Data Institute in January-February 2020, as an evening certificate course for a diverse mix of working professionals from a range of backgrounds.


Data ethics covers an incredibly broad range of topics, many of which are urgent, making headlines daily, and causing harm to real people right now. A meta-analysis of over 100 syllabi on tech ethics, titled “What do we teach when we teach tech ethics?”, found huge variation in which topics are covered across tech ethics courses (law & policy, privacy & surveillance, philosophy, justice & human rights, environmental impact, civic responsibility, robots, disinformation, work & labor, design, cybersecurity, research ethics, and more, far more than any one course could cover). These courses were taught by professors from a variety of fields. The area where there was more unity was in outcomes, with the abilities to critique, spot issues, and make arguments being some of the most common desired outcomes for tech ethics courses.

There is a ton of great research and writing on the topics covered in the course, and it was very tough for me to cut the reading list down to a “reasonable” length. There are many more fantastic articles, papers, essays, and books on these topics that are not included here.

Lesson 1: Disinformation

From deepfakes being used to harass women, to widespread misinformation about coronavirus (labeled an “infodemic” by the WHO), to fears about the role disinformation could play in the 2020 election, to news of extensive foreign influence operations, disinformation is in the news frequently and is an urgent issue. It is also indicative of the complexity and interdisciplinary nature of so many data ethics issues: disinformation involves tech design choices, bad actors, human psychology, misaligned financial incentives, and more.

Runtime 86 min.

Lesson 2: Bias & Fairness

Unjust bias is an increasingly discussed issue in machine learning and has even spawned its own field, Fairness, Accountability, and Transparency (FAccT), of which it is the primary focus. We will go beyond a surface-level discussion and cover questions of how fairness is defined, different types of bias, steps towards mitigating it, and complicating factors.

Runtime 78 min.

Lesson 3: Ethical Foundations & Practical Tools

Now that we’ve seen a number of concrete, real-world examples of ethical issues that arise with data, we will step back and learn about some ethical philosophies and lenses through which to evaluate ethics, and consider how ethical questions are chosen. We will also cover the Markkula Center’s Tech Ethics Toolkit, a set of concrete practices to be implemented in the workplace.

Runtime 50 min.

Lesson 4: Privacy and Surveillance

Huge amounts of data are being collected about us: apps on our phones track our location, dating sites sell intimate details, facial recognition in schools records students, and police use large, unregulated databases of faces. Here, we discuss real-world examples of how our data is collected, sold, and used. There are also concerning patterns of how surveillance is used to suppress dissent and to further harm those who are already marginalized.

Runtime 68 min.

Lesson 5: How did we get here? Our Ecosystem

News stories understandably often focus on one instance of a particular ethics issue at a particular company. Here, I want us to step back and consider some of the broader trends and factors that have resulted in the types of issues we are seeing. These include our over-emphasis on metrics, the inherent design of many of the platforms, venture capital’s focus on hypergrowth, and more.

Runtime 20 min.

Lesson 6: Algorithmic Colonialism, and Next Steps

When corporations from one country develop and deploy technology in many other countries, extracting data and profits, often with little awareness of local cultural issues, a number of ethical issues can arise. Here we will explore algorithmic colonialism. We will also consider next steps for how students can continue to engage around data ethics and take what they’ve learned back to their workplaces.

Runtime 57 min.
