Reports | July 1, 2019

Gender Diversity in AI Research

Report prepared by Nesta. Written by Kostas Stathoulopoulos and Juan Mateos-Garcia. 32 pages.

Summary:

Lack of gender diversity in the Artificial Intelligence workforce is raising growing concerns, but the evidence about this problem has until now rested on statistics about the workforce of large technology companies or submissions to a small number of prestigious conferences.

We build on this literature with a large-scale analysis of gender diversity in AI research using publications from arXiv, a widely used preprint repository. We identify AI papers through an expanded keyword analysis and predict author gender using a name-to-gender inference service. We study the evolution of gender diversity across disciplines, countries and institutions, finding that while the share of female co-authors in AI papers is increasing, it has stagnated in disciplines related to computer science. We also find that geography plays an important role in determining the share of female authors in AI papers and that there is a severe gender gap in the top research institutions. We next study the link between female authorship of a paper and the citations it receives, finding a strong, positive correlation in research domains related to the impact of information technology on society. We then examine the semantic differences between AI papers with and without female co-authors. Our results suggest that there are significant differences in machine learning and computer ethics between the United States and the United Kingdom, as well as differences in the research focus of papers with female co-authors. We conclude by reporting the results of interviews with female AI researchers and other important stakeholders, aimed at interpreting our findings and identifying policies to improve diversity and inclusion in the AI research workforce.
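The summary refers to two related but distinct measures: the share of authors who are women, and the share of papers with at least one female co-author. The following is a minimal sketch of how the two could be computed per year, not the report's actual pipeline. The record layout, the `"F"`/`"M"`/`None` labels and the function name `diversity_by_year` are all assumptions made for illustration, and the sketch counts authorships rather than unique authors.

```python
from collections import defaultdict

# Hypothetical toy records: one dict per paper, with a publication year and
# the inferred gender of each author ("F", "M", or None when inference failed).
papers = [
    {"year": 2017, "genders": ["M", "M", "F"]},
    {"year": 2017, "genders": ["M", None]},
    {"year": 2018, "genders": ["F", "F"]},
    {"year": 2018, "genders": ["M", "M", "M", "M"]},
]


def diversity_by_year(records):
    """Return {year: (author_share, paper_share)}.

    author_share: fraction of gender-labelled authorships that are female.
    paper_share: fraction of papers with at least one female co-author.
    """
    counts = defaultdict(
        lambda: {"female": 0, "labelled": 0, "papers": 0, "with_female": 0}
    )
    for rec in records:
        c = counts[rec["year"]]
        # Authors whose gender could not be inferred are left out of the
        # author-level share (an assumption of this sketch).
        labelled = [g for g in rec["genders"] if g is not None]
        c["female"] += labelled.count("F")
        c["labelled"] += len(labelled)
        c["papers"] += 1
        c["with_female"] += int("F" in labelled)

    result = {}
    for year, c in sorted(counts.items()):
        author_share = c["female"] / c["labelled"] if c["labelled"] else float("nan")
        result[year] = (author_share, c["with_female"] / c["papers"])
    return result


shares = diversity_by_year(papers)
```

The two measures can diverge: a paper counts towards the paper-level share as soon as it has one female co-author, however many male co-authors it has, so the paper-level share is typically higher than the author-level one.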

Table of Contents

  • Summary
  • Introduction
  • Data collection and pre-processing
    • arXiv
    • Geocoding affiliations
    • Gender classification
    • AI labelling
    • Discipline clustering
  • Analysis
    • Descriptive analysis
    • Drivers of gender diversity
    • Effects of gender diversity
  • Interview results
  • Discussion
  • References and endnotes

Lack of gender diversity in the artificial intelligence (AI) workforce is raising growing concerns. Our analysis, based on a larger and more comprehensive corpus than those previously used to study this issue, shows that there is a gender diversity gap in AI research.

Key findings

  • There is a serious gender diversity crisis in AI research.
    Only 13.83 per cent of authors are women and, in relative terms, the proportion of AI papers co-authored by at least one woman has not improved since the 1990s.
  • Location and research domain are significant drivers of gender diversity.
    Women in the Netherlands, Norway and Denmark are more likely to publish AI papers, while those in Japan and Singapore are less likely to do so. The UK is 22nd on this list, with 26.62 per cent of AI papers having at least one female co-author. Women working in physics, education, computer ethics and other societal issues, and biology are more likely to publish work on AI than those working in computer science or mathematics.
  • There is a significant gender diversity gap in universities, big tech companies and other research institutions.
    Apart from the University of Washington, women make up less than 25 per cent of AI researchers at every academic institution and organisation in our dataset. In big tech, only 11.3 per cent of Google’s employees who have published AI research on arXiv are women. The proportion is similar for Microsoft (11.95 per cent) and slightly better for IBM (15.66 per cent).
  • There are important semantic differences between AI papers with and without a female co-author.
    When examining publications on machine learning and societal topics in the United Kingdom in 2012 and 2015, those involving at least one female co-author tend to be more semantically similar to each other than those without any female authors. Papers with at least one female co-author also tend to be more applied and socially aware, with terms such as fairness, human mobility, mental, health, gender and personality being among the most salient ones.
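The last finding compares how semantically similar papers are to each other within each group. One simple way to express that idea is the mean pairwise cosine similarity of the texts in a group. The sketch below uses plain word counts as a stand-in for whatever text representation the report itself uses, which is not specified here. The function names and the example abstracts are hypothetical and invented purely for illustration, so the resulting numbers say nothing about real papers.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def mean_pairwise_similarity(texts):
    """Average cosine similarity over all unordered pairs of texts."""
    # Bag-of-words vectors: an assumption of this sketch, not the report's method.
    vectors = [Counter(t.lower().split()) for t in texts]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two texts")
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Hypothetical abstracts, invented for illustration only.
with_female = [
    "fairness in health data",
    "fairness and gender in health",
    "mental health and fairness",
]
without_female = [
    "deep network optimisation",
    "graph kernels",
    "robot control policies",
]

sim_with = mean_pairwise_similarity(with_female)
sim_without = mean_pairwise_similarity(without_female)
```

A higher mean for one group indicates that its papers share more vocabulary with each other. A real analysis would need a richer text representation and a check that the difference between groups is not due to chance.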

Our blog, How diverse is the workforce of AI research, looks in more detail at interviews we carried out with experts in the field. We discuss how our findings resonated with their experience in the sector and explore the cultural and institutional factors that determine gender diversity in AI research.

Additional information at https://www.nesta.org.uk/report/gender-diversity-ai/