There’s a saying: numbers don’t lie. But while numbers don’t lie, they often reflect the assumptions, beliefs, and biases of the people who do the math. Before we can count, we first have to decide what should be counted: we count what we think matters. So who decides what matters? Who counts, and who decides how and what to count? In this talk, we’ll consider how we come to judgments about what counts, how we decide to count, and who gets to decide what counts. We’ll talk about how we use counting to decide what matters, and how we might reconsider what matters when we decide how to count.
Emily Barnes Franklin
Harnessing Data Science for Improved Chemical Characterization of the Atmosphere
Organic compounds in the atmosphere play an important role in both climate change and public health. Direct emissions transform into secondary products during atmospheric oxidation, with each generation of products more complex than the ones that preceded it. The number of organic compounds in the atmosphere is estimated to be in the millions, and the vast majority of these compounds have never been synthesized in a lab and cannot be definitively identified. Analytical instrumentation has rapidly evolved to produce increasingly large, high-resolution datasets describing this complexity. The scale and complexity of these datasets have created significant challenges for researchers and have led to underutilization of the full scope of information made available by advances in instrumentation. Increased application of data science in observational atmospheric chemistry, through collaborations between data scientists and chemists, represents a significant opportunity for improving our understanding of the chemistry of the atmosphere, a critical step towards combating air pollution and climate change.
Data Accountability: The Impact of Privacy, Data Protection, and Quality of Data for Artificial Intelligence
AI products such as GPT-3 (based on text) and DALL-E (based on images) are currently the focus of intense scrutiny and hype. But none of the current crop of AI products would exist without access to massive amounts of data. This talk will present a brief overview of an emergent area I call “data accountability” that critically examines the sources of data for data-centric AI systems. While privacy and data protection regulations will directly impact what data can be used by these systems, increasingly pressing questions are being raised about the quality of the data that feeds them: sourcing, labeling, consent, and the lifecycle of data, including deletion. Data scientists will need to play a central role in addressing these issues as AI continues to be developed.