What Can We Do About Biases Baked Into Data?
We believe that local data can help uncover inequities and inform decisions that support healthier communities. But what happens when the data we rely on fail to capture the social reality we imagine they do? Or when the data are flawed, incomplete—or worse, riddled with bias?
While data are critical in guiding policy and allocation decisions, it’s important to understand what data are and what data are not. Data are too often seen as objective, neutral, and accurate representations of reality. But the data points guiding our decisions are produced through human decision-making—and the bias and error that inherently come with those decisions.
For example, while patient ratings and reviews of physicians can provide important insights, they are not an exact science and are subject to human bias in what to report and what to leave out. In fact, studies have found that patient reviews tend to be biased against physicians of color. This awareness of limitations and biases should inform decisions about how we use patient ratings as data. If we are not clear-eyed about these limitations and the conclusions we can draw from these kinds of data, there will be consequences for healthcare organizations’ government reimbursement rates, for decisions about salary and raises for individual physicians, for the diversity of the medical workforce, and, ultimately, for health equity.
Another example: when algorithmic decision-making is used in clinical medicine, a patient’s race is often included in the set of diagnostic predictors that determine treatment recommendations. Recent studies have shown, however, that such algorithms can require Black patients to be sicker than White patients before treatment is recommended.
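The dynamic those studies describe can be made concrete with a toy sketch. Everything below is hypothetical—the coefficients, the threshold, and the score itself are invented for illustration and are not drawn from any real clinical algorithm—but it shows how a race term in a risk score can shift the severity a patient must reach before treatment is recommended.

```python
# Toy illustration (hypothetical coefficients and threshold): a race term
# in a risk score can raise the severity a Black patient must reach
# before the algorithm recommends treatment.

def risk_score(severity: float, is_black: bool) -> float:
    """Hypothetical risk score; higher scores trigger a treatment recommendation."""
    race_adjustment = -1.0 if is_black else 0.0  # invented downward adjustment
    return 0.5 * severity + race_adjustment

TREATMENT_THRESHOLD = 4.0  # hypothetical cutoff

def treatment_recommended(severity: float, is_black: bool) -> bool:
    return risk_score(severity, is_black) >= TREATMENT_THRESHOLD

# Two patients with identical measured severity get different recommendations:
severity = 8.5
print(treatment_recommended(severity, is_black=False))  # True
print(treatment_recommended(severity, is_black=True))   # False

# Under this toy model, a White patient crosses the threshold at severity 8.0,
# while a Black patient must reach severity 10.0 — "sicker before treatment."
```

The point of the sketch is not the particular numbers but the structure: once race enters the predictor set, two patients with the same symptoms can land on opposite sides of the treatment cutoff.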
As an explosion of new data and analytic methods is fundamentally transforming our social practices and the decisions we make as individuals, groups, and organizations, we have yet to fully come to terms with the ways data have come to shape our society and the subsequent impact on health equity—as brought to light in a new report* from the University of Chicago Crown Family School of Social Work, Policy and Practice, developed with support from the Robert Wood Johnson Foundation.
Ensuring Data Are Used in a Way That Supports Our Values
If our society, and particularly our decision-makers, view data and analytics as objective, we miss the chance to understand the social and political choices, costs, and benefits of using data. That doesn’t mean we give up on data or give in to mistrust of them. It does mean that we must be critical about the data we choose to use, be mindful of their limitations, and be intentional in how we make meaning from data in a way that is true to our values and serves our goals for improving society.
The good news is that there are ways to account for bias, power imbalances, and gaps in data, as well as potential privacy issues. Doing so can help us make better decisions for health and equity. Some solutions for people developing and analyzing data, as well as policymakers and organizational leaders making decisions based on data, include the following:
Balance Use of Data with Individual Freedom, Equity, and Privacy: Be aware of, and set up mechanisms to address, the ways new data and analytic methods used by corporations, governments, and other organizations reset the boundary between these actors’ efforts to shape the choices and opportunities we face and individuals’ desires for equity, freedom, and privacy. We refer to the explosion of data across domains of society as “datafication”—the rendering of nearly all transactions, images, and activities into digital representations that can be stored, manipulated, and analyzed through computational processes. The rapid pace of datafication virtually ensures that regulation will lag behind practice and innovation. This sharpens the need for robust engagement with ethics, particularly around privacy, transparency of algorithmic decision-making to ensure accountability, and fairness to ensure data-driven decision-making isn’t systematically placing certain groups at a disadvantage.
The most far-reaching privacy effort to date, the General Data Protection Regulation (GDPR) of the European Union (EU), was passed in 2018 to restrict the data collected regarding EU citizens. The GDPR affirmed EU citizens’ right to digital privacy and legally requires that data only be collected for certain purposes and as minimally as feasible for those purposes. It represents the first major step by a public governing body to regulate a technology that is developing faster than relevant law and regulatory systems.
Understand Human Values and Choices Embedded in Data: Be aware of the ways that human values and choices are driving the emergence and use of data methods and data analysis. While data may seem neutral, objective, and scientific, be vigilant for ways that human decisions and biases—especially racism—can creep in.
For example, sharing and integrating data across organizations and sectors can help local leaders better understand community needs, improve services, and build stronger communities. Yet, too often in practice, when data have been shared and aggregated in this way, they have reinforced legacies of racist policies and inequitable outcomes. This raises fundamental concerns, as administrative data increasingly are used as input to inform policy, resource allocation, and programmatic decisions. To counter these pernicious effects, the Actionable Intelligence for Social Policy (AISP) program at the University of Pennsylvania created A Toolkit for Centering Racial Equity Throughout Data Integration to help users bring data together across sectors and systems in a new way. AISP aims “to create a new kind of data infrastructure—one that dismantles ‘feedback loops of injustice’ and instead shares power and knowledge with those who need systems change the most.”
Contextualize Data: Data and analytics can shape what human beings see as important, self-evident, or true. Provide context for data so they are used as a tool for decision-making rather than portrayed as the truth.
Some data efforts are flipping notions of who should define, collect, and make meaning from data to bring more equity to the ways policymakers and organizational leaders make decisions using those data. Community Noise Lab, located at the Brown University School of Public Health, works directly with community members to assess environmental exposures—noise, air, and water pollution—and their implications for environmental justice. The lab has studied the relationship between community noise and health by supporting communities’ specific noise issues through real-time monitoring, with an app that lets residents track instances of noise pollution. Its work evaluates not only how sound affects community health but how sound is measured, regulated, and reported—challenging traditional norms around who gets to create data and make meaning from those data. The project examines the potentially far-reaching exposure misclassification and equity issues in traditional environmental health studies in order to better understand and address inequities in a community-centered way. Recent efforts have broadened to the quality of drinking water and other infrastructure challenges, based on resident priorities, further challenging notions of who gets to decide what questions get answered with data.
Data-Driven Decision-Making Done Right
In an age of "data-driven decision-making," it's more important than ever to question the idea that data are inherently objective and unbiased. This report helps unpack how researchers, residents, and policymakers can make meaning from data in a way that is true to our values and serves our goals for improving society. Check out the rest of the featured solutions in the report for ideas on how to be more intentional about taking bias, power imbalances, gaps in data, and privacy issues into account when working with data to make better data-informed decisions for health and equity.
*The report is authored by Nicole Marwell of the University of Chicago Crown School and Cameron Day, a PhD student in the University of Chicago Department of Sociology, who explain the urgency of this issue: “If we continue viewing data and analytics as value-free and objective, we miss the chance to understand the ways in which it carries social and political choices, costs, and benefits.”
Read the new report, which examines the human decisions that drive the creation and analysis of data and offers ideas for how to use data to make better decisions anchored in equity.
About the Authors
Nicole P. Marwell’s research examines urban governance, with a focus on the diverse intersections between nonprofit organizations, government bureaucracies, and politics.
George Hobor, senior program officer, is committed to building the capacity of the nonprofit and public sectors to use data and research in their program and policy development, and to advancing a broader conception of health that extends beyond the healthcare system.