In the past decade, data-driven technologies have transformed the world around us. We’ve seen what’s possible by gathering large amounts of data and training artificial intelligence to interpret it: computers that learn to translate languages, facial recognition systems that unlock our smartphones, algorithms that identify cancers in patients. The possibilities are endless.
But these new tools have also led to serious problems. What machines learn depends on many things, including the data used to train them.
Data sets that fail to represent American society can result in virtual assistants that don’t understand Southern accents; facial recognition technology that leads to wrongful, discriminatory arrests; and health care algorithms that discount the severity of kidney disease in African Americans, preventing people from getting kidney transplants.