A research team supported by the National Institutes of Health has identified characteristics of people with long COVID and those likely to have it. Scientists, using machine learning techniques, analyzed an unprecedented collection of electronic health records (EHRs) available for COVID-19 research to better identify who has long COVID. Exploring de-identified EHR data in the National COVID Cohort Collaborative (N3C), a national, centralized public database led by NIH’s National Center for Advancing Translational Sciences (NCATS), the team used the data to find more than 100,000 likely long COVID cases as of October 2021 (as of May 2022, the count is more than 200,000). The findings appear in The Lancet Digital Health.
Long COVID is marked by wide-ranging symptoms, including shortness of breath, fatigue, fever, headaches, “brain fog” and other neurological problems. Such symptoms can last for many months or longer after an initial COVID-19 diagnosis. One reason long COVID is difficult to identify is that many of its symptoms are similar to those of other diseases and conditions. A better characterization of long COVID could lead to improved diagnoses and new therapeutic approaches.
“It made sense to take advantage of modern data analysis tools and a unique big data resource like N3C, where many features of long COVID can be represented,” said co-author Emily Pfaff, Ph.D., a clinical informaticist at the University of North Carolina at Chapel Hill.