TraCS' Emily Pfaff works with NIH to co-lead long-COVID domain team

Informatics Community Using EHR Data to Answer COVID-19 Questions

Emily Pfaff, MSIS, PhD

The National COVID Cohort Collaborative (N3C) continues to grow as a robust EHR data resource, now with over 3 million persons represented and 4 billion rows of data from more than 42 sites. Currently, 24 multidisciplinary Domain Teams composed of clinical and subject matter experts, statisticians, informaticists, and machine learning specialists are addressing the most pressing clinical questions.

N3C data can be used to: understand COVID-19's impact on health, collect pilot data for grant submissions, train algorithms on larger datasets, inform clinical trial design, learn how to use tools for large-scale COVID-19 data, and validate results.

The Long-COVID Clinical Domain Team aims to define and characterize patients with long-term sequelae of SARS-CoV-2 infection. These patients continue to experience several symptoms for an extended period after recovering from the initial effects of COVID-19. A Long-COVID phenotype will support prognostic characterization of different substrata, potentially enable more precise care management, and greatly inform prospective interventional studies. The NIH has also just launched a new initiative to study Long-COVID and help answer underlying questions surrounding this phenomenon.

Longitudinal, multimodal research is necessary for precision medicine. A use case for Long-COVID asks such questions as:

  • Who has Long-COVID?
  • How effective are existing treatments?
  • How do viral/host variants correlate with outcomes?
  • How can we best design a Long-COVID trial?
  • How can we deploy, evaluate, and refine care guidelines quickly and effectively over time?

Defining who has Long-COVID poses a challenge for several reasons. The presence of the COVID-19 ICD-10 code alone is not sufficient, because we have yet to create a valid and reliable phenotype. Patients who have the ICD-10 code will be extremely heterogeneous and therefore a poor cohort for prospective studies. To define Long-COVID and create sub-classifications, we need multimodal, longitudinal classification of patients (EHR data, imaging data, self-reported data, viral and host genomic data, etc.).

The Long-COVID Domain Team is co-led by: Melissa Haendel, PhD, FACMI, Oregon Health & Science University; Emily Pfaff, MSIS, PhD, University of North Carolina at Chapel Hill; Joel Saltz, MD, PhD, Stony Brook University; Christopher Chute, MD, DrPH, Johns Hopkins University; and Tell Bennett, MD, University of Colorado Anschutz Medical Campus.

Originally posted at
