
Frequently asked questions about the Carolina Data Warehouse for Health including: What systems feed into the CDW-H? What data are in the CDW-H? What are the characteristics of the population of patients in the UNC CDW? How much history is available in the CDW-H? What institutions are represented in the CDW-H? and more...

CDW-H Frequently Asked Questions

What systems feed into the CDW-H?
What data are in the CDW-H?
What are the characteristics of the population of patients in the UNC CDW?
How much history is available in the CDW-H?
What institutions are represented in the CDW-H?
How is the CDW-H managed and who provides oversight?
What are the steps involved in requesting data from the CDW-H?
After I submit a request, how long will it take to get my data?
Will my research datasets contain any "restricted" patient records?
What is the cost associated with CDW-H requests?
Can students and residents access CDW-H data for research?
How do I access my data?
What are the requirements and policies for sharing clinical data outside of UNC?
What is the process for linking clinical data and claims data?
Can I use the CDW-H to identify healthy cohorts for recruitment?
How do I explain to a subject how his/her name was obtained?
Where can I find the required recruitment templates for use with CDW-identified patients?
How do I indicate Recruitment via the CDW-H in my IRB application?
Will incarcerated patients show up in my dataset?
Can CDW analysts assist with QI projects?

What systems feed into the CDW-H?

The CDW-H gets a nightly data feed from Epic, though it also contains legacy data from WebCIS (legacy EMR), Siemens Invision, and GE Centricity (legacy financial systems).

What data are in the CDW-H?

The CDW-H broadly contains data in the following domains: patient demographics, encounters, diagnoses, procedures, medications, labs, full-text notes, and financials. As a general rule, if data are collected in Epic (or were collected in WebCIS), we can work with you to pull it from the CDW-H.

What are the characteristics of the population of patients in the UNC CDW?

To get an idea of the demographics of our patient population and common diagnoses, please review the UNC PCORnet DataMart Table 1 (pdf).

How much history is available in the CDW-H?

The CDW-H contains WebCIS and Siemens data (clinical and hospital billing data) from mid-2004 to the April 2014 Epic Go Live. Legacy professional billing data from GE Centricity are available from mid-2008 to April 2014. Epic data are available from 4/4/2014 to the present.
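
The coverage windows above can be summarized as a small lookup. This is an illustrative Python sketch only, not CDW-H code: the names `COVERAGE` and `sources_covering` are hypothetical, "mid-2004" and "mid-2008" are approximated as July 1 of those years, and legacy coverage is assumed to end the day before the 4/4/2014 Epic Go Live.

```python
from datetime import date

EPIC_GO_LIVE = date(2014, 4, 4)

# (source, first day covered, first day NOT covered; None means ongoing).
# The July 1 start dates are an assumption standing in for "mid-year".
COVERAGE = [
    ("WebCIS/Siemens", date(2004, 7, 1), EPIC_GO_LIVE),
    ("GE Centricity", date(2008, 7, 1), EPIC_GO_LIVE),
    ("Epic", EPIC_GO_LIVE, None),
]


def sources_covering(day):
    """Return the source systems whose coverage window includes `day`."""
    hits = []
    for name, start, end in COVERAGE:
        if day >= start and (end is None or day < end):
            hits.append(name)
    return hits
```

For example, `sources_covering(date(2010, 1, 1))` returns both legacy sources, while any date on or after the Go Live returns only Epic.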

What institutions are represented in the CDW-H?

Legacy CDW-H data (prior to the 4/4/2014 Epic Go Live) cover UNC Medical Center patients and encounters, as well as patients and encounters in any affiliated practices that used WebCIS as their EMR. For post-Epic data, the CDW-H includes data from all UNC-affiliated hospitals and practices that are live on Epic. As of late 2014, this includes the UNC Physicians Network, Rex Hospital, and Chatham Hospital, and the scope will continue to grow as more of our affiliates Go Live.

How is the CDW-H managed and who provides oversight?

The CDW-H is managed by a Governance Board appointed by the CEO of the UNC Health Care System. The Governance Board includes the CDW-H Oversight and CDW-H Operations Committees. Project requests are subject to approval by the governance committees because of the sensitive nature of the protected health information (PHI) in the CDW-H.

What are the steps involved in requesting data from the CDW-H?

Data requests to the CDW-H may range from straightforward (such as a one-time request for a recruitment cohort) to complex (such as natural language processing, or a custom patient registry). All requests, however, require approval. The following table summarizes the steps to request data from the CDW-H based on the level of data access required.

Level | Description | Scope of Data | Process
1 | De-identified aggregated data | An aggregated count of patients (e.g., 565 patients have this diagnosis), optionally broken out into smaller groups (e.g., 140 female patients and 347 male patients had this procedure). | Process 1
2 | De-identified data set | A dataset at the patient level, with all HIPAA identifiers removed or obscured. | Process 2
3 | HIPAA limited data set | A dataset at the patient level, with all HIPAA identifiers removed or obscured other than dates of service/birth/death; city, state, and/or zip code; and patient ages. | Process 2
4 | Identified data set | A dataset at the patient level that includes any HIPAA identifiers beyond the limited dataset fields. | Process 2

Process 1

Level 1 does not require a formal project request or IRB approval. Request a consult with an analyst to get started.

Process 2

Levels 2 through 4 require the following steps to be completed:

  1. The study must be covered by an approved IRB protocol. (Specifically, make sure that your answer to IRB section C.1 reflects that you will be using medical record data as part of your study.)
  2. Submit a CDW-H Project Request form by the last Thursday of the month for consideration at the next committee meeting (held on the first Thursday of each month). It is usually helpful to request a consult with an analyst before submitting a project request, to get help with feasibility, scope, and time and cost estimates.
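
The mapping from data access level to required steps can be sketched as a simple lookup. This is a hypothetical Python illustration of the table and processes above, not a tool offered by the CDW-H; the names `LEVELS`, `PROCESS_STEPS`, and `required_steps` are invented for the example.

```python
# Level -> (description, process number), as listed in the table above.
LEVELS = {
    1: ("De-identified aggregated data", 1),
    2: ("De-identified data set", 2),
    3: ("HIPAA limited data set", 2),
    4: ("Identified data set", 2),
}

# Process number -> steps, paraphrased from Process 1 and Process 2.
PROCESS_STEPS = {
    1: ["Request a consult with an analyst"],
    2: [
        "Ensure the study is covered by an approved IRB protocol",
        "Submit a CDW-H Project Request form",
    ],
}


def required_steps(level):
    """Return the list of steps for a given data access level (1-4)."""
    if level not in LEVELS:
        raise ValueError(f"Unknown data access level: {level!r}")
    _description, process = LEVELS[level]
    return PROCESS_STEPS[process]
```

Calling `required_steps(3)` returns the two Process 2 steps, since a HIPAA limited data set still requires IRB coverage and a project request.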

After I submit a request, how long will it take to get my data?

How long a given request takes depends on our current backlog, as well as the specifics of your request. Requests can be completed in days, weeks, or months, depending heavily on the size and scope of the project. We will always work with you to try to meet your needs as best we can. Regardless, when requests are received, they are assigned points according to the scheme below and worked in order from highest total point value to lowest total point value:

For Prep-to-Research Requests (aggregate counts only)

Score Reason
+1 Request will take < 4 hours of analyst time to complete.
+1 Request is in support of a grant application.
-1 Request could reasonably be completed in i2b2 by the requester.
varies Request has a hard deadline that necessitates expedience. (Modifier varies based on deadline.)
+1 If no hard deadline: request has been waiting in the queue for 4 weeks. (+2 for 8 weeks, +3 for 12 weeks, etc.)

For Data Requests

Score Reason
+1 Request will take < 4 hours of analyst time to complete.
+2 Project has set aside appropriate funding for informatics work at our recharge rate of $125/hour.
+1 Requester provides analyst with a usable i2b2 query to identify their cohort.
varies Request has a hard deadline that necessitates expedience. (Modifier varies based on deadline.)
+1 If no hard deadline: request has been waiting in the queue for 4 weeks. (+2 for 8 weeks, +3 for 12 weeks, etc.)
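
To make the point scheme concrete, here is a minimal Python sketch of how the two tables could be applied. It is an illustration under stated assumptions, not the analysts' actual tooling: the `Request` fields and function names are hypothetical, the deadline modifier is taken as an input because the tables only say it "varies", and requests with equal totals keep their original order because no tie-breaking rule is given.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    name: str
    kind: str                      # "prep" (Prep-to-Research) or "data"
    analyst_hours: float           # estimated analyst time
    weeks_waiting: int = 0         # whole weeks spent in the queue
    deadline_modifier: Optional[int] = None  # None means no hard deadline
    supports_grant: bool = False   # prep requests only
    doable_in_i2b2: bool = False   # prep requests only
    funded: bool = False           # data requests only
    has_i2b2_query: bool = False   # data requests only


def score(req):
    """Total the points for one request using the tables above."""
    if req.kind not in ("prep", "data"):
        raise ValueError(f"Unknown request kind: {req.kind!r}")
    points = 1 if req.analyst_hours < 4 else 0
    if req.kind == "prep":
        points += 1 if req.supports_grant else 0
        points -= 1 if req.doable_in_i2b2 else 0
    else:
        points += 2 if req.funded else 0
        points += 1 if req.has_i2b2_query else 0
    if req.deadline_modifier is not None:
        points += req.deadline_modifier
    else:
        # +1 for every full 4 weeks in the queue (no hard deadline only).
        points += req.weeks_waiting // 4
    return points


def work_order(requests):
    """Sort requests from highest to lowest total points (stable for ties)."""
    return sorted(requests, key=score, reverse=True)
```

Under these assumptions, a funded data request of under 4 hours with a usable i2b2 query scores 1 + 2 + 1 = 4, while an unfunded request with no deadline gains one point for each full 4 weeks it waits.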

After you submit, we will notify you via email once an analyst begins work on your request, but please feel free to check in on the status of your request at any time.

We reserve the right to decline a request for data if the deadline is deemed unreasonable in the context of the scope of work being requested. If this is the case, we will alert the requester as soon as possible so alternative arrangements can be made.

Will my research datasets contain any "restricted" patient records?

Epic has a feature called "Break the Glass" that warns users that they are about to access a patient's record that has been marked as restricted. Records can be marked restricted for a variety of reasons, including pending litigation, "VIP" or celebrity status, or simply because the patient has requested it. The fact that a patient's record is "behind glass" does not mean you are not allowed to view the record, but it does mean you must document a valid reason for doing so. Break the Glass also does not invalidate the UNC Health Care Notice of Privacy Practices (acknowledged by all UNCHCS patients), which states that PHI may be disclosed for IRB-approved research without specific consent. In other words, Break the Glass patients should not be automatically excluded from research. For this reason, patients whose records are restricted under Break the Glass in Epic may appear in research datasets provided from the CDW-H.

If you perform chart reviews after receiving a CDW-H dataset and see a Break the Glass warning, you'll need to take a few additional steps to access the record. You will be asked to authenticate with your Epic username and password, and provide a reason why you are viewing the patient's record. In such cases, choose "Record Review" from the list of options, and document in the "Further explanation" field that the access is for IRB-approved research, documenting the specific study.

What is the cost associated with CDW-H requests?

After an initial consultation, the CDW-H analysts can provide time estimates for the amount of work requested. If the requester changes the scope of work after receiving an initial estimate, the assigned analyst will provide a revised estimate to cover the new scope of work.

CDW-H rates are $125/hour. If your project is unfunded, we offer some no-cost and low-cost data options. Please request a consult to discuss options.

We cannot release data for funded requests until we have received a Chartfield string (or alternative billing method) from the study team.

Can students and residents access CDW-H data for research?

We are happy to accept research requests from students and residents so long as they are able to provide funding.

How do I access my data?

Access to your data set is primarily via the TraCS Secure Network Drive. Your data and all derivative data sets MUST reside on this secure server for the duration of your study (in agreement with the language of our Data Access Agreement). Depending on your needs, we can also provision datasets within the UNC Secure Research Workspace.

What are the requirements and policies for sharing clinical data outside of UNC?

To learn more about sharing clinical data outside of UNC, please review our Clinical Data Sharing FAQ.

What is the process for linking clinical data and claims data?

To learn more about the data linkage process and policies, please review our Data Linkage page.

Can I use the CDW-H to identify healthy cohorts for recruitment?

In general, the CDW-H is best suited to finding cohorts with relatively narrow inclusion/exclusion criteria. This is especially true for recruitment: the more targeted your search, the more likely it is that the patients you retrieve will actually qualify for your study. Extremely broad criteria, especially those targeting "healthy" patients (e.g., "Patients >= age 18 who do not have a depression diagnosis"), could result in tens to hundreds of thousands of qualifying patients. Additionally, it is quite difficult to define "healthy" status using the EHR. For these reasons, we generally do not approve requests to identify healthy cohorts for recruitment using the CDW-H, and recommend alternative recruitment methods for this purpose.

How do I explain to a subject how his/her name was obtained?

Please see the guidance document posted here: Answers to Patient FAQs_CDW.docx

Where can I find the required recruitment templates for use with CDW-identified patients?

In order to recruit patients identified via the CDW-H, study teams must use the templates that have been approved by the CDW-H, UNC Research, and UNC Health. The templates are largely customizable, but do have some non-negotiable language. Where included, the UNC Health logo may be retained with the approved language. Find the approved recruitment templates here (step #2).

How do I indicate Recruitment via the CDW-H in my IRB application?

Before a patient list will be released to you, you must have appropriately indicated use of the CDW-H for recruitment in your IRB application. Learn what needs to be modified here (step #3).

Will incarcerated patients show up in my dataset?

Research involving incarcerated persons requires special regulatory review to ensure these individuals are appropriately protected. Permission to analyze data on prisoners or recruit incarcerated patients is granted by the IRB, not by TraCS. This extends to research involving existing data, such as data in the CDW-H. Policy varies depending on the intended use of your dataset:

  • Recruitment – Any individual who has a known current prisoner/incarcerated status is excluded from recruitment datasets, except in circumstances when approval to conduct research with prisoners has been specifically granted.
  • Secondary data analysis – Datasets may include prisoners provided their inclusion is incidental and the study is not actively targeting prisoners. However, indication of the patient's incarcerated status must not be included in the dataset. Incarcerated status may be revealed through frequently requested data elements, including patient address and payor. If a researcher requests address or granular payor information, any individual who has a known current prisoner/incarcerated status will be excluded from the dataset.
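
The two rules above can be read as a small decision function. The following Python sketch is a hypothetical illustration of the policy as worded here, not the logic analysts actually run; the function and parameter names are invented, and it covers only incidental inclusion, not studies that specifically target prisoners.

```python
def include_patient(
    purpose,
    is_incarcerated,
    prisoner_research_approved=False,
    requests_address_or_payor=False,
):
    """Return True if a patient may appear in the dataset.

    purpose: "recruitment" or "secondary_analysis".
    is_incarcerated: known current prisoner/incarcerated status.
    """
    if purpose not in ("recruitment", "secondary_analysis"):
        raise ValueError(f"Unknown purpose: {purpose!r}")
    if not is_incarcerated:
        return True
    if purpose == "recruitment":
        # Excluded unless research with prisoners was specifically approved.
        return prisoner_research_approved
    # Secondary analysis: incidental inclusion is allowed unless the
    # requested fields (address, granular payor) could reveal the status.
    return not requests_address_or_payor
```

For instance, `include_patient("secondary_analysis", True, requests_address_or_payor=True)` returns `False`, because address or granular payor data could reveal incarcerated status.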

If a study is specifically targeting prisoners (e.g., the research question focuses on incarceration, or the inclusion criteria include incarcerated status), approval to conduct research with prisoners must be specifically granted by the IRB. The use of prisoner data may also require certification by OHRP before data can be accessed. This process will be facilitated by the UNC OHRE, and the IRB approval letter will state whether this requirement still needs to be completed or has already been met.

Out of ~1 million active UNCHCS patients, about 8,000 are treated while on prisoner/incarcerated status. "Prisoner status" will be determined at the time data are provisioned to the researcher. If a person becomes incarcerated after a dataset in which they appear is provisioned, the dataset will remain unchanged. In cases where parts of a large dataset are pulled over a long period of time (e.g., weeks to months), "prisoner status" will be determined as of the date the patient cohort is initially defined (usually the first step in the data process).

Can CDW analysts assist with QI projects?

The Carolina Data Warehouse for Health (CDW-H) is used for both UNCHCS operations and research. The TraCS analysts are funded through the NIH Clinical and Translational Science Award (CTSA), and their activities are focused on research. UNCHCS and some departments have multiple quality improvement programs as well as dedicated analysts. For QI projects, we recommend starting with your division or departmental quality improvement coordinator. Research on methods to improve care quality may qualify as appropriate for TraCS, and we can discuss those projects on a case-by-case basis.
