Clinical Data Sharing Frequently Asked Questions

Can I transfer patient information to another institution?
My dataset doesn't include names; does that mean it is deidentified and can be shared?
My dataset doesn't include any of the 18 HIPAA identifiers; can it be shared?
How much data can I transfer?
What if the data I want to share contains patient names?
What if the data I want to share contains patient addresses?
What if the patient gives written permission to share their personal identifiers including name?
Can I use a "dummy" identifier in place of name, address, social security number, or medical record number?
What is a hashed identifier?
Can a hashed identifier be hacked?
Are there other approaches to conduct multi-site research without sharing patient data?
How much will these methods cost?
Do I need to know how to write a data use agreement?
Who signs the data use agreement?
Who do I contact if I need to execute a data use agreement?

Can I transfer patient information to another institution?

Yes, under certain circumstances. Counts of patients (e.g., "UNC cared for 102,084 people with type 2 diabetes in 2018") have no regulatory restrictions other than a general practice of not transmitting very low counts (fewer than 10 patients in a given count "cell") outside the institution. If you are sharing any "patient-level" data (i.e., non-aggregated data, such as all HbA1c results for a cohort of patients), regardless of whether or not it is identified, you will need appropriate approvals and documentation. Typically, this means having IRB approval for the project and an executed data use agreement (DUA). The Office of University Counsel has produced guidance on DUAs for investigators.

These tasks are manageable and supported by offices at UNC (e.g., IRB, TraCS, OIC), but it is important to make sure that patient data is transferred only when approved and when the use is appropriate to address research questions and/or improve care quality. If you are ever unsure whether an IRB approval or DUA is needed for a Carolina Data Warehouse request, contact the TraCS analyst/project management team before transferring any information, so as to avoid the potentially dire consequences of a data breach. For any other data sharing activities for which you need guidance, contact the IRB or the Office of Industry Contracting.
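
The low-count practice described above can be illustrated with a short sketch. It assumes counts arrive as a simple mapping from cell label to count; the function name, the masking label, and the example cells are hypothetical and are not part of any UNC tool.

```python
from typing import Dict, Union

# Hypothetical display label for a masked cell.
SUPPRESSED = "<10"


def suppress_small_counts(
    counts: Dict[str, int], threshold: int = 10
) -> Dict[str, Union[int, str]]:
    """Replace any count below `threshold` with a masking label."""
    return {
        cell: (n if n >= threshold else SUPPRESSED)
        for cell, n in counts.items()
    }


if __name__ == "__main__":
    # Illustrative cells only; the second count is invented for the example.
    example = {"type 2 diabetes": 102084, "rare condition": 4}
    print(suppress_small_counts(example))
```

A real report would also need to confirm that a masked cell cannot be recovered by subtracting the visible cells from a row or column total.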

My dataset doesn't include names; does that mean it is deidentified and can be shared?

Even if a dataset does not include names, it may still be considered an identified dataset. If the data contain any of the 18 HIPAA identifiers (hhs.gov/hipaa), including dates of service, date of birth, or zip code, they are considered protected health information.

A dataset that contains dates of service, date of birth, or zip code, but no other identifiers, is called a "HIPAA limited data set." Even though such information does not contain patient names, those variables can be relatively easily used to identify individuals unless managed under the regulatory framework of an IRB and a data use agreement. A HIPAA limited data set is not considered deidentified.
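
As a rough illustration of the distinction above, the sketch below screens a list of column names before a data request is drafted. The column names are hypothetical, the two sets cover only the identifiers named in this FAQ rather than all 18, and the result is a first-pass flag only; the actual determination rests with the IRB and the data use agreement process.

```python
from typing import Iterable

# Hypothetical column names; a real dataset needs its own mapping.
DIRECT_IDENTIFIER_COLUMNS = {"name", "address", "ssn", "mrn"}
LIMITED_DATA_SET_COLUMNS = {"date_of_service", "date_of_birth", "zip_code"}


def screen_columns(columns: Iterable[str]) -> str:
    """Return a rough first-pass label for a list of column names."""
    cols = {c.lower() for c in columns}
    if cols & DIRECT_IDENTIFIER_COLUMNS:
        return "contains direct identifiers"
    if cols & LIMITED_DATA_SET_COLUMNS:
        return "possible HIPAA limited data set"
    # Absence of flagged names does not prove the data are deidentified.
    return "no flagged columns (still patient-level data)"
```

For example, `screen_columns(["date_of_birth", "hba1c"])` returns the limited data set label, which signals that the dataset is not deidentified even though it contains no names.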

My dataset doesn't include any of the 18 HIPAA identifiers; can it be shared?

Use of any patient-level data for research purposes requires IRB approval. In many cases, UNC requires an agreement to cover data sharing even when the data are deidentified.

How much data can I transfer?

Technically, transfer of substantial amounts of data is not a problem. However, HIPAA requires that only the minimum necessary data be shared. Therefore, the more data one transfers, the greater the justification one will need to make for the transfer. Can the analyses be performed locally? What is the minimum amount of data needed to answer the research question? Consider both the scope of your patient population and the breadth of the data you are requesting and/or sharing. Anticipate those questions as you prepare your data request and data use agreement.

What if the data I want to share contains patient names?

Personal identifiers such as name, address, and medical record number are often called "personally identifiable information" (PII) or "direct identifiers." Because of the nature of these data and the risk to the patient if they are disclosed in a breach, transferring them requires a greater level of justification. There may be justifications for transmission of fully identified data, but as with all patient-level data sharing, a data use agreement (and all other appropriate approvals) is required.

What if the data I want to share contains patient addresses?

Addresses or latitude/longitude "geocodes" will be treated the same as patient names (i.e., as "direct identifiers"). However, geographic units of analysis such as zip code can be very helpful in assessing social determinants of care, the effect of distance from the clinic or hospital on access to care, and similar questions. Whenever possible, we prefer to transform geocodes (for example) to distance from the hospital, or to pre-link the desired socio-economic status variables to the dataset and then drop the location variable after linkage.
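
The geocode-to-distance transformation can be sketched as follows. This is a minimal example that assumes each record is a dictionary with hypothetical "lat" and "lon" fields and uses the haversine great-circle formula; the reference coordinates are placeholders, not an official hospital location.

```python
import math
from typing import Dict, List

# Hypothetical reference point; substitute the real clinic or hospital location.
HOSPITAL_LAT, HOSPITAL_LON = 35.90, -79.05
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def replace_geocode_with_distance(rows: List[Dict]) -> List[Dict]:
    """Add a rounded distance column and drop the lat/lon fields."""
    out = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k not in ("lat", "lon")}
        new_row["distance_km"] = round(
            haversine_km(row["lat"], row["lon"], HOSPITAL_LAT, HOSPITAL_LON), 1
        )
        out.append(new_row)
    return out
```

Rounding the distance is a design choice in this sketch: it keeps the variable useful for access-to-care analyses while discarding precision that is not needed.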

What if the patient gives written permission to share their personal identifiers including name?

This commonly is the case for consented patient registries or for consented clinical trials. Data is transferred to the data coordinating center, which may be at another institution. One still needs to obtain IRB approval, and ensure appropriate agreements are in place (such as a data use agreement) to make sure that the patient information will be stored in a secure manner, not transferred inappropriately, etc.


Privacy-Preserving Methods

Can I use a "dummy" identifier in place of name, address, social security number, or medical record number?

Yes, a study ID or "hashed identifier" that the recipient cannot link back to the patient is very useful. Because the crosswalk back to the patient's identity is kept at the home institution, the patient's privacy is better protected.
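
A study ID with a locally held crosswalk might look like the sketch below. The field name "mrn", the "S" prefix, and both function names are hypothetical; the only point illustrated is that the shared rows carry the study ID while the crosswalk never leaves the home institution.

```python
import secrets
from typing import Dict, List


def assign_study_ids(mrns: List[str]) -> Dict[str, str]:
    """Map each distinct MRN to a random study ID (this mapping is the crosswalk)."""
    crosswalk: Dict[str, str] = {}
    used = set()
    for mrn in mrns:
        if mrn in crosswalk:
            continue
        study_id = "S" + secrets.token_hex(8)
        while study_id in used:  # collisions are very unlikely; re-draw if one occurs
            study_id = "S" + secrets.token_hex(8)
        used.add(study_id)
        crosswalk[mrn] = study_id
    return crosswalk


def to_shareable(rows: List[Dict], crosswalk: Dict[str, str]) -> List[Dict]:
    """Swap the MRN for the study ID; the crosswalk itself is not shared."""
    return [
        {"study_id": crosswalk[r["mrn"]],
         **{k: v for k, v in r.items() if k != "mrn"}}
        for r in rows
    ]
```

Because the study IDs are random, they reveal nothing about the MRN, but they also cannot be regenerated later without the stored crosswalk.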

What is a hashed identifier?

A "hash" is like a fingerprint for data. In practice, a hash uniquely identifies a string of letters and numbers while obscuring the actual letters and numbers. A "hash function" is a formula used to create a hash. It defines the logic used to transform a string into a hash. Hashing is much more reproducible than using an arbitrary study ID, because the same input always produces the same hash, and it is more secure than transferring patient direct identifiers such as names. We have gained experience in these methods, including use of linkage algorithms that facilitate linkage or de-duplication of patient lists, and recommend their consideration. Consultation with the TraCS IDSci team or with Biostatistics is recommended to make sure that this technique will suit your research. Visit tracs.unc.edu/consultation to request a consult with an analyst.
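
To make the idea concrete, here is a minimal sketch using SHA-256 from the Python standard library. The choice of SHA-256, the normalization rules, and the `project_salt` parameter are assumptions made for illustration; this FAQ does not state which hash function or linkage algorithm is used in practice.

```python
import hashlib


def normalize(value: str) -> str:
    """Trim, lowercase, and collapse whitespace so equal inputs hash equally."""
    return " ".join(value.strip().lower().split())


def hash_identifier(value: str, project_salt: str) -> str:
    """Return a hex digest for a normalized identifier plus a project salt."""
    payload = (project_salt + "|" + normalize(value)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Two sites that agree on the same normalization and salt will produce the same hash for the same patient string, which is what allows linkage or de-duplication without exchanging the string itself.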

Can a hashed identifier be hacked?

This method is currently considered "state of the art" for preserving patient privacy in data linkage projects in which data needs to be merged. No data security method can be guaranteed to be 100% secure, but these methods are quite good because the "key" to the patient identifier is held by the honest broker who created the hash.
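
One common way to tie a hash to a secret "key" is a keyed hash (HMAC). The sketch below is an assumption about how such a key could work, not a description of the honest broker's actual procedure; `keyed_hash` and the example keys are hypothetical.

```python
import hashlib
import hmac


def keyed_hash(identifier: str, secret_key: bytes) -> str:
    """HMAC-SHA256 of an identifier; reproducible only by holders of the key."""
    return hmac.new(
        secret_key, identifier.encode("utf-8"), hashlib.sha256
    ).hexdigest()


if __name__ == "__main__":
    broker_key = b"held-only-by-the-honest-broker"  # placeholder value
    print(keyed_hash("example-identifier", broker_key))
```

Without the key, someone who guesses a patient identifier cannot compute the matching hash to confirm the guess, which is the weakness of an unkeyed hash over a small set of possible inputs.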

Are there other approaches to conduct multi-site research without sharing patient data?

Another approach is to use fully distributed data analytics, in which all patient-level data remains at the host institution. If the data elements are harmonized in a common data model, analytic code can be distributed to multiple sites and executed separately at each site, sending only results (i.e., counts, summary statistics) to the coordinating center. UNC has participated in multiple such projects, but we strongly recommend that you obtain consultation to make sure such an analysis is appropriate for your proposed project before committing. Visit tracs.unc.edu/consultation to request a consult with an analyst.
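
The distributed pattern can be sketched with a pooled mean and standard deviation. In this hypothetical example each site runs `summarize_locally` on its own patient-level values and sends back only three numbers; the coordinating center runs `pool`. The names and the choice of statistic are illustrative and do not describe a specific network's code.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class SiteSummary:
    n: int            # number of observations at the site
    total: float      # sum of the values
    total_sq: float   # sum of the squared values


def summarize_locally(values: Sequence[float]) -> SiteSummary:
    """Run at each site; patient-level values never leave this function."""
    return SiteSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def pool(summaries: List[SiteSummary]) -> Tuple[int, float, float]:
    """Run at the coordinating center; returns (n, mean, sample SD)."""
    n = sum(s.n for s in summaries)
    if n < 2:
        raise ValueError("need at least two observations to pool")
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean ** 2) / (n - 1)
    return n, mean, math.sqrt(max(variance, 0.0))
```

The pooled result matches what a single analysis of the combined data would give, yet each site transmits only a count and two sums.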

How much will these methods cost?

Some of these methods are relatively inexpensive, but please obtain a consultation early when considering data transfer, with or without privacy preserving methods, so that appropriate budgeting can be incorporated into your grant or contract.


Data Use Agreements

Do I need to know how to write a data use agreement?

No. NC TraCS has templates for data use agreements for CDWH data requests. If you are doing a project that requires data sharing, we will send you our template for you to share with the collaborating institutions.

If you receive a DUA from your collaborator, please share that with the CDW analyst working on your request.

Who signs the data use agreement?

Data use agreements are not signed by the investigators, but by a UNC official with appropriate delegated signature authority. For Carolina Data Warehouse for Health requests, a member of the CDW team will obtain a signature from a UNC Health Care representative. For most other agreements, submit a request through ALICE (apps.research.unc.edu/alice) to obtain a signature.

Who do I contact if I need to execute a data use agreement?

  • When requesting to share data obtained through the CDWH, please work with the TraCS analyst/project management team to prepare your DUA. The TraCS team can provide DUA templates and send existing DUAs for review.
  • When planning to share data not obtained through the CDWH, contact the Industry Contracting team or submit your DUA through ALICE.
  • If you need a DUA to cover receipt of data from another site, contact the Industry Contracting team or submit your DUA through ALICE.

When in doubt, please ask a member of the TraCS analyst/project management team.

Contact Us


Brinkhous-Bullitt, 2nd floor
160 N. Medical Drive
Chapel Hill, NC 27599

919.966.6022

Cite Us


CitE and SUBMit CTSA Grant number - UM1TR004406

© 2008-2024 The North Carolina Translational and Clinical Sciences (NC TraCS) Institute at The University of North Carolina at Chapel Hill
The content of this website is solely the responsibility of the University of North Carolina at Chapel Hill and does not necessarily represent the official views of the NIH