Genome Wide Association Studies (GWAS)


The NIH GWAS policy facilitates the sharing of large datasets containing coded, de-identified genotypic and phenotypic data obtained from NIH-supported or NIH-conducted research. The policy applies to data obtained prospectively as well as retrospectively from existing specimens. A key element of the policy is the expectation that data from NIH-supported GWAS will be deposited into the NIH GWAS data repository, currently designated as the database of Genotypes and Phenotypes (dbGaP), at the National Center for Biotechnology Information (NCBI), a component of the National Library of Medicine (NLM). The data submitted for inclusion in the NIH GWAS data repository will be coded and de-identified by the submitting investigator, but the investigator may retain the key that links the code to specific individuals.

Key Terms

Coded: Coded means that any identifying information (such as name or social security number) that would enable the investigator to readily ascertain the identity of the individual to whom the private information or specimens pertain has been replaced with a number, letter, symbol, or combination thereof (i.e., the code); and a key to decipher the code exists, enabling linkage of the identifying information to the private information or specimens.

De-identified: De-identified means that the identities of data subjects cannot be readily ascertained or otherwise associated with the data by the repository staff or secondary data users (45 CFR 46.102(f)); the 18 identifiers enumerated at section 164.514(b)(2) of the HIPAA Privacy Rule are removed; and the submitting institution has no actual knowledge that the remaining information could be used alone or in combination with other information to identify the subject of the data.


Submission – In addition to the requirements outlined in the policy for Initial Review, investigators should include a data sharing plan that describes how the NIH GWAS policy will be met. In the data sharing plan, investigators should outline plans to de-identify the data according to the following criteria:

  1. a random, unique code should be assigned to the data;
  2. the identifiers outlined in the HIPAA Privacy Rule should be removed prior to the submission; and
  3. the identities of subjects should not be readily ascertained or otherwise associated with the data by the repository staff or secondary data users.
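
The first two criteria above can be illustrated with a minimal sketch. It assumes records are held as simple dictionaries; the field names, the `IDENTIFIER_FIELDS` set, and the `code_and_deidentify` helper are hypothetical and are not part of the NIH GWAS policy. The field list shown is illustrative only and does not cover all 18 HIPAA identifiers.

```python
import secrets

# Hypothetical identifier columns for illustration only. A real dataset must be
# checked against the full list of 18 identifiers in the HIPAA Privacy Rule.
IDENTIFIER_FIELDS = {"name", "ssn", "email", "phone"}


def code_and_deidentify(records):
    """Return (coded_records, key) for a list of dict records.

    coded_records: copies of the records with identifier fields removed and a
        random, unique code added under "subject_code".
    key: mapping from code to the removed identifiers; the investigator would
        retain this and not send it to the repository.
    """
    key = {}
    coded_records = []
    for record in records:
        code = secrets.token_hex(8)
        while code in key:  # regenerate in the unlikely event of a collision
            code = secrets.token_hex(8)
        # Store the identifying fields under the code, separately from the data.
        key[code] = {f: record[f] for f in IDENTIFIER_FIELDS if f in record}
        # Keep only the non-identifying fields in the coded record.
        coded = {f: v for f, v in record.items() if f not in IDENTIFIER_FIELDS}
        coded["subject_code"] = code
        coded_records.append(coded)
    return coded_records, key
```

Removing listed fields does not by itself satisfy the third criterion. Whether the remaining genotype and phenotype data could still be associated with an individual is a judgment the investigator and CPHS must make for each study.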

Institutional Certification – The Institutional official is responsible for certifying that data submission plans meet the following expectations defined in the GWAS policy:

  1. The data submission is consistent with all applicable laws and regulations as well as institutional policies;
  2. The appropriate research uses of the data and the uses that are specifically excluded by the informed consent documents are delineated; and
  3. The identities of research participants will not be disclosed to the NIH GWAS data repository.

CPHS Review: When NIH-funded research involves GWAS, CPHS is responsible for reviewing and verifying that:

  1. The submission of data to the NIH GWAS data repository and subsequent sharing for research purposes are consistent with the informed consent of study participants from whom the data were obtained;
  2. The investigator’s plan for de-identifying datasets is consistent with the standards outlined in the policy;
  3. It has considered the risks to individuals, their families, and groups or populations associated with data submitted to the NIH GWAS data repository; and
  4. The genotype and phenotype data to be submitted were collected in a manner consistent with 45 C.F.R. Part 46.

Certificates of Confidentiality. Prior to submitting GWAS data to the NIH GWAS data repository, investigators should determine whether a Certificate of Confidentiality has been obtained for their research or, if one has not been obtained, consider whether it would be appropriate to obtain one. Upon review, CPHS may determine that a Certificate of Confidentiality is required for a particular study.

Informed Consent – In addition to the elements of disclosure for genetic studies, CPHS will ensure that the consent document addresses issues specific to GWAS including purpose, risks, benefits, withdrawal and return of results, as outlined in the GWAS consent suggested language.

Retrospective Studies. For retrospective studies performed using existing genetic materials and previously collected data, CPHS shall review the consent document under which the existing genetic materials and data were obtained to determine whether it addresses the risks and the sharing of genotypic and phenotypic data.

For studies that propose to use pre-existing data or samples, CPHS may conclude in some cases that the original consent is not adequate for submission to the GWAS data repository and subsequent sharing for research. In these cases, CPHS may decide that it is appropriate and necessary for the investigator to seek explicit consent of the research participants for submission to the NIH GWAS repository and subsequent sharing.

CPHS may determine that re-consent is not feasible or appropriate for a given study, or that it cannot verify that the other criteria described above have been met for submission to the NIH GWAS repository. In these cases, CPHS and the Institution may disapprove the request to share data with the NIH GWAS data repository.

In all cases, CPHS staff will send written communication of the outcome to the Principal Investigator.



