National Cancer Institute National Human Genome Research Institute

The Responsible Use and Publication of Data Generated by the TCGA Pilot Project

The primary purpose of the TCGA, as described on the program’s Web site (cancergenome.nih.gov) and in recent papers on the program (Collins and Barker, 2007), is to develop and publish a comprehensive catalog of the genomic changes found in individual cancer types. As is now standard practice in large-scale genomic research projects, the TCGA pilot project will adopt and follow a policy of releasing data as quickly as possible prior to publication, anticipating that they will be useful to many investigators. The TCGA pilot project anticipates that its data will be of high value in a number of research areas and will be used in many ways. Those include but are not limited to development of new analytical methods, identification of the genomic etiology of individual tumor types and subtypes, and development of new experimental diagnostic, therapeutic and preventive approaches and strategies for cancer. Thus, the TCGA pilot project recognizes that the data should be available to all users for any purpose, limited only by the need to avoid identifiability of the research participants (Lowrance and Collins, Science, August 3, 2007).

The NCI and NHGRI have identified the TCGA pilot project as a “community resource project... a research project specifically devised and implemented to create a set of data, reagents or other material whose primary utility will be as a resource for the broad scientific community.” This concept was developed at a meeting that was held to discuss the release of pre-publication data from large resource-generating scientific projects. That meeting, the “Fort Lauderdale meeting,” was held in January 2003 and was sponsored by the Wellcome Trust and the NHGRI, one of the TCGA’s funders. The report from that meeting is at http://www.genome.gov/Pages/Research/WellcomeReport0303.pdf.

The recommendations from the Fort Lauderdale meeting address the roles and responsibilities of data producers, data users and funders of community resource projects, with the aim of establishing and maintaining an appropriate balance between the interests that data users have in rapid access to data and the needs that data producers have to publish and receive recognition for their work. The conclusion of the attendees at the Fort Lauderdale meeting was that a “responsible use” approach would be the best way to ensure that first-rate data producers will continue to participate in such projects and produce and quickly release large-scale data sets of broad use to a wide range of investigators. “Responsible use” was defined as allowing the data producers to have the opportunity to publish the initial global analyses of the data, as specifically articulated at the outset of the project, within a reasonable period of time.

The TCGA pilot project currently plans to prepare several manuscripts based on TCGA data:

  • Commentary detailing the scientific aims and organization of TCGA
  • Interim integrated microarray data analysis of the partial glioblastoma multiforme (GBM) set
  • Analysis of DNA sequencing data for the GBM sample set
  • Final integrated microarray data analysis and sequence data analysis of completed GBM set
  • Ovary and lung reports paralleling the second through fourth items above, but about one year behind that timeline

To act in accord with the Fort Lauderdale principles and support the continued prompt public release of large-scale genomic data prior to publication, researchers who plan to prepare manuscripts that would be comparable to the analyses described above, and journal editors who receive such manuscripts, are encouraged to coordinate their independent reports with the TCGA pilot project’s publication schedule described above. This may be done by contacting the TCGA Project Team at tcga@mail.nih.gov.

Beyond the topics described above, researchers are free, and indeed encouraged, to publish results based on integrating TCGA data with data from other sources, particularly in efforts to study the role of specific genes and genomic changes in the biology of cancer. Researchers also are encouraged to use TCGA data to publish on the development of novel methods to analyze genomic data related to cancer and genotype-phenotype relationships in cancer. This may include the application of these methods to portions of the data, for example specific cancer subtypes or particular aspects of tumor biology.

The NCI and NHGRI do not consider deposition of data from the TCGA pilot project, like that of data from other large-scale genomic projects, into the project's own database (http://cancergenome.nih.gov/dataportal/data/about/) or into public databases to be the equivalent of publication in a peer-reviewed journal. Therefore, although the data are available to others, the producers still consider them to be formally unpublished and expect that the data will be used in accord with standard scientific etiquette and practices concerning unpublished data.

Prior to the publication of the initial paper, the TCGA pilot project requests that authors who use data from the TCGA pilot project acknowledge the TCGA pilot project.

Authors are also encouraged to acknowledge the appropriate sample donors and research groups. Similarly, the TCGA pilot project requests that journal editors and reviewers attempt to ensure that the TCGA pilot project is cited and that appropriate acknowledgements are made.

To ensure protection of genetic privacy for sample donors, data users will have to agree to certain conditions, described in the TCGA Patient Protection Policy and Controlled Access Policy, as to how the data will be used. For example, users will have to agree that they will share these data only with others who have also completed a data access agreement and that they will not patent discoveries in a way that prevents others from using the data (refer to the IP policy). This means that reviewers of a manuscript who need to see any controlled-access TCGA data underlying a result must also agree to these user access conditions before they can see these data.

Meeting presentations of TCGA data and analyses are permitted and encouraged. However, to keep track of meeting presentations and to avoid similar or identical presentations of the same data at a single meeting, we request that each presenter submit the abstract to the TCGA Project Team two weeks before the abstract is due. If duplicate presentations are identified, the Project Team will contact the presenters and suggest how to divide the presentations to minimize overlap. Oral presentations of the data at public meetings are likewise allowed and encouraged, but each investigator is asked to keep track of when and where these presentations occur. The TCGA Project Team will provide each investigator with a series of two to three slides that must be displayed on all posters or shown as part of an oral presentation, so that the TCGA pilot project and its many contributors are properly cited. It is also critical that TCGA be properly cited and identified in meeting abstracts; language will be provided for this purpose.
