The NIH Roadmap Epigenomics Program data resource

Abstract. The NIH Roadmap Reference Epigenome Mapping Consortium is developing a community resource of genome-wide epigenetic maps in a broad range of human primary cells and tissues. There are large amounts of data already available, and a number of different options for viewing and analyzing the data. This report will describe key features of the websites where users will find data, protocols and analysis tools developed by the consortium, and provide a perspective on how this unique resource will facilitate and inform human disease research, both immediately and in the future.

Keywords: chromatin, data resource, data visualization, DNA methylation, epigenetic mapping, epigenetics, epigenomics, histone modification, human disease.

The completion of the Human Genome Project marked a significant milestone, one which paved the way for annotation of the full catalog of human genes. This was undeniably a huge step forward for human disease research.
The sequence of a gene, however, provides only some insight into its function. Given that each of our cells possesses an identical complement of genes, what differentiates a skin cell from a heart muscle cell or a neuron? Genes must be turned on or off, or expressed at different levels, to effect the changes leading to the functional differences between cell types. Therefore, it is equally important to understand how these genes are regulated – when, where and how is a given gene expressed? Epigenetic mechanisms, such as DNA methylation and a variety of post-translational histone modifications, play an important role in establishing gene-expression programs, as well as in maintaining them as cells divide.

The NIH Roadmap Epigenomics Program is a multicomponent program that funds research in several relevant areas, including technology development in epigenetics and epigenetic imaging, discovery and characterization of novel epigenetic marks, and investigation of how epigenetic signatures are disrupted in human disease. One key goal of this program is to gain a better understanding of the normal pattern of epigenetic modification, which will allow for comparisons between different tissues and cell types, and will serve as a reference for comparison with diseased samples.
Recent advances in sequencing technology have made it possible to move beyond gene-by-gene analyses, allowing for truly unbiased, genome-wide mapping of epigenetic modifications. The NIH Roadmap Reference Epigenome Mapping Consortium, a group composed of four Reference Epigenome Mapping Centers and an Epigenomics Data Analysis and Coordination Center, has been charged with generating these genome-wide epigenomic maps and assembling them into a publicly available data resource (Table 1). As is true of any epigenetic study, a number of considerations are involved when selecting samples for mapping. Each of the specific cell types that make up a tissue probably has a different epigenomic profile. However, it is often nearly impossible to isolate enough material of a particular purified cell type for analysis. The consortium has made an effort to achieve balance by covering a wide range of disease-relevant tissues, while including more highly purified cell types when possible.
Currently, a wide range of adult and fetal cells and tissues is represented, including cells from a number of distinct brain regions and a variety of purified blood cell types. In addition, several pluripotent cell lines are included, such as induced pluripotent stem cells and human embryonic stem cells, as well as some differentiated forms of these cells. Currently, over 1. Table 1).

Specific epigenetic modifications can often be associated with a particular function; for example, H3K9me3 is generally found in repressed regions of the genome, while H3K9ac is generally correlated with gene activation. However, simply determining the distribution of one mark is not sufficient, as the function of a given mark may vary depending upon the broader chromatin context in which it resides. Furthermore, these marks must be correlated with a functional outcome, such as altered gene expression. A key strength of this data resource is that, for each cell and tissue type represented, multiple features will ultimately be mapped, including DNA methylation, post-translational histone modifications, chromatin accessibility and RNA expression. DNA methylation data will be made available for all cell and tissue types represented. Detailed protocols and standards for each of these analyses have been made available to the community online.
These include reduced representation bisulfite sequencing. More recently, the consortium has moved towards using MethylC-seq. As methods for high-throughput analysis of 5-hydroxymethylcytosine are being developed, this feature may also be added to a subset of cell types in the future.

Two approaches are used for the analysis of histone modifications by chromatin immunoprecipitation with sequencing. A small number of high-value samples, including several embryonic stem cell lines and their differentiated forms, will be analyzed to significant depth, with a large panel of histone modifications. Currently, there are approximately 3.
The data gained from these more comprehensive analyses are used to inform the selection of a more limited panel of histone modifications, which is applied to the majority of samples being analyzed. Currently, this panel includes H3K27me3, H3K36me3, H3K4me1, H3K4me3, H3K9me3 and H3K27ac. These are the modifications found to be the most informative, namely the ones that are most difficult to predict based on other modifications. In addition to DNA methylation and histone modifications, most samples will undergo DNase I hypersensitivity mapping. Finally, each sample will be analyzed for RNA content. In many cases, this will be accomplished with expression arrays, but the consortium is moving towards using RNA-sequencing (RNA-seq). RNA-seq offers the most comprehensive view of RNA expression: it includes small RNAs, provides a measurement of alternative splicing events and enables allelic analyses to be carried out.
The most current standards in use for chromatin immunoprecipitation with sequencing, whole-genome bisulfite sequencing and RNA-seq can be found at the Consortium homepage, under 'Protocols and Data Standards'. These are summarized in Table 1 and described below.

Reference Epigenome Mapping Consortium homepage

The centerpiece of the program is the Reference Epigenome Mapping Consortium homepage. A list of all consortium publications can also be found here.
Tabs located at the top of the page facilitate navigation of the site. This site is continually evolving in an effort to maximize the user experience and facilitate use of the resource by the community. This site also features an easy-to-use interface to browse consortium data.

Figure: The NIH Roadmap Epigenomic Mapping Consortium homepage.
Clicking on the 'Data' tab will open a matrix-style data browser. Cell and tissue types are grouped into anatomic categories for easy navigation. Available data types are indicated as shaded squares. Clicking on any of these squares will open a track selection window on a consortium-hosted University of California, Santa Cruz (UCSC; CA, USA) Genome Browser Mirror site. Once tracks of interest are selected, the user must change the 'maximum display mode' to full, using the drop-down menu, and click 'submit'.
Users can also choose to view all available data for a particular epigenetic feature across all cell types by using the drop-down menu found at the top of the matrix browser ('select assay to view cell/tissue data'). This site also offers a unique visual data browser, which displays cell types with available data in an anatomical context.
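The matrix-style browser described above amounts to a lookup from sample and assay to data availability. The following is a minimal sketch of that idea, not the consortium's implementation: the sample names are placeholders rather than consortium identifiers, and the records and function names are hypothetical. It mirrors the two operations in the text, rendering shaded squares per sample and listing every sample that has data for one selected assay.

```python
# Hypothetical availability records: each (sample, assay) pair has data.
# Sample names are placeholders, not consortium identifiers.
RECORDS = [
    ("tissue_A", "DNA methylation"),
    ("tissue_A", "H3K27ac"),
    ("cell_line_B", "DNA methylation"),
    ("cell_line_B", "RNA-seq"),
    ("cell_type_C", "H3K27ac"),
]


def build_matrix(records):
    """Group records into a mapping of sample -> set of available assays."""
    matrix = {}
    for sample, assay in records:
        matrix.setdefault(sample, set()).add(assay)
    return matrix


def samples_with_assay(matrix, assay):
    """List all samples that have data for one assay, sorted by name."""
    return sorted(s for s, assays in matrix.items() if assay in assays)


def render(matrix, assays):
    """Draw one text row per sample: '#' is a shaded square, '.' is empty."""
    lines = []
    for sample in sorted(matrix):
        row = "".join("#" if a in matrix[sample] else "." for a in assays)
        lines.append(f"{sample:<12}{row}")
    return "\n".join(lines)


if __name__ == "__main__":
    matrix = build_matrix(RECORDS)
    print(render(matrix, ["DNA methylation", "H3K27ac", "RNA-seq"]))
    print(samples_with_assay(matrix, "H3K27ac"))
```

Storing a set of assays per sample keeps the per-square check cheap, while the per-assay view is a single pass over the samples.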
The consortium's goal is to provide a complete data set for each cell type analyzed. As described earlier, this data set, referred to as a 'complete epigenome', would contain DNA methylation data, RNA expression data, a panel of histone modifications and DNase I hypersensitivity profiles where possible. Definitions of the various classes of complete epigenomes used by the consortium can be found on this site by clicking the 'Complete Epigenomes' tab at the top of the page. This page will also be updated with a list of all cell/tissue types that have reached completed status.

National Center for Biotechnology Information
The National Center for Biotechnology Information (NCBI) serves as the long-term archive for data produced by the Reference Epigenome Mapping Consortium, as well as for the other epigenomics projects funded by the Roadmap Epigenomics Program. Rapid data release is an important goal of the consortium. While consortium data can be viewed on several websites, most are updated only after a data freeze, which occurs several times a year. By contrast, the two NCBI sites described below are updated continuously, providing a real-time picture of the data available. Users interested in simply downloading data files for analysis with their own tools should begin at the NIH Roadmap Epigenomics page of the Gene Expression Omnibus (GEO).
This site offers several options for navigating the data: listed by sample (a particular feature in a particular cell type), using a matrix-style browser, or by search terms; however, only the sample list is updated continuously. Clicking on any accession number will open the metadata associated with the sample, where the user will find specific information about the individual from whom the sample was derived, the experimental conditions, and data generation and processing. The data can be downloaded in several of the most commonly used file formats (e.g., Short Read Archive). NCBI has also developed an epigenomics portal.
Data can be browsed either by experiment (i. Here, data files can be downloaded or viewed on a genome browser.