

Integrative Level Annotations

The Registry of Candidate cis-Regulatory Elements

The core of the integrative level of the ENCODE Encyclopedia is the Registry of candidate cis-Regulatory Elements (cCREs), which integrates all high-quality DNase-seq and H3K4me3, H3K27ac, and CTCF ChIP-seq data produced by the ENCODE and Roadmap Epigenomics Consortia. The cCREs in the Registry are the subset of representative DNase hypersensitivity sites (rDHSs) that are supported by these two histone modifications (H3K4me3 and H3K27ac) and CTCF-binding data. Currently the Registry (version 2) comprises 926,535 human cCREs and 339,815 mouse cCREs.

Using H3K4me3, H3K27ac, and CTCF signals across a large number of cell types, we classified cCREs into promoter-like, enhancer-like, DNase-H3K4me3, and CTCF-only groups in a cell-type-agnostic manner. For each specific cell type, we also classified cCREs into these groups using DNase, H3K4me3, H3K27ac, and CTCF data specific to that cell type. Currently, 25 human and 15 mouse cell types have complete cell-type-specific cCRE classifications, and 839 human and 157 mouse cell types have partial cCRE classifications.

SCREEN is a web-based search and visualization engine specifically designed for the Registry of cCREs. SCREEN allows users to explore cCREs and investigate how they connect with other annotations in the Encyclopedia in a cell-type-specific manner, as well as the underlying raw ENCODE data whenever available. SCREEN also presents the results of using cCREs to interpret the variants uncovered by genome-wide association studies (GWAS).

Semi-automated genomic annotation methods such as ChromHMM and Segway take as input a panel of epigenomic data (including histone mark ChIP-seq and DNase-seq) in a particular cell type and use machine learning methods to simultaneously partition the genome into segments and assign chromatin states to these segments. The states are assigned such that two segments with the same state exhibit similar epigenomic patterns. The procedure is "semi-automated" because the states are then manually compared with known biological information in order to designate each state as enhancer-like, promoter-like, gene body, and so on. The chromatin states of 164 human cell types have been annotated using this strategy by integrating 1,615 genomics datasets (Libbrecht et al., 2019, Genome Biology). The chromatin states for mouse epigenomes of 12 tissue types at 8 different developmental timepoints, constituting 66 epigenomes of 8 histone marks each, have been annotated using a model integrating 1,056 genomic datasets and their respective controls.

Over the past decade, GWAS have provided insights into how genetic variations contribute to human diseases.
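To make the grouping of cCREs into promoter-like, enhancer-like, DNase-H3K4me3, and CTCF-only classes more concrete, the following is a minimal sketch of a rule-based classifier. It is not the Registry's actual pipeline. The assumptions are that each signal has been summarized as a Z-score, that a single cutoff (`HIGH_Z`) separates high from low signal, and that distance to the nearest transcription start site (`tss_distance_bp`) distinguishes promoter-like elements from DNase-H3K4me3 elements. None of these thresholds or field names are stated in this section; they are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass

# Hypothetical cutoff: a signal is treated as "high" when its Z-score exceeds this.
HIGH_Z = 1.64


@dataclass
class Signals:
    """Per-element summary signals (assumed to be Z-scores) plus TSS distance."""
    dnase: float
    h3k4me3: float
    h3k27ac: float
    ctcf: float
    tss_distance_bp: int  # assumed: distance to the nearest annotated TSS


def classify_ccre(s: Signals, high: float = HIGH_Z, promoter_window_bp: int = 200) -> str:
    """Assign one illustrative group label to an element."""
    # Elements without high DNase signal are left unclassified in this sketch.
    if s.dnase <= high:
        return "unclassified"
    # High H3K4me3 close to a TSS: promoter-like.
    if s.h3k4me3 > high and s.tss_distance_bp <= promoter_window_bp:
        return "promoter-like"
    # High H3K27ac otherwise: enhancer-like.
    if s.h3k27ac > high:
        return "enhancer-like"
    # High H3K4me3 away from a TSS, without high H3K27ac: DNase-H3K4me3.
    if s.h3k4me3 > high:
        return "DNase-H3K4me3"
    # Only CTCF is high: CTCF-only.
    if s.ctcf > high:
        return "CTCF-only"
    return "unclassified"


example = Signals(dnase=3.0, h3k4me3=2.5, h3k27ac=0.1, ctcf=0.0, tss_distance_bp=50)
print(classify_ccre(example))  # promoter-like
```

The rules are checked in a fixed order so that each element receives exactly one label. A cell-type-specific classification would apply the same function to signals measured in a single cell type rather than to signals aggregated across cell types.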

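The simultaneous segmentation and state assignment performed by semi-automated genomic annotation methods can be illustrated with a toy hidden Markov model. The sketch below decodes the most likely state path for a short run of genomic bins with the Viterbi algorithm and then merges adjacent bins that share a state into segments. It is written in the spirit of ChromHMM-style models over binarized marks, but it is not the implementation of ChromHMM or Segway. The two states, the two marks, the 200-bp bin size, and every probability are invented for illustration; real tools learn their parameters from data.

```python
import numpy as np


def viterbi(obs, start, trans, emit):
    """Most likely state path for binarized marks.

    obs:   (T, M) array of 0/1 calls, one row per genomic bin, one column per mark
    start: (K,) initial state probabilities
    trans: (K, K) transition probabilities, trans[i, j] = P(next = j | current = i)
    emit:  (K, M) probability that each mark is present in each state (strictly in (0, 1))
    """
    obs, start, trans, emit = (np.asarray(a, dtype=float) for a in (obs, start, trans, emit))
    # Log-likelihood of each bin under each state, treating marks as independent Bernoullis.
    log_e = obs @ np.log(emit).T + (1 - obs) @ np.log(1 - emit).T  # shape (T, K)
    n_bins, n_states = log_e.shape
    score = np.log(start) + log_e[0]
    back = np.zeros((n_bins, n_states), dtype=int)
    for t in range(1, n_bins):
        cand = score[:, None] + np.log(trans)  # cand[i, j]: best path ending in i, then i -> j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_e[t]
    # Trace back from the best final state.
    path = [int(score.argmax())]
    for t in range(n_bins - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]


def to_segments(path, bin_size=200):
    """Merge consecutive bins with the same state into (start_bp, end_bp, state) tuples."""
    segments, seg_start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[seg_start]:
            segments.append((seg_start * bin_size, i * bin_size, path[seg_start]))
            seg_start = i
    return segments


# Invented toy parameters: state 0 is "quiescent", state 1 is "active".
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.05, 0.05], [0.9, 0.8]]
obs = [[0, 0], [0, 0], [1, 1], [1, 1], [1, 0], [0, 0]]

path = viterbi(obs, start, trans, emit)
print(path)               # [0, 0, 1, 1, 1, 0]
print(to_segments(path))  # [(0, 400, 0), (400, 1000, 1), (1000, 1200, 0)]
```

In this toy run the fifth bin carries only one of the two marks but is still assigned to the active state, because the transition probabilities favor staying in the current state. This is how such models produce contiguous segments rather than bin-by-bin calls. The numeric state labels carry no biological meaning by themselves; attaching names such as enhancer-like or promoter-like is the manual step that makes the procedure semi-automated.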