This directory contains the two sequences that should be used to align ENCODE sequence data to the GRCh37/hg19 reference genome. One sequence is for female DNA and the other is for male DNA. The only difference between them is that the male sequence includes chrY, with the PAR regions hard-masked (replaced with N's).

Both sequences are composed of all the autosomes plus the X chromosome from GRCh37. The male sequence also includes the chrY sequence from GRCh37. None of the random chromosomes, chrUn chromosomes, or haplotype chromosomes are included in either reference sequence.

The mitochondrial genome included in both references is the chrM sequence currently in use on the UCSC hg19 browser (NC_001807). This sequence is NOT considered to be the most representative of human. For a discussion of what has changed between the old and the new reference mitochondrial sequences, see:
http://mitomap.org/bin/view/MITOMAP/HumanMitoSeq

This directory also contains the ENCODE pilot regions lifted to hg19.
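As an illustration of what "hard-masked" means for the PAR regions, here is a minimal Python sketch. It is not the procedure used to build these files. The function name `hard_mask`, the toy sequence, and the toy intervals are all hypothetical; the real PAR coordinates on chrY are not given in this README and are not reproduced here.

```python
def hard_mask(sequence: str, regions: list[tuple[int, int]]) -> str:
    """Replace each 0-based, half-open [start, end) region with N's.

    The sequence length is unchanged, so coordinates outside the masked
    regions stay valid.
    """
    bases = list(sequence)
    for start, end in regions:
        if not 0 <= start <= end <= len(bases):
            raise ValueError(f"region ({start}, {end}) is out of bounds")
        bases[start:end] = "N" * (end - start)
    return "".join(bases)


# Toy stand-in for chrY with made-up "PAR" intervals at both ends
# (hypothetical values, for illustration only).
toy_chr_y = "ACGTACGTACGTACGTACGT"
toy_par_regions = [(0, 4), (16, 20)]

print(hard_mask(toy_chr_y, toy_par_regions))  # prints NNNNACGTACGTACGTNNNN
```

Because masking preserves length, the chrY entry in a chrom.sizes file would be the same with or without the mask.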
Name                          Last modified      Size   Description
Parent Directory                                 -
encodePilotRegions.hg19.bed   2011-03-07 08:39   1.3K
female.hg19.2bit              2010-01-27 13:04   754M
female.hg19.chrom.sizes       2010-03-10 11:45   362
female.hg19.fa.gz             2010-01-27 12:59   886M
femaleByChrom/                2010-04-09 16:18   -
male.hg19.2bit                2010-01-27 13:03   768M
male.hg19.chrom.sizes         2010-03-10 11:45   376
male.hg19.fa.gz               2010-01-27 12:56   893M
maleByChrom/                  2010-02-04 18:48   -
md5sum.txt                    2010-08-26 11:48   361
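
After downloading, the files can be checked against md5sum.txt. The Python sketch below is one possible way to do that. It assumes md5sum.txt uses the usual `md5sum` layout of one "<hex digest>  <file name>" pair per line, which this README does not confirm. The function names `md5_of_file` and `verify_md5sums` are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5sums(manifest, directory):
    """Map each file name in the manifest to True (digest matches) or False.

    Assumes each non-empty line is "<hex digest> <file name>". Files that
    are missing from the directory are reported as False.
    """
    results = {}
    for line in Path(manifest).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # md5sum prefixes names with '*' in binary mode
        target = Path(directory) / name
        results[name] = target.is_file() and md5_of_file(target) == expected.lower()
    return results
```

Calling `verify_md5sums("md5sum.txt", ".")` from the download directory would return a dictionary keyed by file name; any entry with a value of False points to a missing or corrupted download.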