ENCODE Project: Comparative Sequencing

The ENCODE Project is an NHGRI-led initiative that aims to identify all functional elements in the human genome. Its initial effort is a pilot-scale program focusing on 1% of the human genome, distributed across 44 discrete regions. A major component of the ENCODE Project is the comparative sequencing of these regions in multiple vertebrate species. In partnership with the Genome Technology Branch of NHGRI, NISC is extensively involved in generating these multi-species sequences and analyzing the resulting data. This page and those that follow summarize progress to date in mapping and sequencing the 44 ENCODE regions in various species. For further information, contact NISC@nhgri.nih.gov. For information about genome-wide DNA sequencing efforts, see the International Sequencing Consortium web page at http://www.intlgenome.org.
 
Overall progress in sequencing ENCODE targets, species by species, listed in the general order in which data generation is anticipated:

Species                          Total Sequenced   Multi-BAC   Current BAC   % Human      Species Progress
                                 BACs              Assembled   Gaps in       Sequence     by Target
                                                   Targets     Assembly      Represented
Galago (small-eared)             285/285 (100%)    44/44       11            96%          Data File
Baboon (olive)                   302/302 (100%)    44/44                     97%          Data File
Marmoset (white-tufted-ear)      275/275 (100%)    44/44                     99%          Data File
Armadillo (nine-banded)          386/386 (100%)    44/44       24            94%          Data File
Platypus                         230/236 (97%)     41/42       12            80%          Data File
Bat (greater horseshoe)          259/260 (100%)    44/44                     98%          Data File
Elephant (African)               397/398 (100%)    44/44       19            96%          Data File
Hedgehog (middle-African)        313/322 (97%)     42/44       22            82%          Data File
Rabbit                           234/234 (100%)    43/44                     91%          Data File
Shrew (European)                 290/296 (98%)     44/44                     91%          Data File
Dusky titi                       232/236 (98%)     44/44                     97%          Data File
Monkey (owl)                     234/236 (99%)     44/44                     98%          Data File
Monkey (colobus)                 213/216 (99%)     44/44                     97%          Data File
Lemur (gray mouse)               181/188 (96%)     44/44                     95%          Data File
Guinea Pig                       251/263 (95%)     44/44       12            95%          Data File
Cat                              282/284 (99%)     44/44                     97%          Data File
Tenrec (lesser hedgehog)         262/263 (100%)    44/44       13            94%          Data File
Bat (little brown)               189/198 (95%)     44/44       15            89%          Data File
Squirrel (13-lined ground)       247/254 (97%)     44/44       10            94%          Data File
Gibbon (northern white-cheeked)  227/229 (99%)     44/44                     98%          Data File
Monkey (Bolivian squirrel)       193/197 (98%)     44/44                     94%          Data File
Monkey (African green)           243/249 (98%)     44/44                     97%          Data File
Horse                            188/205 (92%)     43/43                     90%          Data File
Flying Fox (Malayan)             189/193 (98%)     43/44                     87%          Data File
Rock hyrax                       218/219 (100%)    44/44       14            84%          Data File
Shrew (northern tree)            246/252 (98%)     43/43       33            86%          Data File
Opossum (gray short-tailed)      346/355 (97%)     41/43       15            76%          Data File
Ferret (domestic)                145/159 (91%)     31/39       18            54%          Data File
Sloth (two-toed)                 300/318 (94%)     41/43       39            71%          Data File
Alpaca                           151/162 (93%)     44/44                     92%          Data File
Pika (American)                  134/200 (67%)     20/44                     38%          Data File
Download data file for all species 
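
The percentages in the "Total Sequenced BACs" column appear to be the sequenced/total ratio rounded to the nearest whole percent. The Python sketch below checks that reading against a few cells copied from the table. The helper names (parse_bac_cell, percent_sequenced) are hypothetical, and the rounding rule is an assumption inferred from the listed values, not something this page states.

```python
import re

# Assumed cell layout, taken from the table: "<sequenced>/<total> (<percent>%)".
CELL_PATTERN = re.compile(r"(\d+)/(\d+)\s*\((\d+)%\)")


def parse_bac_cell(cell):
    """Split a cell such as '259/260 (100%)' into (sequenced, total, stated_percent)."""
    match = CELL_PATTERN.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    sequenced, total, stated = (int(group) for group in match.groups())
    return sequenced, total, stated


def percent_sequenced(sequenced, total):
    """Whole-number percentage, rounded half up using integer arithmetic only."""
    return (200 * sequenced + total) // (2 * total)


# A few cells copied from the table (not the full data set).
SAMPLE_CELLS = {
    "Platypus": "230/236 (97%)",
    "Bat (greater horseshoe)": "259/260 (100%)",
    "Horse": "188/205 (92%)",
    "Pika (American)": "134/200 (67%)",
}

for species, cell in SAMPLE_CELLS.items():
    sequenced, total, stated = parse_bac_cell(cell)
    computed = percent_sequenced(sequenced, total)
    status = "matches" if computed == stated else "differs from"
    print(f"{species}: computed {computed}% {status} stated {stated}%")
```

If the rounding assumption holds, it would also explain why a row such as 259/260 is shown as 100% even though one BAC remains unsequenced.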

 
  Target-by-Target, Species-by-Species Sequencing Summary
    Summary of BAC Maps
    Summary of BAC Sequencing
    Summary of Multi-BAC Assemblies
 
  NHGRI Homepage for ENCODE project
  UCSC Homepage for ENCODE project

NISC Bioinformatics Team
Last modified: Fri Jan 14 11:51:55 EST 2005