Sequence Database Directory Layout

 

Detailed Directory Layout - BlastMachine and Linux Fileserver

The data layout is identical on the Linux fileserver and the BlastMachine, although some databases may be present on only one platform, as appropriate. This is an overview only; a more detailed representation, showing the actual filenames you need to perform a specific search, is given here.

Database Splitting

Some genome databases contain sequences larger than the BlastMachine can handle (the limit depends on the query sequence size). For these databases, a version is prepared with each sequence split into 100K fragments with 10K overlaps. The "unsplit/" and "100/" directories hold the unsplit and the 100K/10K split sequences, respectively.
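The splitting scheme can be illustrated with a short sketch. This is not the script used to build the "100/" directories. It assumes that "100K" means 100,000 sequence characters, that "10K overlaps" means each fragment shares its first 10,000 characters with the end of the previous fragment, and that the last fragment is simply shorter. The function name split_sequence is hypothetical.

```python
from typing import Iterator, Tuple

FRAGMENT_LEN = 100_000  # "100K" fragment size (assumed to mean 100,000 characters)
OVERLAP = 10_000        # "10K" overlap (assumed shared between neighbouring fragments)


def split_sequence(seq: str,
                   fragment_len: int = FRAGMENT_LEN,
                   overlap: int = OVERLAP) -> Iterator[Tuple[int, str]]:
    """Yield (start_offset, fragment) pairs that together cover seq.

    Hypothetical helper: consecutive fragments start
    fragment_len - overlap characters apart, so each one repeats the
    last `overlap` characters of the fragment before it.
    """
    if not 0 <= overlap < fragment_len:
        raise ValueError("overlap must be non-negative and smaller than fragment_len")
    if not seq:
        return
    step = fragment_len - overlap
    start = 0
    while True:
        yield start, seq[start:start + fragment_len]
        if start + fragment_len >= len(seq):
            break  # this fragment reached the end of the sequence
        start += step


if __name__ == "__main__":
    # Toy sequence; a real genome sequence would be read from a FASTA file.
    toy = "ACGT" * 62_500  # 250,000 characters
    for offset, fragment in split_sequence(toy):
        print(offset, len(fragment))
```

Keeping the start offset with each fragment makes it possible to map a hit in a fragment back to a position in the unsplit sequence by adding the offset to the hit position.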

Directory structure (last updated 12/02/2004):

  • ncbi/
  • embl/
  • tigr/
  • UniGene/
  • user/
  • projects/
  • genomes/
    • anopheles/
      • unsplit/
      • 100/
    • bacterial/
    • c_elegans/
    • ciona/
    • drosophila/
    • fugu/dgi/
    • human/
      • HumanRepeats
      • goldenPath_May2004/ (hg17)
        • 100/
    • mouse/
      • goldenPath_Feb2003/ (mm3)
        • 100/
      • ensembl/
        • MGSC_2002April11_V3
      • ncbi/
        • reads/ (June 2002)
    • rat/ (GoldenPath June 2003 - rn3)
    • tetraodon/
      • 2002_May/
      • reads/ (October 2002)
    • virus/
      • sars/ (GoldenPath April 2003 - sc1)
    • yeast/
    • zebra_fish/
      • unsplit/
      • 100/

 

Directory Layout - GeneMatcher2

The standard databases are accessed via dbsets, which hide the actual location and component data files of the databases. We maintain the dbsets. However, there are directories that users may need to directly access. These are the user and project directories, where users may place and use their own databases via the btk command. The paths to these directories are:

/fdf/genematcher/gm0/0/projects/

/fdf/genematcher/gm0/1/user/

The other top-level directories are:

/fdf/genematcher/gm0/0/genomes/

/fdf/genematcher/gm0/1/embl/
/fdf/genematcher/gm0/1/ncbi/
/fdf/genematcher/gm0/1/tigr/

Please contact us if you need to have a project or user directory created.
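For scripts that need these locations, the paths above can be collected in one lookup table. The following is a minimal sketch only. The directory paths are copied from the list above, but the function gm2_path and the example names "jdoe" and "my_db" are hypothetical, and nothing here calls or models the btk command.

```python
from pathlib import PurePosixPath

# Top-level GeneMatcher2 directories, exactly as listed in the text.
GM2_DIRS = {
    "projects": PurePosixPath("/fdf/genematcher/gm0/0/projects"),
    "genomes": PurePosixPath("/fdf/genematcher/gm0/0/genomes"),
    "user": PurePosixPath("/fdf/genematcher/gm0/1/user"),
    "embl": PurePosixPath("/fdf/genematcher/gm0/1/embl"),
    "ncbi": PurePosixPath("/fdf/genematcher/gm0/1/ncbi"),
    "tigr": PurePosixPath("/fdf/genematcher/gm0/1/tigr"),
}

# Areas where, per the text, users may place their own databases.
USER_AREAS = {"user", "projects"}


def gm2_path(top_level: str, *parts: str) -> PurePosixPath:
    """Return the path below a GeneMatcher2 top-level directory.

    Hypothetical helper: it only joins path components and does not
    check that the directory exists on the machine.
    """
    try:
        base = GM2_DIRS[top_level]
    except KeyError:
        raise ValueError(f"unknown top-level directory: {top_level!r}") from None
    return base.joinpath(*parts)


if __name__ == "__main__":
    # "jdoe" and "my_db" are placeholder names, not real directories.
    print(gm2_path("user", "jdoe", "my_db"))
    print("projects" in USER_AREAS)
```

Note that the top-level directories are spread over two parent directories (gm0/0 and gm0/1), so a table of this kind avoids hard-coding the wrong parent for a given database.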

 

 

 

 

 

 