GeneMatcher2 - Working with Databases, Queries and Sets

 

Updates to Facility Supported Public Databases

The Core Facility staff maintains a number of databases on the GeneMatcher2. Some of these are updated weekly, others as new versions are released (monthly, quarterly, etc.). Updates typically occur over the weekend. Previous copies of databases updated on a weekly schedule are not retained. At least one previous version of human genome assemblies will usually be maintained.

Requesting New Databases

If a publicly available database you need is not loaded on the GeneMatcher2, you may contact the Core Facility staff and request that it be loaded. The staff will verify that the database does not already exist on the machine, and load it into the proper directory. The staff can also load any custom database you may create.

Policies on Loading Your Own Databases onto the GeneMatcher2

Some investigators may wish to load database or query files onto the GeneMatcher2 themselves. Due to disk space limitations, please make sure that the database you wish to load does not already exist on the GeneMatcher2. Check the list of installed databases, or ask the Core Facility staff to check for you. At the conclusion of a project, user databases should be deleted from the GeneMatcher2. The staff can provide assistance as needed.

Removal of User Databases

User databases believed to be no longer needed may be removed by the Core Facility staff. In general, six weeks' advance notice will be provided by email or other means, and the database will be deleted if no response is received in that time.

Procedures for Loading Databases onto the GeneMatcher2

If you wish to load your own database or query files directly onto the GeneMatcher2, please first contact the support staff to have a data directory created for you on the machine. Then follow the procedures below to load the files.

Database and query files can be loaded onto the GeneMatcher2 either using a web interface or from the command line. The web interface is easier to use, but can only be used to load files of less than 100 MB. For loading larger files or to automate an operation, the "btk" command can be used from any of the AMDeC Core Facility UNIX hosts.
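The choice between the two loading methods comes down to file size, so it can be checked before you start. The following is only a minimal sketch: the function names are hypothetical, and it assumes that 100 MB means 100 x 10^6 bytes (the facility may count megabytes differently). It does not call "btk" or the web interface; it only reports which one applies.

```python
import os

# The 100 MB web-interface limit comes from the text above. Treating a
# megabyte as 10**6 bytes is an assumption made for this sketch.
WEB_UPLOAD_LIMIT_BYTES = 100 * 10**6


def choose_load_method(size_bytes: int) -> str:
    """Return "web" for files under the limit, otherwise "btk"."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return "web" if size_bytes < WEB_UPLOAD_LIMIT_BYTES else "btk"


def choose_load_method_for_file(path: str) -> str:
    """Look up the size of a local file and pick a loading method for it."""
    return choose_load_method(os.path.getsize(path))
```

For example, choose_load_method_for_file("my_queries.fasta") (a hypothetical file name) returns "web" for a small file and "btk" for one at or above the limit. Automated or repeated loads would use "btk" regardless of size.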

Multiple database files can be associated together in Database Sets, and multiple query files can be associated together into Query Sets. Database Sets and Query Sets have several important features in common and some differences from each other:

  1. Database Sets and Query Sets can be composed of any number of data files, even just one. However, when using the web interface, only a single query file can be used to create a Query Set. For more complex Query Sets, the command line "btk" interface must be used.
  2. Simple, globally available names can be given to Database Sets or Query Sets such as "nt" or "Pfam8.0". Please don't use an already existing name!
  3. By using the named data sets, you avoid having to specify a path to the actual database or query files when you run a search (paths on the GeneMatcher2 are long and complex!).
  4. Database Sets and Query Sets are visible in the search menus of the BioView Workbench web interface. Single database and query files are not visible there.
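Because set names are global (point 2 above), it is worth checking a proposed name against the names already in use before creating a set. The sketch below is illustrative only: the helper name is hypothetical, the list of existing names must be copied by hand from the list of installed databases, and the case-insensitive comparison is an assumption chosen to err on the side of caution.

```python
def is_name_available(proposed: str, existing_names) -> bool:
    """Return True if the proposed set name does not clash with an existing one.

    Names are compared case-insensitively after trimming whitespace; this is
    an assumption of the sketch, not documented GeneMatcher2 behavior.
    """
    candidate = proposed.strip().lower()
    if not candidate:
        raise ValueError("proposed name must not be empty")
    taken = {name.strip().lower() for name in existing_names}
    return candidate not in taken


# The two names below are the examples mentioned in the list above.
existing = ["nt", "Pfam8.0"]
print(is_name_available("NT", existing))
print(is_name_available("my_project_db", existing))
```

Here "NT" is reported as unavailable because it differs from "nt" only by case, while "my_project_db" (a hypothetical name) is reported as available.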

Detailed Instructions:

Loading Databases and Database Sets from the Web (includes Query Sets)

Loading Databases and Database Sets from the UNIX command line (instructions for Queries and Query Sets are analogous)
