Examples of Recent and Ongoing Projects at the AMDeC Bioinformatics Core Facility

1. EST Screening

A group used the Paracel TranscriptAssembler (PTA) software to cluster and assemble 35,000 sequences from more than 20,000 cDNA EST clones, producing about 11,000 non-redundant sequences. These sequences were then screened using BLASTN and TBLASTX on the BlastMachine against the nr (non-redundant protein) and nt (nucleotide) databases. The most informative results have come from TBLASTX searches against the nt database.


2. Mouse Genomic-Reads Screening for SNP Identification

Dr. Stuart Fischer of the Facility staff is working with a group to identify particular genes of interest in four strains of congenic mice using 1,000 transcripts in seven genetically defined intervals. To identify strain-specific DNA sequence variants, they BLAST the transcripts against the NCBI mouse genomic reads database (>41 million reads) and other sources to select genomic sequences corresponding to the complete transcript set. Using the BLASTN algorithm, this job takes 15 hours. The genomic sequences are re-assembled from the fragments using PGA, and strain-specific differences are revealed by Calypso (a component of the Paracel PGA package) and in-house software. Amino acid substitutions which effect potentially important structural changes are identified using the PrISM software package developed by Dr. An-Suei Yang of the Columbia Genome Center (CGC).
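The last comparison step, finding differences specific to one strain, can be illustrated with a minimal Python sketch. It is not the Calypso or in-house method used at the Facility. It assumes the re-assembled sequences for each strain have already been aligned to equal length, and the function name `strain_specific_variants` and the strain labels are hypothetical.

```python
from collections import Counter


def strain_specific_variants(aligned):
    """Report alignment columns where exactly one strain differs from the rest.

    aligned: dict mapping a (hypothetical) strain name to its aligned
    sequence; all sequences must have equal length, with '-' marking gaps.
    Returns a list of (position, strain, consensus_base, variant_base),
    with 0-based positions.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    strains = list(aligned)
    if len(strains) < 3:
        raise ValueError("need at least three strains to call a consensus")

    variants = []
    for pos in range(lengths.pop()):
        column = {s: aligned[s][pos] for s in strains}
        if "-" in column.values():
            continue  # skip gapped columns in this simple sketch
        counts = Counter(column.values())
        if len(counts) != 2:
            continue  # either fully conserved or more than two alleles
        (major, _), (minor, minor_n) = counts.most_common(2)
        if minor_n == 1:
            odd = next(s for s in strains if column[s] == minor)
            variants.append((pos, odd, major, minor))
    return variants


# Toy example with four made-up strains
example = {
    "strainA": "ACGTAC",
    "strainB": "ACGTAC",
    "strainC": "ACTTAC",
    "strainD": "ACGTAC",
}
print(strain_specific_variants(example))
```

A real pipeline would also have to handle sequencing errors, gaps, and read quality, which this sketch ignores.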

3. Legionella Genome Annotation

Dr. James Russo's lab in the CGC is sequencing the genome of the Legionella bacterium. Every few weeks all contigs and unassembled sequences are searched against various NCBI and local databases. Such a run can be completed overnight on the BlastMachine.

4. Epilepsy Gene Evolutionary Analysis

Dr. Pavel Morozov of the Facility staff collaborated with Dr. Ruth Ottman and with Drs. Conrad Gilliam and Sergey Kalachikov of the Columbia Genome Center to characterize a newly discovered gene family (LGI), one member of which (LGI1) causes a rare form of epilepsy. The BlastMachine and GeneMatcher2 (Smith-Waterman, HMMER) were used intensively to search for distant homologs. The BlastMachine was also used to compare transcribed sequences from genomic regions of about 10 Mb around the LGI family members.


5. Bacterial Enzyme Family Screening

A group developed a system based on extensive HMM searches (using HMMER on the GeneMatcher2) to find potential members of RNA-related enzyme families in a bacterial genome. They are running iterative HMM searches for many related enzyme families, RNA-binding domains, and other related sequences to identify family members, which entails thousands of HMM searches. The candidate sequences will then be tested experimentally for the expected enzymatic activity.

6. Anopheles gambiae (mosquito) Genome Analysis

Dr. Andrey Rzhetsky of the Columbia Genome Center analyzed four gene families and their exon/intron structure in the newly sequenced genome of Anopheles gambiae using the HMMER and GeneWise algorithms on the GeneMatcher2. The gene families studied were odorant receptors, serpins, Gram-negative bacteria-binding proteins (immunity), and ABC transporters.

7. Discovery of Genes Involved in Dermatological Disorders

Dr. Fischer is working with a group to discover genes involved in dermatological disorders. To identify candidate genes associated with a unique form of a particular syndrome, they have developed a software package with a graphical interface for comparing all known and presumptive genes in an interval on human chromosome 8q.

 

 