Links to Bioinformatics-Related Sites

 

Meetings

New York Computational Biology Group - Bimonthly meetings at the New York Academy of Sciences.

Upcoming meetings: May 21, 2003; July 16, 2003; Sept. 10, 2003; Nov. 12, 2003.

New York Structural Biology Group - Meets at Rockefeller University.

Manhattan College Spring 2003 Lecture Series: The Bioinformatic Analysis of Genes and their Products: From Primary Sequence to Protein Function. Open to the public.

 

Organizations

Bioinformatics.org (software page) - Specialized databases, software, news.

ISCB - International Society for Computational Biology.

Open Bioinformatics Foundation - umbrella group for various bio*.org projects.

 

Journals

ISCB list of Bioinformatics Journals

 

Bioinformatics Course Materials (External Sites)

ch.EMBnet.org Introduction to Bioinformatics 2003 - Course Materials (Swiss EMBnet node, Swiss Institute of Bioinformatics).

Pittsburgh Supercomputer Center Biomedical Initiative - Sequence Analysis Tutorial.

Terry Speed's Lab - Slide presentations on linkage analysis, sequence analysis, gene expression.

 

Genome Databases and Browsers

Ensembl - Human, Mouse, Rat, Zebrafish, Fugu, Mosquito, Fruit fly, C. elegans, C. briggsae.

EnsMart - (EBI) - "EnsMart is a powerful data mining toolset for retrieving customised data sets from annotated genomes, using criteria from Ensembl and third party databases".

"Integrates gene and protein annotation, disease information, expression data, sequence variation and cross-species analyses. Provides homology, SNPs affecting proteins, retrieval by external identifiers, retrieval by expression (controlled vocabulary), customised sequence datasets and microarray annotation tools".

NCBI Human Genome, Mouse Genome

NCBI - Other Genomes - Fruit fly, Malaria parasite, Microbial, Plants, Rat, Retroviruses, Zebrafish.

UCSC Genome Browser - Human, Mouse and Rat.

TIGR - Comprehensive Microbial Resource.

Wormbase - Caenorhabditis elegans (Other Model Organisms)

Rat Genome Database - includes comparative map of rat vs mouse vs human.

GALA - Genome Alignment and Annotation Database - Penn State University. This database incorporates information about genes, SNPs, alignments, disease association and gene expression levels. Data is from multiple sources such as GenBank, Ensembl, The Whitehead Institute, and The Human Genome Browser.

SGD - Saccharomyces Genome Database - Stanford University. Genome sequences of several species of yeast.

GO - Gene Ontology Consortium - "The goal of the Gene Ontology™ Consortium is to produce a dynamic controlled vocabulary that can be applied to all organisms even as knowledge of gene and protein roles in cells is accumulating and changing."

GOA - Gene Ontology Annotation - the GO vocabulary applied to a non-redundant set of proteins described in the Swiss-Prot, TrEMBL and Ensembl databases....

 

Genome-Genome Comparison

Twinscan - informant based gene prediction (Washington University).
Description from the TWINSCAN website:


"TWINSCAN is a gene prediction system that models both gene structure and evolutionary conservation. The scores of features like splice sites and coding regions are modified using the patterns of divergence between the target genome and a closely related genome.....Currently, we believe that TWINSCAN is the best available gene prediction program for mammalian genomes and for Cryptococcus neoformans. For Arabidopsis TWINSCAN and FGENESH have similar performance, but TWINSCAN is slightly better at getting gene boundaries right. Other programs, such as GENSCAN and GeneMark.HMM do not perform as well in our tests. For C. elegans, TWINSCAN and GENEFINDER have roughly similar performance".

Vista - Visualization Tool for Alignments. VISTA is a set of tools for comparative genomics. It was designed to visualize long sequence alignments of DNA from two or more species with annotation information.

MUMmer - MUMmer is a system for aligning entire genomes extremely rapidly.

PipMaker - Penn State Bioinformatics Group. PipMaker computes alignments of similar regions in two DNA sequences. The resulting alignments are summarized with a "percent identity plot", or "pip" for short. MultiPipMaker allows the user to see relationships among more than two sequences.

(Notes from the Genome Comparison section (using PipMaker) of the Cold Spring Harbor Workshop on Computational Genomics, November 2002).
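
To make the "percent identity plot" idea concrete, here is a minimal sketch in Python. It is not PipMaker's own code or alignment method. It assumes you already have a pairwise alignment as two equal-length strings with "-" marking gaps, and the function name gap_free_segments is hypothetical. It reports, for each gap-free stretch of the alignment, its start position in the first sequence, its length, and its percent identity, which are the quantities a pip draws.

```python
def gap_free_segments(aligned_a, aligned_b):
    """Return (start_in_a, length, percent_identity) for each gap-free run.

    aligned_a and aligned_b are rows of a pairwise alignment: equal-length
    strings in which '-' marks a gap. start_in_a is a 0-based position in
    the ungapped first sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    segments = []
    pos_a = 0            # current position in the ungapped first sequence
    seg_start = None     # start of the gap-free run being accumulated
    matches = length = 0

    for ca, cb in zip(aligned_a, aligned_b):
        if ca == "-" or cb == "-":
            # A gap closes the current run, if there is one.
            if length:
                segments.append((seg_start, length, 100.0 * matches / length))
            seg_start, matches, length = None, 0, 0
        else:
            if seg_start is None:
                seg_start = pos_a
            length += 1
            matches += ca.upper() == cb.upper()
        if ca != "-":
            pos_a += 1

    if length:
        segments.append((seg_start, length, 100.0 * matches / length))
    return segments


if __name__ == "__main__":
    for start, length, pid in gap_free_segments("ACGTAC-GG", "ACCTACTG-"):
        print(f"start={start} length={length} identity={pid:.1f}%")
```

Plotting each segment as a horizontal line from start to start + length, at a height equal to its percent identity, gives a rough picture of what a pip shows.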

Examining the Current Problems of Whole Genome Comparison: A Review by Patrick Chain, Stanford Univ.

ACT - ACT (Artemis Comparison Tool) is a DNA sequence comparison viewer based on Artemis. (Sanger Center).

Doublescan - Doublescan is a program for comparative ab initio prediction of protein coding genes in mouse and human DNA (Sanger Center).

 

 

 

More Databases

OMIM - Online Mendelian Inheritance in Man

PDB - Protein Data Bank

Allgenes.org - structured database of predicted genes in human and mouse.

HGVBase (Human Genome Variation Database) - Karolinska Institute - Curated mutation and SNP database. Also, links to many other SNP databases.

BIND - Biomolecular Interaction Network Database.

Links to Pathway and other Databases

The SNP Consortium

GDB - The Genome Database

Genetic Association Database - "The Genetic Association Database is an archive of human genetic association studies of complex diseases and disorders. The goal of this database is to allow the user to rapidly identify medically relevant polymorphism from the large volume of polymorphism and mutational data, in the context of standardized nomenclature".


TRANSFAC - The Transcription Factor Database. Other databases at this site:

TRANSPATH - database on gene-regulatory pathways
CYTOMER - database of physiological systems, organs and cell types
S/MARtDB - collects information about scaffold/matrix attached regions and the nuclear matrix proteins that are supposed to be involved in the interaction of these elements with the nuclear matrix. It covers the whole range from yeast to human.

 

Computational Resources

NCBI

TIGR

European Bioinformatics Institute - services page

Sanger Center
HGMP-RC
HGMP-RC Bioinformatics Services
Washington Univ. Genome Sequencing Center


EMBL Computational Services
SWISS-PROT/EBI

Swiss EMBnet (a node of the European Molecular Biology Network) - Provides web interfaces to a number of programs, including the very useful T-Coffee multiple sequence alignment program, with color coding of similarity in the output.

 


IBM Bioinformatics and Pattern Discovery Group - Server hosts many novel algorithms.

GeneLynx - Portal provides single webpage summary of publicly available information about human and mouse genes.

SOURCE - Stanford Online Universal Resource for Clones and ESTs. "...pools publicly available data commonly sought after for any clone, GenBank accession number, or gene. SOURCE was specifically designed to facilitate analysis of the large data sets that biologists can now produce using genome-scale experimental approaches".

BCM Search Launcher - web based sequence manipulation utilities (format conversion, repeat masking, 6 frame translation etc.).
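
As an illustration of one of the utilities mentioned above, the following Python sketch performs a 6 frame translation. It is not the BCM Search Launcher's code. It assumes the standard genetic code and a DNA sequence containing only A, C, G and T, and the function names are hypothetical. Any codon it cannot look up is written as "X", and stop codons as "*".

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; unknown codons give 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

The three forward frames start at offsets 0, 1 and 2 of the sequence, and the three reverse frames start at the same offsets of its reverse complement.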

BiBiServe - Bielefeld University Bioinformatics Server
bioinformatik.de - lots of links....
bioinformatik.de - Genome Projects

CGAP-GAI - Cancer Gene Anatomy Project - Genetic Annotation Initiative - Cancer SNP resources.

 

Publicly Available SRS Servers (with large numbers of databases)

List of Public SRS servers
HGMP (Hinxton)
EBI (Hinxton)
Sanger Center (Hinxton)
DKFZ (Heidelberg)

 

Protein Signature and Domain Database Search Tools

InterProScan - Search InterPro. InterPro provides an integrated view of the commonly used signature databases (PROSITE, Pfam, PRINTS etc (see full list)), and has an intuitive interface for text- and sequence-based searches.

PROSITE - a database of protein families and domains. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family (if any) a new sequence belongs.
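
To show how a PROSITE-style pattern can be matched against a sequence, here is a minimal Python sketch that rewrites a pattern as a regular expression. It is a simplified illustration, not PROSITE's own scanning software. It handles the common syntax elements (x for any residue, [..] for allowed residues, {..} for excluded residues, (n) or (n,m) for repetition, < and > as terminus anchors) and ignores rarer cases such as an anchor inside square brackets. The function name and the example sequence are hypothetical.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simple PROSITE-style pattern to a Python regex string."""
    regex = pattern.rstrip(".")          # patterns conventionally end with '.'
    regex = regex.replace("-", "")       # '-' only separates pattern elements
    regex = regex.replace("x", ".")      # x = any residue
    # {ABC} = any residue except A, B, C. Do this before rewriting (n).
    regex = regex.replace("{", "[^").replace("}", "]")
    # (n) or (n,m) = repetition counts.
    regex = regex.replace("(", "{").replace(")", "}")
    # < and > anchor the pattern to the N- or C-terminus.
    regex = regex.replace("<", "^").replace(">", "$")
    return regex


if __name__ == "__main__":
    # N-glycosylation site pattern, searched in a made-up sequence.
    regex = prosite_to_regex("N-{P}-[ST]-{P}.")
    print(regex)
    for match in re.finditer(regex, "MNGSANPTK"):
        print(match.start(), match.group())
```

Note that re.finditer reports only non-overlapping matches, so overlapping sites would need a lookahead-based search instead.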

Pfam - Search or browse the Pfam database. Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains and families. For each family in Pfam you can:

* Look at multiple alignments
* View protein domain architectures
* Examine species distribution
* Follow links to other databases
* View known protein structures

PRINTS - a compendium of protein fingerprints. A fingerprint is a group of conserved motifs used to characterise a protein family.

BLOCKS - Blocks are multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins (Blocks Searcher).

ProDom - ProDom is a comprehensive set of protein domain families automatically generated from the SWISS-PROT and TrEMBL sequence databases.

SMART - Simple Modular Architecture Research Tool (Protein domain search tool).

TIGRFAMS - TIGRFAMs are protein families based on Hidden Markov Models or HMMs.

ExPASy Molecular Biology Server - Expert Protein Analysis System (Swiss Inst. of Bioinformatics)

Superfamily - HMM library and genome assignments server - MRC LMB, Cambridge, UK

See also:

Swiss-Prot - The Swiss-Prot Protein Knowledgebase is a curated protein sequence database that provides a high level of annotation, a minimal level of redundancy and high level of integration with other databases.

TrEMBL - The TrEMBL database contains the translations of all coding sequences (CDS) present in the EMBL Nucleotide Sequence Database, which are not yet integrated into Swiss-Prot.

 

Protein Structure Prediction

PredictProtein - Secondary structure prediction, sequence analysis and fold recognition by prediction-based threading. At CUBIC (Burkhard Rost).

Jpred - Runs JNet algorithm for secondary structure prediction. Univ. of Dundee, U.K.

NPS@ - Network Protein Sequence Analysis - Can run a number of prediction programs and display a consensus prediction. IBCP, Lyon, France. See also collection of algorithm descriptions.

Protein Structure Prediction - list of public links at ALMA Bioinformatica.

 

Microarray Databases

NCBI - Gene Expression Omnibus - Microarray data repository.

EBI - microarray resources (ArrayExpress etc).

 

Microarray Analysis

Terry Speed's Lab (Berkeley) - Statistical considerations in microarray analysis.

Pathway Analysis for Microarray Data

GenMAPP - Gene Microarray Pathway Profiler (UCSF)

 

PC-based Free Genetic and Sequence Analysis Programs

Bio-Edit - "a biological sequence alignment editor written for Windows 95/98/NT. A rich, intuitive multiple document interface with many convenient features makes alignment, manipulation and viewing of sequences relatively quick and easy on your desktop computer".

MEGA - Molecular Evolutionary Genetics Analysis. Powerful desktop tool for building phylogenetic trees and many other genetic data analysis tasks.

 

 

Selected Bioinformatics and Computational Biology Sites at Columbia University:

Columbia Genome Center (contains links to numerous faculty and staff)
C2B2 - Center for Computational Biology and Bioinformatics (courses)
Columbia University Bioinformatics Center (Headed by Dr. Burkhard Rost)
Oncoinformatics Core at CPMC


Professor Dimitris Anastassiou (Electrical Engineering, CGC)
Dr. Harmen Bussemaker (Biology, CGC)
Dr. Andrea Califano (Dept. of Biomedical Informatics)
Professor Barry Honig (Biochem. and Biophysics, CGC)
Dr. Christina S. Leslie (Computer Science)
Dr. Paul Pavlidis (Medical Informatics, CGC)

Dr. Burkhard Rost (Biochem and Biophysics, CU Bioinformatics Center)
Dr. Andrey Rzhetsky (Medical Informatics, CGC)
Dr. Chris Wiggins (Applied Math)
Dr. An-Suei Yang (Pharmacology, CGC) (PrISM modeling system)

Computational Biology Group
Center for Biomolecular Simulation

Peisen Zhang (CGC) - Gene Network Online - Contains information on protein-protein interactions collected from multiple sources. It can be searched to find possible pathways linking two target genes, or to find genes which may interact with the target gene. HDL project website with links to locally written SNP and haplotype analysis software.

 

New York Bioinformatics Sites and People (just a small sampling)

Albert Einstein College of Medicine

Lab of Dr. Mark Chance

Buffalo Center of Excellence in Bioinformatics

Lab of Jeff Skolnick

 

Cold Spring Harbor Laboratory

Lab of Dr. Lincoln Stein
Lab of Dr. Michael Zhang
Hazen Genome Sequencing Center

Memorial Sloan Kettering Cancer Center

Computational Biology Center

Mount Sinai School of Medicine

Dr. Fabien Campagne, Institute for Computational Biomedicine
Lab of Angel Ortiz
Lab of Dr. Roberto Sanchez

New York University

Lab of Bud Mishra (also Cold Spring Harbor)

New York University School of Medicine

Dr. Stuart Brown, RCR

Rockefeller University

Lab of Dr. Terry Gaasterland

SUNY Stony Brook

Lab of Dr. Moises Eisenberg
Lab of Ilya Vakser - Molecular Modeling and Structural Bioinformatics

Wadsworth Center - New York State Dept. of Health

Bioinformatics Institute - (with RPI)
Genomics Institute -

 

Weill-Cornell Medical College

Lab of Dr. Diana Murray
Computational Genomics Core Facility


Cornell University

Lab of Dr. Ron Elber at Cornell (Protein Modeling and Parallel Computation)
Computational Biology Program at Cornell
Computational Biology Tools at the Cornell Theory Center (NIH "Parallel Processing Resource for Biomedical Scientists")



Other

New York Structural Genomics Research Consortium
New York Biotechnology Association - a trade group.

Lab of Dr. Andrej Sali (Multiple structural databases and modeling programs).

 

AMDeC Related Sites

Center for Computational Research, University of Buffalo
NYSERNet - Columbia's connection to the Internet and Internet2
Internet2 - Members
Internet2 - Abilene Network (NYSERNet is on Abilene)
Internet2 - Abilene Network NOC - Maps and Participant lists

 

Resources

TIGR - software (Microarray, Assembler, Gene Finding, Annotation etc)

CMS-MBR - Large list of Bioinformatics links.

Northeast Structural Genomics Consortium

SNP Consortium Linkage Map Project

Phylogeny Program List at University of Washington Dept. of Genetics.

Stanford Genomic Resources - genomes, microarray datasets and more.

Whitehead Institute Center for Genome Research - Genome Databases, software.

Discontiguous MegaBlast - NCBI - more sensitive untranslated DNA searches across species.

 

Individual Lab Sequence Analysis Sites

BioProspector (Stanford)

TAP - Transcript Assembly Program (Washington Univ.)

Lab of David States, Univ. of Michigan

 

Cluster and Parallel Computing

NPACI - ROCKS - easily deploy and maintain clusters
Condor - workload management system for serial or parallel jobs, including Beowulf type clusters. Supports "Grid-style" computing.
Scyld Beowulf Cluster Operating System
Parallel Programming Laboratory - at University of Illinois Urbana-Champaign

CACR Parallel Software Resources List
CACR Parallel Molecular Dynamics Algorithms - Sandia National Labs

Computational Infrastructure Projects

The Globus Project - to enable "integrated, collaborative use of high-end computers, networks, databases, and scientific instruments owned and managed by multiple organizations.....often require secure resource sharing...."

Web Services Activity - includes SOAP (Simple Object Access Protocol).
Apache SOAP implementation
Globus Grid Web Services - The Open Grid Services Architecture (OGSA) is a proposed evolution of the current Globus Toolkit towards a Grid system architecture based on an integration of Grid and Web services concepts and technologies.

 

Supercomputing Centers

PSC Biomedical Initiative - Supercomputing for Biomedical Applications at the Pittsburgh Supercomputer Center
SDSC - San Diego Supercomputer Center

NBCR - National Biomedical Computation Resource
NPACI - National Partnership for Advanced Computational Infrastructure


NCSA - National Center for Supercomputing Applications, University of Illinois
ABCC - Advanced Biomedical Computing Center, NCI, Frederick, MD
Center for Computational Research, University of Buffalo
North Carolina Supercomputer Center
Ohio Supercomputer Center
National HPCC Software Exchange - list of other HPCC sites

 

Examples of other Computational Service Centers and Consortia

Oxford University Bioinformatics Centre
CSC - Finland
North Carolina Bioinformatics and Genomics Consortium
CAIP - Center for Advanced Information Processing, Rutgers University
Computational Biology Service Unit - Cornell Theory Center

CACR - Center for Advanced Computing Research, CalTech
KISAC - Karolinska Institute Sequence Analysis Computer
BimCore - Emory University

Large-Scale Data Integration

LSID - Life Sciences Identifier. IBM reference implementation of a standard developed by the Interoperable Informatics Infrastructure Consortium (I3C). Facilitates data interoperability between sites and applications, using Web Services.

CABIO - cancer Bioinformatics Infrastructure Objects (NCI/NIH)

 

Maps

New York City Subway Map
CPMC aerial photo
Regional Roadmap
Russ Berrie Building sitemap
Major BioResearch Facilities in NYC Metropolitan Area

 

 