Journals
ICSB
List of Bioinformatics Journals
Bioinformatics Course Materials
(External Sites)
ch.EMBnet.org
Introduction to Bioinformatics 2003 - Course Materials
(Swiss EMBnet node, Swiss Institute of Bioinformatics).
Pittsburgh Supercomputer Center Biomedical Initiative
- Sequence
Analysis Tutorial.
Terry
Speed's Lab - Slide presentations on linkage
analysis, sequence analysis, gene expression.
Genome Databases and Browsers
Ensembl
- Human, Mouse,
Rat, Zebrafish, Fugu, Mosquito, Fruit fly, C.
elegans, C. briggsae.
EnsMart
- (EBI) - "EnsMart is a powerful data mining
toolset for retrieving customised data sets from
annotated genomes, using criteria from Ensembl and
third party databases".
"Integrates gene and protein annotation, disease
information, expression data, sequence variation
and cross-species analyses. Provides homology, SNPs
affecting proteins, retrieval by external identifiers,
retrieval by expression (controlled vocabulary),
customised sequence datasets and microarray annotation
tools".
NCBI
Human Genome,
Mouse Genome
NCBI
- Other Genomes - Fruit fly, Malaria parasite,
Microbial, Plants, Rat, Retroviruses, Zebrafish.
UCSC
Genome Browser - Human, Mouse and Rat.
TIGR - Comprehensive
Microbial Resource.
Wormbase
- Caenorhabditis elegans (Other
Model Organisms)
Rat
Genome Database - includes comparative map of
rat vs mouse vs human.
GALA
- Genome Alignment and Annotation Database -
Penn State University. This database incorporates
information about genes, SNPs, alignments, disease
association and gene expression levels. Data is
from multiple sources such as GenBank, Ensembl,
The Whitehead Institute, and The Human Genome Browser.
SGD
- Saccharomyces
Genome Database - Stanford University. Genome
sequences of several species of yeast.
GO
- Gene Ontology Consortium. - "The goal
of the Gene Ontology™ Consortium is to produce
a dynamic controlled vocabulary that can be applied
to all organisms even as knowledge of gene and protein
roles in cells is accumulating and changing."
GOA
- Gene Ontology Annotation - the GO vocabulary
applied to a non-redundant set of proteins described
in the Swiss-Prot, TrEMBL and Ensembl databases....
Genome-Genome Comparison
Twinscan
- informant based gene prediction (Washington University).
Description from the TWINSCAN website:
"TWINSCAN is a gene prediction system that
models both gene structure and evolutionary conservation.
The scores of features like splice sites and coding
regions are modified using the patterns of divergence
between the target genome and a closely related
genome.....Currently, we believe that TWINSCAN
is the best available gene prediction program
for mammalian genomes and for Cryptococcus neoformans.
For Arabidopsis TWINSCAN and FGENESH have similar
performance, but TWINSCAN is slightly better at
getting gene boundaries right. Other programs,
such as GENSCAN and GeneMark.HMM do not perform
as well in our tests. For C. elegans, TWINSCAN
and GENEFINDER have roughly similar performance".
Vista
- Visualization Tool for Alignments. VISTA is a
set of tools for comparative genomics. It was designed
to visualize long sequence alignments of DNA from
two or more species with annotation information.
MUMmer
- MUMmer is a system for aligning entire genomes
extremely rapidly.
Pipmaker
- Penn State Bioinformatics
Group. PipMaker computes alignments of similar
regions in two DNA sequences. The resulting alignments
are summarized with a "percent identity plot",
or "pip" for short. MultiPipMaker allows the user
to see relationships among more than two sequences.
(Notes
from the Genome Comparison section (using PipMaker)
of the Cold Spring Harbor Workshop
on Computational Genomics, November 2002).
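The idea behind a percent identity plot can be illustrated with a short Python sketch. It is not PipMaker's own code: it assumes the pairwise alignment is already available as two equal-length gapped strings, and the function name `gap_free_segments` is made up for this example. For each gap-free run of aligned columns it reports where the run starts in the first sequence, how long it is, and its percent identity, which are the quantities a pip would draw.

```python
def gap_free_segments(aligned_a, aligned_b, gap="-"):
    """Return (start_in_a, length, percent_identity) for each gap-free run.

    aligned_a and aligned_b are the two rows of a pairwise alignment
    (hypothetical input format: equal-length strings, '-' for gaps).
    start_in_a is a 0-based position in the ungapped first sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    segments = []
    pos_a = 0            # position in the ungapped first sequence
    seg_start = 0
    length = matches = 0

    for ca, cb in zip(aligned_a, aligned_b):
        if ca != gap and cb != gap:
            if length == 0:
                seg_start = pos_a
            length += 1
            matches += ca.upper() == cb.upper()
        elif length:
            # A gap in either row closes the current segment.
            segments.append((seg_start, length, 100.0 * matches / length))
            length = matches = 0
        if ca != gap:
            pos_a += 1

    if length:
        segments.append((seg_start, length, 100.0 * matches / length))
    return segments


if __name__ == "__main__":
    for start, size, pct in gap_free_segments("ACGT-ACGTA", "ACGAAAC-TA"):
        print(f"start={start} length={size} identity={pct:.1f}%")
```

Plotting each segment as a horizontal line at its percent identity, against its position in the first sequence, gives the basic shape of a pip.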
Examining
the Current Problems of Whole Genome Comparison:
A Review by Patrick
Chain, Stanford Univ.
ACT
- ACT (Artemis Comparison Tool) is a DNA sequence
comparison viewer based on Artemis.
(Sanger Center).
Doublescan
- Doublescan is a program for comparative ab initio
prediction of protein coding genes in mouse and
human DNA (Sanger Center).
More Databases
OMIM
- Online Mendelian Inheritance in Man
PDB
- Protein Data Bank
Allgenes.org
- structured database of predicted genes in human
and mouse.
HGVBase (Human Genome Variation Database) - Karolinska Institute - Curated mutation
and
SNP database. Also, links to many other SNP databases.
BIND
- Biomolecular Interaction Network Database.
Links
to Pathway and other Databases
The SNP Consortium
GDB
- The Genome Database
Genetic
Association Database - "The Genetic Association
Database is an archive of human genetic association
studies of complex diseases and disorders. The goal
of this database is to allow the user to rapidly
identify medically relevant polymorphism from the
large volume of polymorphism and mutational data,
in the context of standardized nomenclature".
TRANSFAC
- The Transcription Factor Database. Other databases
at this site:
TRANSPATH - database on gene-regulatory
pathways
CYTOMER - database of physiological systems, organs
and cell types
S/MARtDB - collects information about scaffold/matrix
attached regions and the nuclear matrix proteins
that are supposed to be involved in the interaction
of these elements with the nuclear matrix. It
covers the whole range from yeast to human.
Computational Resources
NCBI
TIGR
European Bioinformatics
Institute - services
page
Sanger Center
HGMP-RC
HGMP-RC
Bioinformatics Services
Washington Univ.
Genome Sequencing Center
EMBL
Computational Services
SWISS-PROT/EBI
Swiss
EMBnet (a node of the European
Molecular Biology Network) -
Provides web interfaces
to a number of programs, including the very useful
T-Coffee
multiple sequence alignment program, with
color coding of similarity in the output.
IBM
Bioinformatics and Pattern Discovery Group -
Server hosts many novel algorithms.
GeneLynx
- Portal provides single webpage summary of publicly
available information about human and mouse genes.
SOURCE
- Stanford Online Universal Resource for Clones
and ESTs. "...pools publicly available data
commonly sought after for any clone, GenBank accession
number, or gene. SOURCE was specifically designed
to facilitate analysis of the large data sets that
biologists can now produce using genome-scale experimental
approaches".
BCM
Search Launcher - web based sequence manipulation
utilities (format conversion, repeat masking, 6
frame translation etc.).
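As an illustration of one of the utilities mentioned above, here is a minimal Python sketch of a 6-frame translation. It is not the BCM Search Launcher implementation. It assumes the standard genetic code, plain A/C/G/T input, and the function names used here are chosen only for this example. Codons containing any other character are reported as "X".

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement a DNA sequence given as A/C/G/T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in sorted(six_frame_translation("ATGGCCTAA").items()):
        print(frame, protein)
```

Organisms or organelles with a different genetic code would need a different `AMINO` string.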
BiBiServe
- Bielefeld University Bioinformatics Server
bioinformatik.de
- lots of links....
bioinformatik.de
- Genome Projects
CGAP-GAI -
Cancer Gene Anatomy Project - Genetic Annotation
Initiative - Cancer SNP resources.
Publicly Available SRS Servers (with
large numbers of databases)
List
of Public SRS servers
HGMP (Hinxton)
EBI
(Hinxton)
Sanger
Center (Hinxton)
DKFZ
(Heidelberg)
Protein Signature and Domain Database
Search Tools
InterProScan
- Search InterPro. InterPro provides an integrated
view of the commonly used signature databases (PROSITE,
Pfam, PRINTS etc (see
full list)), and has an intuitive interface
for text- and sequence-based searches.
PROSITE
- a database of protein families and domains. It
consists of biologically significant sites, patterns
and profiles that help to reliably identify to which
known protein family (if any) a new sequence belongs.
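PROSITE patterns are written in a compact notation (elements separated by "-", "x" for any residue, "[..]" for allowed residues, "{..}" for excluded residues, "(n)" or "(n,m)" for repeats, "<" and ">" for the sequence ends). The Python sketch below converts the common forms of that notation into a regular expression. It is an illustration only, not the PROSITE scanning software, and the function name `prosite_to_regex` is made up for this example. Less common forms, such as an anchor inside square brackets, are not handled.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")   # patterns often end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):          # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):            # anchored at the C-terminus
            suffix, element = "$", element[:-1]

        # Split off an optional repeat count such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)

        if core == "x":                      # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # any residue except these
        # "[ABC]" and single residues are already valid regex fragments.

        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


if __name__ == "__main__":
    # N-glycosylation site pattern (PROSITE PS00001).
    regex = prosite_to_regex("N-{P}-[ST]-{P}")
    print(regex)
    print(re.search(regex, "MKNGSAQ"))
```

The test sequence "MKNGSAQ" is an invented string, not a real protein.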
Pfam
- Search or browse the Pfam database. Pfam is a
large collection of multiple sequence alignments
and hidden Markov models covering many common protein
domains and families. For each family in Pfam you
can:
* Look at multiple alignments
* View protein domain architectures
* Examine species distribution
* Follow links to other databases
* View known protein structures
PRINTS
- a compendium of protein fingerprints. A fingerprint
is a group of conserved motifs used to characterise
a protein family.
BLOCKS
- Blocks are multiply aligned ungapped segments
corresponding to the most highly conserved regions
of proteins (Blocks
Searcher).
ProDom
- ProDom is a comprehensive set of protein domain
families automatically generated from the SWISS-PROT
and TrEMBL sequence databases
SMART
- Simple Modular Architecture Research Tool (Protein
domain search tool).
TIGRFAMS
- TIGRFAMs are protein families based on Hidden
Markov Models or HMMs.
ExPASy
Molecular Biology Server - Expert Protein Analysis
System (Swiss Inst. of Bioinformatics)
Superfamily
- HMM library and genome assignments server - MRC
LMB, Cambridge, UK
See also:
Swiss-Prot
- The Swiss-Prot Protein Knowledgebase is a curated
protein sequence database that provides a high level
of annotation, a minimal level of redundancy and
a high level of integration with other databases.
TrEMBL
- The TrEMBL database contains the translations
of all coding sequences (CDS) present in the EMBL
Nucleotide Sequence Database, which are not yet
integrated into Swiss-Prot.
Protein Structure Prediction
PredictProtein
- Secondary structure prediction, sequence analysis
and fold recognition by prediction-based threading.
At CUBIC
(Burkhard Rost).
Jpred
- Runs JNet algorithm for secondary structure prediction.
Univ. of Dundee, U.K.
NPS@
- Network Protein Sequence Analysis - Can run a
number of prediction programs and display a consensus
prediction. IBCP, Lyon, France. See also collection
of algorithm
descriptions.
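A consensus of several secondary structure predictions can be pictured as a per-residue majority vote. The Python sketch below shows that idea only. It is not the method used by any of the servers listed above, and it assumes each prediction is a string of the same length using one letter per residue (for example H, E, C). Positions where the top vote is tied are marked with "?".

```python
from collections import Counter


def consensus_prediction(predictions, tie="?"):
    """Return the per-position majority vote over equal-length predictions."""
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("need at least one prediction, all of equal length")

    consensus = []
    for column in zip(*predictions):
        # most_common() sorts states by vote count, highest first.
        (top, votes), *rest = Counter(column).most_common()
        tied = bool(rest) and rest[0][1] == votes
        consensus.append(tie if tied else top)
    return "".join(consensus)


if __name__ == "__main__":
    # Three invented predictions for a five-residue stretch.
    print(consensus_prediction(["HHHEC", "HHCEC", "HECEE"]))
```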
Protein
Structure Prediction - list of public links
at ALMA Bioinformatica.
Microarray Databases
NCBI
- Gene Expression Omnibus
- Microarray data repository.
EBI
- microarray resources
(ArrayExpress etc).
Microarray Analysis
Terry
Speed's Lab (Berkeley)
- Statistical considerations in microarray analysis.
Pathway Analysis for Microarray
Data
GenMAPP
- Gene Microarray Pathway Profiler (UCSF)
PC-based Free Genetic and Sequence
Analysis Programs
Bio-Edit
- "a biological sequence alignment editor written
for Windows 95/98/NT. A rich, intuitive multiple
document interface with many convenient features
makes alignment, manipulation and viewing of sequences
relatively quick and easy on your desktop computer".
MEGA
- Molecular Evolutionary Genetics Analysis. Powerful
desktop tool for building phylogenetic trees and
many other genetic data analysis tasks.
Selected Bioinformatics
and Computational Biology Sites at Columbia University:
Columbia
Genome Center (contains links to numerous faculty
and staff)
C2B2
- Center for Computational Biology and Bioinformatics
(courses)
Columbia
University Bioinformatics Center (Headed by
Dr.
Burkhard Rost)
Oncoinformatics
Core at CPMC
Professor
Dimitris Anastassiou (Electrical Engineering,
CGC)
Dr.
Harmen Bussemaker (Biology, CGC)
Dr. Andrea Califano (Dept. of Biomedical Informatics)
Professor
Barry Honig (Biochem. and Biophysics, CGC)
Dr.
Christina S. Leslie (Computer Science)
Dr.
Paul Pavlidis (Medical Informatics, CGC)
Dr.
Burkhard Rost (Biochem and Biophysics, CU Bioinformatics
Center)
Dr.
Andrey Rzhetsky (Medical Informatics, CGC)
Dr. Chris
Wiggins (Applied Math)
Dr.
An-Suei Yang (Pharmacology, CGC) (PrISM
modeling system)
Computational
Biology Group
Center
for Biomolecular Simulation
Peisen Zhang (CGC) - Gene
Network Online - Contains information on protein-protein
interactions collected from multiple sources. It
can be searched to find possible pathways linking
two target genes, or to find genes which may interact
with the target gene.
HDL
project website with links to locally written
SNP and haplotype analysis software.
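Searching an interaction database for a pathway linking two genes, as described for Gene Network Online above, can be pictured as a shortest-path search on a graph. The Python sketch below shows a breadth-first search over undirected interaction pairs. It is an illustration only, not the site's actual method, and the gene names in the usage example are placeholders.

```python
from collections import deque


def shortest_interaction_path(interactions, source, target):
    """Return the shortest chain of genes linking source to target, or None.

    interactions is an iterable of (gene_a, gene_b) pairs, treated as
    undirected edges.
    """
    graph = {}
    for a, b in interactions:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if source not in graph or target not in graph:
        return None

    previous = {source: None}     # also serves as the visited set
    queue = deque([source])
    while queue:
        gene = queue.popleft()
        if gene == target:
            path = []
            while gene is not None:   # walk back to the source
                path.append(gene)
                gene = previous[gene]
            return path[::-1]
        for neighbour in sorted(graph[gene]):
            if neighbour not in previous:
                previous[neighbour] = gene
                queue.append(neighbour)
    return None


if __name__ == "__main__":
    pairs = [("geneA", "geneB"), ("geneB", "geneC"), ("geneC", "geneD"),
             ("geneA", "geneE"), ("geneE", "geneD"), ("geneX", "geneY")]
    print(shortest_interaction_path(pairs, "geneA", "geneD"))
```

The direct neighbours of a gene in the same graph correspond to the second kind of query mentioned above, genes which may interact with the target gene.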
New York Bioinformatics Sites
and People (just a small sampling)
Albert Einstein College of Medicine
Lab
of Dr. Mark Chance
Buffalo Center of Excellence in
Bioinformatics
Lab
of Jeff Skolnick
Cold Spring Harbor Laboratory
Lab
of Dr. Lincoln Stein
Lab
of Dr. Michael Zhang
Hazen
Genome Sequencing Center
Memorial Sloan Kettering Cancer
Center
Computational
Biology Center
Mount Sinai School of Medicine
Dr. Fabien Campagne, Institute
for Computational Biomedicine
Lab
of Angel Ortiz
Lab
of Dr. Roberto Sanchez
New York University
Lab
of Bud Mishra (also Cold Spring Harbor)
New York University School of
Medicine
Dr.
Stuart Brown, RCR
Rockefeller University
Lab
of Dr. Terry Gaasterland
SUNY Stony Brook
Lab
of Dr. Moises Eisenberg
Lab of
Ilya Vakser Molecular Modeling and Structural
Bioinformatics
Wadsworth
Center - New York State Dept. of Health
Bioinformatics
Institute - (with RPI)
Genomics
Institute -
Weill-Cornell Medical College
Lab
of Dr. Diana Murray
Computational
Genomics Core Facility
Cornell University
Lab
of Dr. Ron Elber at Cornell (Protein Modeling
and Parallel Computation)
Computational
Biology Program at Cornell
Computational
Biology Tools at the Cornell Theory Center
(NIH "Parallel Processing Resource for Biomedical
Scientists")
Other
New
York Structural Genomics Research Consortium
New
York Biotechnology Association - a trade group.
Lab of
Dr. Andrej Sali (Multiple structural databases
and modeling programs).
AMDeC Related Sites
Center
for Computational Research, University of Buffalo
NYSERNet
- Columbia's connection to the Internet and Internet2
Internet2
- Members
Internet2
- Abilene Network (NYSERNet is on Abilene)
Internet2
- Abilene Network NOC - Maps and Participant lists
Resources
TIGR
- software (Microarray, Assembler, Gene Finding,
Annotation etc)
CMS-MBR
- Large list of Bioinformatics links.
Northeast
Structural Genomics Consortium
SNP
Consortium Linkage Map Project
Phylogeny
Program List at
University of Washington Dept. of Genetics.
Stanford
Genomic Resources - genomes, microarray datasets
and more.
Whitehead
Institute Center for Genome Research - Genome
Databases, software.
Discontiguous
MegaBlast - NCBI - more sensitive untranslated
DNA searches across species.
Individual
Lab Sequence Analysis Sites
BioProspector
(Stanford)
TAP
- Transcript Assembly Program (Washington Univ.)
Lab
of David States, Univ. of Michigan
Cluster and Parallel Computing
NPACI - ROCKS
- easily deploy and maintain clusters
Condor
- workload management system for serial or parallel
jobs, including Beowulf type clusters. Supports
"Grid-style" computing.
Scyld
Beowulf Cluster Operating System
Parallel
Programming Laboratory - at University of Illinois
Urbana-Champaign
CACR Parallel
Software Resources List
CACR Parallel
Molecular Dynamics Algorithms - Sandia National
Labs
Computational Infrastructure Projects
The
Globus Project - to enable "integrated,
collaborative use of high-end computers, networks,
databases, and scientific instruments owned and
managed by multiple organizations.....often require
secure resource sharing...."
Web
Services Activity - includes SOAP (Simple Object
Access Protocol).
Apache
SOAP implementation
Globus
Grid Web Services - The Open Grid Services Architecture
(OGSA) is a proposed evolution of the current Globus
Toolkit towards a Grid system architecture based
on an integration of Grid and Web services concepts
and technologies.
Supercomputing Centers
PSC
Biomedical Initiative - Supercomputing for Biomedical
Applications at the
Pittsburgh Supercomputer
Center
SDSC - San Diego
Supercomputer Center
NBCR
- National Biomedical Computation Resource
NPACI - National
Partnership for Advanced Computational Infrastructure
NCSA - National
Center for Supercomputing Applications, University
of Illinois
ABCC
- Advanced Biomedical Computing Center, NCI, Frederick,
MD
Center for
Computational Research, University of Buffalo
North Carolina Supercomputer
Center
Ohio Supercomputer
Center
National
HPCC Software Exchange - list
of other HPCC sites
Examples of other Computational
Service Centers and Consortia
Oxford
University Bioinformatics Centre
CSC
- Finland
North Carolina Bioinformatics
and Genomics Consortium
CAIP
- Center for Advanced Information Processing, Rutgers
University
Computational
Biology Service Unit - Cornell
Theory Center
CACR
- Center for Advanced Computing Research, CalTech
KISAC - Karolinska
Institute Sequence Analysis Computer
BimCore
- Emory University
Large-Scale Data Integration
LSID
- Life Sciences Identifier. IBM reference implementation
of a standard developed by the Interoperable Informatics
Infrastructure Consortium (I3C). Facilitates data
interoperability between sites and applications,
using Web Services.
CABIO
- cancer Bioinformatics Infrastructure Objects (NCI/NIH)
Maps
New
York City Subway Map
CPMC
aerial photo
Regional
Roadmap
Russ
Berrie Building sitemap
Major
BioResearch Facilities in NYC Metropolitan Area