Bioinformatics Shared Resources

Sanford Burnham Prebys Medical Discovery Institute


MICROARRAY RESOURCES

GeneSpring demo is available here:

Here is a link to a well-researched article discussing the design and analysis of microarray experiments: Microarray Guide

Download presentation from Roy Williams' Feb 7th 2007 Burnham BSR seminar: Microarrays Overview

The shared in-house service for processing Illumina microarrays is here: Microarray Facility

AffyComp II software: http://affycomp.biostat.jhsph.edu
A free online microarray analysis course from the University of Alabama at Birmingham: http://www.soph.uab.edu/ssg_content.asp?id=1410
ArrayExpress microarray data repository: http://www.ebi.ac.uk/arrayexpress
Bioconductor open source software for bioinformatics: http://www.bioconductor.org

Cyber-T statistics program: http://visitor.ics.uci.edu/genex/cybert/index.shtml

ermineJ, Gene Ontology analysis for microarray data: http://microarray.genomecenter.columbia.edu/ermineJ

Gene Expression Omnibus data repository:  www.ncbi.nlm.nih.gov/geo

Gene Ontology Database: www.geneontology.org

HDBStat! High Dimension Biology Statistical analysis software:  http://www.soph.uab.edu/ssg_content.asp?id=1164

MAANOVA 2.0 software:  http://www.jax.org/staff/churchill/labsite/software/anova

PowerAtlas software:  www.poweratlas.org

Stanford MicroArray Database:  http://genome-www5.stanford.edu


Current popular techniques (and some not so popular): 
  1. GeneSpring: Shared Resources can support your GeneSpring needs via remote desktop.
  2. NextBio: a cutting-edge tool for very large-scale data analysis. For example, from a paper using it: "The sheer volume of large-scale information across multiple types of cancer at different stages of tumor development provides an unprecedented scientific opportunity and at the same time - a daunting challenge for researchers. In this paper we demonstrate the use of NextBio to study cancer across different stages of tumor progression in order to identify biomarkers of tumorigenesis..."
  3. Gene Set Enrichment Analysis (GSEA): a tool for detecting pathways that are differentially expressed between samples. The application automatically creates a zipped results package of graphs, lists and plots. User-friendly and well executed.
  4. Non-negative matrix factorisation (NMF): a powerful clustering algorithm that is also used for image recognition. It is well suited to categorizing cell lines or tumors on the basis of gene expression data. The software comes as free modules for the R-based Bioconductor, or as a plugin for the free package GenePattern. GenePattern is itself a useful, web-deployable package for data analysis, building analysis pipelines and worldwide collaboration.
  5. All the data normalisation tools available in R and Bioconductor (about 6 or 7; linear and non-linear). Normalisation results can now be quality controlled using the package maCorrPlot, which checks the normalised data for correlations between the expression patterns of randomly picked gene pairs (there should be essentially none). People like maCorrPlot because it gives a powerful overview of data processing.
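
The check described in item 5 can be illustrated with a small sketch. The Python code below is not the maCorrPlot package (which is an R package) and does not reproduce its plots; it only illustrates the underlying idea, assuming that expression data is held as one list of values per gene. The function names, the synthetic data and the simulated array effect are all hypothetical and chosen for illustration only.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: treat the correlation as zero
    return sxy / sqrt(sxx * syy)


def random_pair_correlations(expr, n_pairs=1000, seed=0):
    """Correlations between randomly picked pairs of gene rows.

    expr is a list of genes; each gene is a list of expression
    values, one per array (an assumed layout for this sketch).
    """
    rng = random.Random(seed)
    correlations = []
    for _ in range(n_pairs):
        i, j = rng.sample(range(len(expr)), 2)  # two distinct genes
        correlations.append(pearson(expr[i], expr[j]))
    return correlations


def mean(values):
    return sum(values) / len(values)


# Synthetic demonstration data (hypothetical, not real microarray data).
rng = random.Random(1)
n_genes, n_arrays = 200, 20

# "Clean" data: every gene varies independently across arrays.
clean = [[rng.gauss(0, 1) for _ in range(n_arrays)] for _ in range(n_genes)]

# "Biased" data: the same genes plus a per-array offset shared by all
# genes, mimicking an array effect that normalisation failed to remove.
array_effect = [rng.gauss(0, 2) for _ in range(n_arrays)]
biased = [[v + e for v, e in zip(row, array_effect)] for row in clean]

clean_mean = mean(random_pair_correlations(clean))
biased_mean = mean(random_pair_correlations(biased))

print(f"mean random-pair correlation, clean data:  {clean_mean:+.3f}")
print(f"mean random-pair correlation, biased data: {biased_mean:+.3f}")
```

In this sketch, a mean random-pair correlation close to zero is what well-normalised data would be expected to show, while a clearly positive mean suggests that a shared technical effect remains in the data.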

References: 
  1. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JYH and Zhang J. (2004) Bioconductor: open software development for computational biology and bioinformatics. Genome Biology, 5:R80.
    http://genomebiology.com/2004/5/10/R80
  2. Ploner A, Miller LD, Hall P, Bergh J and Pawitan Y. (2005) Correlation test to assess low-level processing of high-density oligonucleotide microarray data. BMC Bioinformatics, 6:80.
    Smyth, G. K. (2005). Limma: linear models for microarray data. In: Bioinformatics and Computational Biology Solutions using R and Bioconductor, R. Gentleman, V. Carey, S. Dudoit, R. Irizarry, W. Huber (eds.), Springer, New York, pages 397-420.
  3. Subramanian, A., Tamayo, P., Mootha, V., Mukherjee, S., Ebert, B., Gillette, M., Paulovich, A., Pomeroy, S., Lander, E., Mesirov, J. (2005) Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. PNAS 102(43): 15545-15550.
  4. Jean-Philippe Brunet, Pablo Tamayo, Todd R. Golub, and Jill P. Mesirov
    Metagenes and molecular pattern discovery using matrix factorization
    PNAS 2004 101: 4164-4169; published online before print as 10.1073/pnas.0308531101
  5. Background on microarray time course data analysis: 
    Yu Chuan Tai and Terence P. Speed (2005) Statistical analysis of microarray time course data. In: DNA Microarrays, U. Nuber (ed.), BIOS Scientific Publishers Limited, Taylor & Francis, 4 Park Square, Milton Park, Abingdon OX14 4RN, Chapter 20.
  6. Y. C. Tai and T. P. Speed. A multivariate empirical Bayes statistic for replicated microarray time course data. Annals of Statistics, 2005b. To appear. 

Links & Downloads: