My Microarray Software Comparison - Data Mining Software (Specific analysis software)




Please contact me if you have any suggestions for this list.

Definition of data mining software - specific analysis software
Suggested readings
Data mining software - specific analysis software (in alphabetical order)

[ 50-50 MANOVA | Affymate | AIDA Array Compare | ANOVA programs for microarray data | ArrayMiner | Cleaver | Cluster Identification Tool (CIT) | CLUSFAVOR (CLUSter and Factor Analysis Using Varimax Orthogonal Rotation) | Coupled Two-Way Clustering (CTWC) | Cyber T | FDR controlling procedures | FEXAT | Gene Cluster | General Hidden Markov Model library (GHMM) | GeneViz | GIMM | GLR | INCLUSive | LACK | Machaon Clustering and Validation Environment (CVE) | Microhelper | Multi microarray normalisation | NIA Microarray ANOVA Tool | Onto-Tools | Open source clustering software | PAM (Prediction Analysis for Microarrays) | Probe Profiler | R cluster | SAM (Significance Analysis of Microarrays) | supervised Network Self-Organized Map (sNet-SOM) | SNOMAD: Standardization and NOrmalization of MicroArray Data | SparseLOGREG | TableView | Venn Mapper | VERA & SAM ]


   

Definition of data mining software - specific analysis software

Specific analysis software is software that performs only one analysis or a few specific analyses. The distinction between comprehensive and specific analysis software is not clear-cut, but in general specific analysis software specializes in a narrowly defined analytical problem, while comprehensive software aims to provide an all-in-one package for the general user.

Suggested readings

    Experimental Design
  1. Glonek GF, Solomon PJ. Factorial and time course designs for cDNA microarray experiments. Biostatistics. 2004 Jan;5(1):89-111. [PubMed]
  2. Simon RM, Dobbin K. Experimental design of DNA microarray experiments. Biotechniques. 2003 Mar;Suppl:16-21. [PubMed]
  3. Kerr MK. Experimental design to make the most of microarray studies. Methods Mol Biol. 2003;224:137-47. [PubMed]
  4. Yang YH, Speed T. Design issues for cDNA microarray experiments. Nat Rev Genet. 2002 Aug;3(8):579-588. [PubMed][full text][pdf][web supplement]
  5. Lee ML, Whitmore GA. Power and sample size for DNA microarray studies. Stat Med. 2002 Dec 15;21(23):3543-70. [PubMed]
  6. Hwang D, Schmitt WA, Stephanopoulos G, Stephanopoulos G. Determination of minimum sample size and discriminatory expression patterns in microarray data. Bioinformatics. 2002 Sep;18(9):1184-93. [PubMed]
  7. Simon R, Radmacher MD, Dobbin K. Design of studies using DNA microarrays. Genet Epidemiol. 2002 Jun;23(1):21-36. [PubMed]
  8. Kerr MK, Churchill GA. Experimental design for gene expression microarrays. Biostatistics. 2001 Jun;2(2):183-201. [PubMed]
    Data mining
  1. Leung YF, Cavalieri D. Fundamentals of cDNA microarray data analysis. Trends Genet. 2003 Nov;19(11):649-59. [PubMed]
  2. Smyth GK, Yang YH, Speed T. Statistical issues in cDNA microarray data analysis. Methods Mol Biol. 2003;224:111-36. [PubMed]
  3. Nadon R, Shoemaker J. Statistical issues with microarrays: processing and analysis. Trends Genet. 2002 May;18(5):265-71. [PubMed]
  4. Sherlock G. Analysis of large-scale gene expression data. Brief Bioinform. 2001 Dec;2(4):350-62. [PubMed]
  5. Wu TD. Analysing gene expression data from DNA microarrays to identify candidate genes. J Pathol. 2001 Sep;195(1):53-65. [PubMed]
  6. Quackenbush J. Computational analysis of microarray data. Nat Rev Genet. 2001 Jun;2(6):418-27. [PubMed][pdf]
  7. Brazma A, Vilo J. Gene expression data analysis. FEBS Lett. 2000 Aug 25;480(1):17-24. [PubMed]



Data mining software - specific analysis software (in alphabetical order)

 
Product | Company/Institute | Interface/Operating System | Features | Price | Remarks
50-50 MANOVA | MATFORSK, Norwegian Food Research Institute | Windows | New MANOVA method that handles collinear responses. Calculates adjusted p-values in general linear models by rotation testing. A false discovery rate function will be added to the program in the future. | Free | Download (50-50 MANOVA), (Rotation Tests)
Affymate | Array Genetics | Web | Designed to rapidly analyze multiple pairs of DNA microarray data | No pricing information is available | Demo
AIDA Array Compare | Raytest GmbH | Windows | Provides comparison data for one master array against client arrays | No pricing information is available |
ANOVA programs for microarray data | Churchill Statistical Genetics Group, The Jackson Laboratory | Matlab | Performs analysis of variance on microarray data | Free | Reference 1 [pdf]; Reference 2 [pdf]
ArrayMiner 2 | Optimal Design | Windows/Mac/add-on to GeneSpring | Proprietary genetic algorithm for clustering; a number of visualization tools; a new class of clustering algorithm is available in version 2 | Check here | Demo download; manual; paper (comparison with K-means); version 2 white paper; also available as optional add-on to GeneSpring
Cleaver 1.0 | Stanford Biomedical Informatics | Web | Classification (discriminant analysis), K-means clustering, PCA | Free | Documentation; Reference [PubMed]
Cluster Identification Tool (CIT) | Van Andel Research Institute | Windows | Statistical discrimination metric and permutation analysis to identify clusters of genes or individual genes that best differentiate experimental groups | Free | Integrates with Cluster and TreeView; download; sample data; supplemental document
CLUSFAVOR 6.0 (CLUSter and Factor Analysis Using Varimax Orthogonal Rotation) | Molecular Biology Computation Resource, Baylor College of Medicine | Windows 95/98/NT/2000/XP | Performs cluster and factor analysis | Free for academic users | Download; user guide; features; FAQ; troubleshooting; Reference [PubMed]
Coupled Two-Way Clustering (CTWC) | Department of Physics of Complex Systems, Weizmann Institute of Science | Web | Coupled Two-Way Clustering | Free for academic users | Registration is required; Reference [PubMed][pdf]; algorithm; server: Reference [PubMed]
Cyber T | UC Irvine | Linux/Unix with R statistical language, or their web interface | t-test for statistically significant differences between sample sets for arrays; Bayesian probabilistic framework to estimate the variance among replicates | Free | Help; tutorial; download only R library; how to install; download entire web interface; CyberT has been incorporated into the GeneX database and analysis package; Reference [PubMed]
FDR controlling procedures | Anat Reiner, Daniel Yekutieli and Yoav Benjamini | Windows | Adjusts p-values generated in multiple hypothesis testing of gene expression data obtained by cDNA microarray experiments | Free | Download; source code; Reference [PubMed][doc]
FEXAT | Kraft P, Schadt EE, Aten J, Horvath S | Linux | A family-based test for correlation between gene expression and trait values | Free | Download; Reference [PubMed]; help file
Gene Cluster 2.0 | Whitehead Institute Center for Genome Research | Java | Filters and preprocesses data in a variety of ways; Self-Organizing Map; supervised classification by weighted voting (WV) and k-nearest neighbors (KNN) algorithms; gene selection and permutation test methods | Free for academic users | Download; manual; FAQ
General Hidden Markov Model library (GHMM) | Max Planck Institute for Molecular Genetics, Department of Computational Molecular Biology, University of Cologne | C library with an additional C++ API | Hidden Markov models to analyze gene expression time course data | LGPL | Reference [PubMed]
GeneViz | ContentSoft AG | Windows | Double Conjugated Clustering (DCC) - clusters samples and genes simultaneously; Singular Value Decomposition Sorting (SVD) | No pricing information is available | Demo available; publications; brochures
GIMM | University of Cincinnati Medical Center | Windows | A clustering procedure based on the concept of Bayesian model averaging and a precise statistical model of expression data | GNU GPL | Download; source code; Reference 1 [PubMed]; Reference 2 [PubMed]
GLR | Wang S, Ethier S, Department of Mathematics, University of Utah | Windows | GLR is a statistical analysis program to identify differentially expressed genes from microarray data. It implements a generalized likelihood ratio test based on the two-component model. | Free | Download; Reference [PubMed]
INCLUSive | Katholieke Universiteit Leuven | Web | A suite of web-based tools aimed at the automatic multistep analysis of microarray data (clustering and motif finding). Currently, adaptive quality-based clustering, retrieval of upstream sequences and the motif sampler are accessible from this website. | Free | Demo
LACK | Charles C Kim & Stanley Falkow, Stanford University | Windows and Perl source code | Calculates the statistical significance of apparent lexical bias in microarray datasets | Free | Download; Perl source code; manual; sample data; Reference [PubMed][pdf]
Machaon Clustering and Validation Environment (CVE) | Nadia Bolshakova | Windows | A data mining system that allows the application of different clustering and cluster validity algorithms to DNA microarray data | Free? | Program available upon request
Microhelper 1.02 | Chang Bioscience | Windows NT/Mac OS X | A tool to merge, filter, normalize and transform data, handle missing values, select subsets, remove controls and annotate data | US$65 | Demo available (Windows) (Mac)
Multi microarray normalisation | Keith Vass and Ernst Wit | Web | An ANOVA-based normalization of dye-swapped experiments, taking pin-tip effects into account | Free | Detailed description
NIA Microarray ANOVA Tool | National Institute on Aging (NIA), NIH | Windows, Sun Solaris, Linux | A web-based tool for Analysis of Variance (ANOVA) of gene microarray data | Free? | Help; download
Onto-Tools | Intelligent Systems and Bioinformatics Laboratory, Computer Science Department, Wayne State University | Web | Onto-Tools is a set of four integrated databases: Onto-Express (translates differentially regulated genes into functional profiles), Onto-Compare (compares any sets of commercial or custom arrays), Onto-Design (selects genes that represent given functional categories) and Onto-Translate (translates lists of accession numbers, UniGene clusters and Affymetrix probes into one another) | Free | Registration required; Reference [PubMed]
Open source clustering software | Laboratory of DNA Information Analysis, Human Genome Center, Institute of Medical Science, University of Tokyo | Windows, Mac OS X, Linux, Unix | k-means clustering, hierarchical clustering and self-organizing maps in a single multipurpose open-source library of C routines, callable from other C and C++ programs | Covered by the original Cluster/TreeView license | Reference [PubMed][pdf]
PAM (Prediction Analysis for Microarrays) | Tibshirani Lab, Department of Statistics, Stanford University | Excel add-in/R package | Performs sample classification from gene expression data; estimates prediction error via cross-validation; provides a list of significant genes whose expression characterizes each diagnostic class | Free for academic users | Excel add-in coming soon; Reference [PubMed][pdf]
Probe Profiler | Corimbia Inc. | Windows | Assesses Affymetrix data quality by identifying which chips or probe sets are bad; analyzes groups of chips | Not available | Brochure and references
R cluster | National Center for Genome Resources (NCGR) GeneX analysis server | Web | Web interface to a collection of clustering routines written in the R statistical programming language | Free | Help; tutorial; a demo of permutation-based clustering is available
SAM (Significance Analysis of Microarrays) | Tibshirani Lab, Department of Statistics, Stanford University | Excel add-in/Web | Correlates gene expression data to a wide variety of clinical parameters including treatment, diagnosis categories, survival time and time trends; provides an estimate of the false discovery rate for multiple testing | Free for academic users | Excel add-in (registration needed); [PubMed][pdf]
supervised Network Self-Organized Map (sNet-SOM) | Department of Medical Physics, School of Medicine, University of Patras, Greece | Source code available | The sNet-SOM determines adaptively the number of clusters with a dynamic extension process. This process is driven by an inhomogeneous measure that tries to balance unsupervised, supervised and model complexity criteria. | ? | Reference [PubMed]
SNOMAD: Standardization and NOrmalization of MicroArray Data | Pevsner Lab, Johns Hopkins University School of Public Health | Web | A collection of algorithms directed at the normalization and standardization of DNA microarray data; the majority of the transformations within SNOMAD are directed at the refinement of paired microarray data | Free | -
SparseLOGREG | S. K. Shevade and S. S. Keerthi | Linux/Unix | A new and efficient algorithm for the sparse logistic regression problem, which can be applied to a variety of real-world problems such as identifying marker genes and building a classifier in the context of cancer diagnosis using microarray data | Free? | Reference [PubMed]
TableView | Center for Computational Genomics and Bioinformatics, University of Minnesota | Java | TableView is a generalized scientific visualization program for exploration of various biological data, including EST, SAGE, microarray and annotation data. | Free | Reference [PubMed]
Venn Mapper | Smid M, Dorssers LC, Jenster G, Department of Pathology, Josephine Nefkens Institute, The Netherlands | Windows | Venn Mapper is a program that compares heterologous microarray data sets, based on the number of common, differentially expressed genes. | Free | Download; Manual; Reference [PubMed]
VERA & SAM | Institute for Systems Biology | Windows/Linux/Unix | VERA - Variability and ERror Assessment (estimates error model parameters from replicated, preprocessed experiments); SAM - Significance of Array Measurement (uses the error model to improve the accuracy of the expression ratio and to assign a value 'lambda' to each gene, indicating the likelihood that the gene is differentially expressed) | Free | Download; Windows documentation; UNIX documentation; source code
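
Several entries in this list adjust p-values for multiple hypothesis testing or estimate a false discovery rate. As a rough illustration of what such an adjustment involves, here is a minimal Python sketch of the Benjamini-Hochberg step-up adjustment, one common FDR controlling procedure. It is not taken from any of the listed programs and may differ from what they implement; the function name `benjamini_hochberg` and the example p-values are assumptions made for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values, for illustration only.
example_p = [0.01, 0.04, 0.03, 0.20]
example_adjusted = benjamini_hochberg(example_p)
```

Genes whose adjusted p-value falls below a chosen FDR level (for example 0.05) would then be reported as differentially expressed.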


last updated: 10 Sep 2004