Please contact me if you have any suggestions on this list
Definition of R packages for microarray analysis
Suggested readings
General R packages useful for microarray analysis (in alphabetical order)
[ amap | cclust | cluster | e1071 | mclust | multiv | mva ]
R packages for microarray analysis (in alphabetical order)
[ ANOVA model for time course experiment | affy | BioConductor | BUM | CGH-Miner | CTC | CyberT | Emerging Patterns | EMV | FDR controlling procedures | FEXAT | GeneClust | GeneSOM | GeneTS | GIN | HighProbability | impute | LogitBoost | mixture modelling | MLE adjustment for signal censoring | PAM | permax | phyloarray | POE | OOMAL | qvalue | R/maanova | SMA | SMA extension | Spot | Statomics | VSN | YASMA ]
| Package | Author | Features | Licence | Remarks |
| amap (Another Multidimensional Analysis Package) | Antoine Lucas | amap is a package for hierarchical clustering (optimised for memory), generalised PCA and graphics for PCA. | ? | download from author's site (unix/linux); manual; link at CRAN |
| cclust (Convex Clustering Methods and Clustering Indexes) | Evgenia Dimitriadou | Convex Clustering methods, including Kmeans algorithm, On-line Update algorithm (Hard Competitive Learning) and Neural Gas algorithm (Soft Competitive Learning) and calculation of several indexes for finding the number of clusters in a data set. | GNU GPL (version 2 or later) | download (unix/ linux) (windows);index;manual |
| CGH-Miner | Stanford University | Identifies DNA copy number alterations for CGH arrays using the "Cluster Along Chromosomes (CLAC)" method | ? | download (windows); manual; Reference [pdf] |
| cluster | S original by Peter Rousseeuw, Anja Struyf , Mia Hubert. R port by Kurt Hornik and Martin Maechler. | Functions for cluster analysis | GNU GPL (version 2 or later) | download (unix/ linux) (windows);index;manual |
| e1071 | Evgenia Dimitriadou, Kurt Hornik, Friedrich Leisch, David Meyer, and Andreas Weingessel | Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, ... | GPL version 2. | download (unix/ linux) (windows);index;manual |
| mclust | C. Fraley and A.E. Raftery. R port by Ron Wehrens | Model-based cluster analysis | Permission granted for unlimited redistribution for non-commercial use only | download (unix/ linux) (windows);index;manual; website; |
| multiv (Multivariate Data Analysis Routines) | S original by F. Murtagh . R port by Kurt Hornik, Friedrich Leisch and Achim Zeileis | Multivariate Data Analysis Routines including hierarchical clustering, PCA, Sammon mapping, correspondence analysis | Free re-distribution for non-commercial purposes. | download (unix/ linux) (windows);index;manual; |
| mva | - | Classical Multivariate Analysis, contains functions for hierarchical and k-means clustering, PCA, dendrogram and heatmap drawing | GNU GPL (version 2 or later) | a basic component of R |
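The kinds of analysis listed in the table above (hierarchical clustering, k-means clustering, PCA, dendrogram and heatmap drawing) can be sketched with functions that ship with R. The following is a minimal illustration only: the expression matrix is simulated, the object names (`expr`, `gene_tree`, `km`, `pca`) are made up for this example, and it assumes a current R installation in which `hclust`, `kmeans`, `prcomp` and `heatmap` are available without loading an extra package.

```r
# Simulated expression matrix: 60 genes (rows) x 8 arrays (columns).
# The data are synthetic and purely illustrative.
set.seed(1)
expr <- matrix(rnorm(60 * 8), nrow = 60,
               dimnames = list(paste0("gene", 1:60), paste0("array", 1:8)))
# Shift the first 20 genes upward in the last four arrays to create structure.
expr[1:20, 5:8] <- expr[1:20, 5:8] + 3

# Hierarchical clustering of genes (average linkage on Euclidean distances).
gene_tree <- hclust(dist(expr), method = "average")
gene_groups <- cutree(gene_tree, k = 2)   # cut the tree into two groups

# k-means clustering of the same genes into two groups.
km <- kmeans(expr, centers = 2, nstart = 10)

# PCA with arrays as observations and genes as variables.
pca <- prcomp(t(expr), center = TRUE)

# Dendrogram and heatmap drawing.
plot(as.dendrogram(gene_tree))
heatmap(expr)
```

Comparing `table(gene_groups, km$cluster)` is one quick way to see how far the two clustering methods agree on the same data.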
| Package | Author | Features | Licence | Remarks |
| ANOVA model for time course experiment | Park T et al. | A statistical test procedure based on the ANOVA model to identify genes that have different gene expression profiles among experimental groups in time-course experiments. | - | Reference [PubMed]; available upon request |
| affy (Methods for Affymetrix Oligonucleotide Arrays) | Rafael A. Irizarry, Laurent Gautier, Biostatistics Department; Johns Hopkins University. | The package contains some methods for analyses of Affymetrix oligonucleotide array data. | GNU GPL (version 2 or later) | description [pdf]; affy is now a part of the BioConductor project |
| BioConductor | many | An open source software project with several goals. Main goals: providing infrastructure in terms of design and software for analysing genomic data, some form of graphical user interface for selected libraries and a mechanism for linking together different groups with common goals | GNU GPL (version 2 or later) | current released packages; current developmental packages; contributed packages; faq; Vignettes; Short Courses (very useful!); Research Talks; An excellent introductory tutorial by Chis Bye; GUI for package Limma; |
| BUM | Pounds S, Morris SW. Department of Biostatistics, St. Jude Children's Research Hospital | Estimating the occurrence of false positives and false negatives in microarray studies by approximating and partitioning the empirical distribution of p-values. | ? | download (S-plus); user guide; reference [PubMed] |
| CTC (Cluster and Tree Conversion) | Antoine Lucas | exports R tables to Xcluster and Cluster; imports Xcluster and Cluster output to R | ? | download (unix/linux), (windows); manual; |
| CyberT | Tony Long and Harry Mangalam (UC Irvine) | t-test for statistically significant differences between sample sets for arrays; Bayesian probabilistic framework to estimate the variance among replicates | GNU GPL (version 2 or later) | download;help |
| Emerging Patterns | Boulesteix AL, Tutz G, Strimmer K. | A CART-based approach to discover EPs in microarray data. The method is based on growing decision trees from which the EPs are extracted. This approach combines pattern search with a statistical procedure based on Fisher's exact test to assess the significance of each EP. Subsequently, sample classification based on the inferred EPs is performed using maximum-likelihood linear discriminant analysis. | GNU GPL | R codes; Readme; examples; reference [PubMed] |
| EMV | Raphael Gottardo | Estimation of missing values in a matrix by a k-nearest neighbours algorithm | GPL version 2 or later | download (unix/linux), (windows); manual; reference [PubMed][pdf] |
| FDR controlling procedures | Anat Reiner, Daniel Yekutieli and Yoav Benjamini | adjusts p-values generated in multiple hypothesis testing of gene expression data obtained by cDNA microarray experiments. | - | download (R); (S-plus); reference [PubMed][doc] |
| FEXAT | Kraft P, Schadt EE, Aten J, Horvath S | A family-based test for correlation between gene expression and trait values | free | download; reference [PubMed] |
| GeneClust | Kim-Anh Do | GeneClust is a piece of computer software which can be used as a tool for exploratory analysis of gene expression microarray data; hierarchical clustering and gene shaving; simulation to assess the clustering performance | ? | Requires Unix/Linux or Windows 2000 running S-plus! |
| GeneSOM | Jun Yan | Clustering Genes using Self-Organizing Map | GNU GPL (version 2 or later) | download (unix/ linux) (windows);index;manual; |
| GeneTS | Wichert S, Fokianos K, Strimmer K | some functions useful for microarray time series analysis, in particular cell cycle analysis and inferring graphical models from microarray data. | GNU GPL | download (unix/linux) (windows); reference [PubMed] |
| GIN (Gene Index) | LeBlanc M et al. | a gene index technique that generalizes methods that rank genes by their univariate associations to patient outcome. Genes are ordered based on simultaneously linking their expression both to patient outcome and to a specific gene of interest. | - | download; Reference [PubMed]; |
| HighProbability | David R. Bickel | HighProbability estimates which genes have frequentist or Bayesian probabilities of differential expression at least as great as a specified threshold, given a list of p-values. | Mozilla Public License 1.1 (http://www.mozilla.org/MPL/) | source; windows binary; manual; |
| Impute | Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, Gilbert Chu | Imputation for microarray data (currently KNN only) | GPL2.0 | download (unix/linux) (windows); index; manual; |
| LogitBoost | Dettling, Marcel and Bühlmann, Peter | a feature preselection method, a more robust boosting procedure and a new approach for multi-categorical problems for supervised classification | Free | download (unix/linux) (windows); manual [ps][pdf]; Reference [PubMed][pdf][ps] |
| mixture modelling | Debashis Ghosh | Mixture modelling of gene expression data from microarray experiments | ? | download; paper (pdf), (ps); requires mva and mclust. |
| MLE adjustment for signal censoring | Ernst Wit | The function calculates the maximum likelihood estimate of the parameters for a Gamma(alpha, beta) pixel intensity model, when only the mean, median, variance and number of pixels are given. | Free? | Reference [PubMed] |
| PAM (Prediction Analysis for Microarrays) | Tibshirani Lab, Department of Statistics, Stanford University | Performs sample classification from gene expression data; estimates prediction error via cross-validation; provides a list of significant genes whose expression characterizes each diagnostic class | GPL2.0 | download (unix/linux) (windows);manual; paper (pdf); documentation on nearest shrunken centroid classification; sample plots; reference[pdf] |
| permax | Robert J. Gray | The permax library consists of 7 functions, intended to facilitate certain basic analyses of DNA array data, especially with regard to comparing expression levels between two types of tissue. | GNU GPL 2 | download (unix/linux) (windows);index;manual; |
| phyloarray | Kurt Sys | Software to process data from phylogenetic or identification microarrays. In its present state it is rather limited; the focus was on a fast and easy way of calculating background values by interpolation and plotting melting curves. The functions for reading the data are similar to those used in package 'sma' (statistical microarray analysis). | GNU GPL 2 | download (unix/linux) (windows); index; manual; |
| POE (Probability of Expression) | Elizabeth Garrett, Jiang Hu, Giovanni Parmigiani, Rob Scharpf | statistical approaches to molecular classification that emphasize simple molecular profiles based on latent categories signifying under-, over-, and baseline-expression. | GNU GPL 2 | download (linux); Reference [PubMed][pdf] |
| OOMAL (Object-Oriented Microarray Analysis Library) (* requires S-PLUS!) | MD Anderson Cancer Center, The University of Texas | Object-oriented library for analyzing microarray data in S-PLUS; flexible tools for loading raw quantification data from a variety of microarray formats, normalization, identification of differentially expressed genes, classification and discrimination between samples. | ? | download source code; documentation; |
| qvalue | John D. Storey | for calculating q-values in multiple testing situations | ? | download source code (please send the author an email with "qvalue download" in the subject line); manual; |
| R/maanova | Gary Churchill's Statistical Genetics Group, The Jackson Laboratory | R/maanova is an extensible, interactive environment for the analysis of variance on microarray data. | free for academic | registration before download; reference 1[pdf]; reference 2[pdf] |
| SMA (Statistics for Microarray Analysis) | Sandrine Dudoit, Yee Hwa (Jean) Yang, Benjamin Milo Bolstad (UC Berkeley) | The package contains some simple functions for exploratory microarray analysis, M-A plots, lowess curve fitting, handles replicate array data by Bayesian methods | GNU GPL (version 2 or later) | download (unix/ linux) (windows);help;index;manual; paper 1,2,3. |
| SMA extension (com.braju.sma) | Henrik Bengtsson | extensions of SMA | ? | download (unix/ linux) (windows); documentation;presentation; requires SMA library and R.classes installed |
| Spot | CSIRO Mathematical and Information Sciences | Spot is a software package for the analysis of microarray images; Automatic grid location; Flexible spot segmentation; Morphological background estimation. | Commercial package; price depends on number of users | User guide; installation instruction; Demo version available upon registration |
| Statomics | David Bickel | Statomics is a software suite for the statistical analysis of genomic and proteomic data. | ? | source code; Reference [PubMed][pdf] |
| VSN | Wolfgang Huber; Molecular Genome Analysis, National Cancer Research Institute of Germany | Variance stabilization applied to microarray data calibration and to the quantification of differential expression | Free for academic use | Reference [PubMed][pdf] |
| YASMA (Yet Another Statistical Microarray Analysis) | Lorenz Wernisch and others | correlation between array replicates, ANOVA analysis, p-values for ANOVA analysis, standard t-tests | ? | download (unix/linux);tutorial;related statistical notes; reference [PubMed][pdf] |
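Several entries in the table above deal with per-gene tests and with adjusting the resulting p-values for multiple testing (for example FDR controlling procedures, qvalue and YASMA). As a rough sketch of that workflow, and not the code of any package listed here, the example below runs a two-sample t-test per gene on simulated data and applies the Benjamini-Hochberg adjustment through base R's `p.adjust`. The data, the group labels and the 0.05 cut-off are assumptions made for illustration.

```r
# Simulated data: 200 genes (rows), 5 control and 5 treated arrays (columns).
set.seed(2)
n_genes <- 200
group <- factor(rep(c("control", "treated"), each = 5))
expr <- matrix(rnorm(n_genes * 10), nrow = n_genes)
# Make the first 20 genes differentially expressed in the treated arrays.
expr[1:20, group == "treated"] <- expr[1:20, group == "treated"] + 2

# Per-gene two-sample t-test (Welch test, the default of t.test).
p_raw <- apply(expr, 1, function(x) t.test(x ~ group)$p.value)

# Benjamini-Hochberg adjustment to control the false discovery rate.
p_bh <- p.adjust(p_raw, method = "BH")

# Genes called significant at an illustrative FDR threshold of 0.05.
selected <- which(p_bh < 0.05)
```

Other adjustment methods supported by `p.adjust`, such as `"bonferroni"` or `"BY"`, can be swapped in through the `method` argument.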