Please contact me if you have any suggestions for this list.
Definition of data mining software - comprehensive software
Suggested readings
Data mining software - comprehensive software (in alphabetical order)
[ Acuity | AMIADA (Analyzing MIcroArray DAta) | ArrayStat | Avadis | BioMine | BRB ArrayTools | Cluster | DNA-arrays analysis tools | DNA-Chip Analyzer (dChip) | Engene | Expression Profiler | ExpressionSieve | GEDA | GeneLinker Gold | GeneLinker Platinum | GeneMaths | GenePattern | GeneSight | GeneSpring | Genesis | GeneWeaver | GeneXPress | GENOWIZARD | GEPAS | J-Express | MAExplorer | Partek software suites | Pathways 4 | ROSETTA | SilicoCyte | TIGR Multiple Experiment Viewer (MEV) | Vector Xpression NTI | XCluster | X-Miner ]
| Product | Company/ Institute | Interface/ Operating System | Features | Price | Remarks |
| Acuity 2.0 | Axon Instruments | Windows 2000/XP client; Windows 2000 server (recommended) | various visualization tools; normalization; hierarchical, k-means, and k-medians clustering with many different similarity metrics; SOM; PCA; gene shaving. Scripting engine for customizable analysis through VBScript, JavaScript, or ActiveX objects; stores data in a relational database | US$4,000 (U.S.A. and Canada only) | Full integration with GenePix Pro 4 |
| AMIADA (Analyzing MIcroArray DAta) | Department of Biology; University of Ottawa | Windows 98/NT/2000 | organizing, exploring, visualizing, and analyzing microarray data. It features an Excel-like user interface and performs data transformation, PCA, a variety of cluster analyses, etc., for visualizing expression profiles. | free | download; tutorial; reference [PubMed] |
| ArrayStat 1.0 | Imaging Research Inc. | Windows NT/2000 | normalization, statistical confidence analysis, p values, standard errors, Q-Q plot | no pricing information is available | ppt presentation; overview of statistical inference |
| Avadis | | Windows, Linux, Mac OS X, Solaris; Desktop and Client-Server | data import from files and RDBMS, Affymetrix support, advanced visualization capabilities, dynamically linked views, probe-level analysis, normalization, differential expression analysis for a variety of experiment designs, clustering (K-means, Hierarchical, PCA, Eigen-value, SOM, Randomwalk), classification and prediction (SVM, Decision Trees, Neural Networks, MVA), automated batched gene annotation, data filtering and scripting capabilities. | Available on request | - |
| BioConductor | many | Windows, MacOS, Linux/Unix; requires R environment | an open source software project with several goals. Main goals: providing infrastructure, in terms of design and software, for analysing genomic data, especially microarray data; some form of graphical user interface for selected libraries; and a mechanism for linking together different groups with common goals | GNU GPL (version 2 or later) | current released packages; current developmental packages; contributed packages; faq; Vignettes; Short Courses (very useful!); Research Talks |
| BioMine | Gene Network Science | Windows 95/98/NT/2000 | data import, normalization and replicate-handling, with Affymetrix GeneChip data handling capabilities; various clustering and statistical inference tools; an Experiment Design Tool indicates how many replicates are needed to statistically validate features of interest in microarray data. | $1000 (academic) per seat per year; $5000 (commercial) per seat per year | description; datasheet; overview white paper |
| BRB ArrayTools 3.1 | Molecular Statistics and Bioinformatics Section, Biometric Research Branch, NCI | Windows | integrated package for the visualization and statistical analysis of DNA microarray gene expression data, normalization, scatterplot, clustering, multidimensional scaling, class prediction | no pricing information is available | download;technical reports and talks |
| Cluster | Michael Eisen's lab; Lawrence Berkeley National Lab (LBNL) | Windows 95/98/NT | hierarchical clustering, K-means clustering, Self-Organizing Map (SOM), PCA | Free for academic users | download; manual; source code; demo data; reference [PubMed][pdf][web supplement]; also available from Stanford University and microarray.org; the output is visualized by TreeView |
| DNA-arrays analysis tools | National Spanish Cancer Center (CNIO) | web | a suite of web-based programs for DNA array data analysis, including two-sample correlation plot, hierarchical clustering, SOM, self-organising hierarchical neural network (SOTA), and various tree viewers. | Free | - |
| DNA-Chip Analyzer (dChip) | Wong Lab, Department of Statistics, Harvard University | Windows NT/2000 | normalization, model-based expression, filtering and comparison, clustering | Free for academic users | reference [PubMed][pdf] |
| Engene | Computer Architecture Department, Universidad de Malaga | Web | a web-based, platform-independent exploratory data analysis tool for gene expression data that aims at storing, visualizing, and processing large sets of expression patterns | Free access upon request | user manual; training data; demo tour; reference [PubMed] |
| Expression Profiler | European Bioinformatics Institute EBI | Unix-like httpd server, any web client | clustering, analysis and visualization of gene expression and sequence data. | Free | paper (link to journal) |
| ExpressionSieve | BioSieve | Java, tested on Windows 2000, 98, NT, 95, ME, Linux | linking biological significance to expression patterns, data and analysis process management, signature gene discovery, and class discovery & prediction | academic discount available | Modular pricing available |
| Gene Expression Data Analysis Tool (GEDA) | UPCI, Center for Pathology Informatics | Web | A free, open-source, collaborative online gene expression data analysis web application with a rich variety of normalization steps, tests for differentially expressed genes, and sample classification algorithms, with online recommendations for analysis based on extensive simulation results. | Free | Recommendations for Analysis |
| GeneLinker Gold 3.0 | Molecular Mining Corporation | Windows | normalization, missing value estimation, filtering, hierarchical clustering, partitional clustering (k-means and mutual nearest neighbors (Jarvis-Patrick)), SOMs, PCA, various distance metrics and visualization functions, Gene Lists for support of pathways, functional classifications & ontologies, links to external databases, hierarchical history of analysis procedures | Commercial = $3,995; Academic = $2,495 | FAQ; Demo upon registration; Products and Services; Literature |
| GeneLinker Platinum 2.0 | Molecular Mining Corporation | Windows | Platinum offers all the features listed for GeneLinker Gold plus SLAM algorithm and supervised learning tools, IBIS (Integrated Bayesian Inference System) algorithm, Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), and Uniform/Gaussian Discriminant Analysis (UGDA) classifiers for inferencing of rules that produce multigenic markers for disease and tox response and predict complex phenotypes and gene-gene interactions. Each prediction has an associated accuracy percentage and an MSE (mean squared error) value. | Contact MMC Sales for pricing | FAQ; Demo upon registration; Products and Services; Literature |
| GeneMaths | Applied Maths | Windows | hierarchical clustering, bootstrap analysis, dendrogram manipulation tools, Self-Organizing Map (SOM), PCA, etc.; simple to use, very friendly GUI, fast clustering algorithms, powerful script language, imports data from any source or database using ODBC. | US$6200 | Demo available; brochures; interacts seamlessly with Media Cybernetics ArrayPro. |
| GenePattern | Broad Institute | Windows, MacOS X, Linux/UNIX | a flexible analysis environment that, in addition to being a standard microarray analysis application, allows users to: chain analysis tasks into pipelines for reuse and reproducible research; run analyses from a programming language (currently R); run completely standalone on a laptop or in client-server mode. | Free; registration required | Download; faq; tutorial; algorithms; datasets; mailing list |
| GeneSight | BioDiscovery | Windows 9x/NT4/2000/XP, Linux, Mac | GenePie visualization, 2-D and 3-D scatter plots, interactive ratio histogram plotting, hierarchical and neural network clustering, PCA, and Time Series Analysis. Significance and confidence analysis. Chromosome Viewer. Annotation Collector. | no pricing information is available | demo upon registration; reference [PubMed] |
| GeneSpring 6.1 | Silicon Genetics | Windows; Mac; Linux | Analyzes various array types; scatter plot, cluster analysis, PCA, SOM, statistical tools, 2D and 3D plotting | no pricing information is available | Product tour; Online demo; trial upon registration; pdf data sheet; citations; reference (link to journal) |
| Genesis | Bioinformatics Group, Institute of Biomedical Engineering, Graz University of Technology | Java, tested on Windows 2000, Linux, Tru64 UNIX, Solaris and Irix | A Java suite containing various tools such as filters, normalization, visualization tools, common clustering algorithms, SOM, k-means, PCA, SVM, and mapping of genes onto chromosomal sequences | Free for academic | download; documentation; licensing; reference [PubMed] |
| GeneWeaver | Visual Bioinformatics | Java run under Windows 2000; Oracle 8i | a software and database solution for analysis of gene expression data in an integrated project management environment; can manage data derived from the major types of experimental methods, divided into sequencing-based (EST/SAGE) and hybridization-based (micro/macro arrays) methods. | no pricing information is available | presentation;documentation; |
| GeneXPress | Stanford University | Windows, Linux/Unix | GeneXPress is a visualization and analysis tool for gene expression data, integrating clustering, gene annotation, and sequence information. GeneXPress allows you to load clustering results and automatically analyze them for significance of functional groups through correlation with functional annotations (e.g. Gene Ontology) and for enrichment of motif binding sites (e.g. TRANSFAC motifs). | Free for academic | download; tutorials; faq |
| GENOWIZARD | Genotypic Technology | Windows/Linux | Visualization tools, normalization, reporting functions, Excel-like interface. | no pricing information is available | brochure |
| GEPAS (Gene Expression Pattern Analysis Suite) | Bioinformatics Unit; CNIO | web | A comprehensive online suite for data preprocessing; viewing; hierarchical clustering, SOTA, SOM, SOM-tree clustering; Pomelo tool for multiple testing for differential gene identification; SVM; Data Mining with Gene Ontology | free | Documentation |
| J-Express 2.1 | MolMine | Java | Hierarchical clustering, K-means partitional clustering, Principal component analysis, Self-organizing maps, Profile similarity search, Normalization and filtering, Raw data import, Project organization | Free for academics | Download (registration required); plugins; screenshot; Old version 1.0 (free for all users); Old version 1.0 online tutorial |
| MAExplorer | Open Source at SourceForge | Java; either standalone or run as applet | normalization methods, data filtering, scatter plots, histograms, expression profile plots, similar gene clustering, hierarchical clustering, K-means and K-median clustering, gene and sample sets, dynamic reports. Direct manipulation. Plugin facility for adding user analytic methods being alpha-tested. Data conversion wizard. Direct access to other genomic databases. | Mozilla Public License 1.1 (MPL 1.1) | summary; documentation; manual; pdfs; tutorial; download; very comprehensive online resources |
| Partek software suites | Partek | UNIX, Linux, Windows 2000, NT, 98, 95 | comprehensive processing, analysis, and visualization of data; Exploratory Data Analysis; Statistical Inference; Predictive Modeling; connectivity to third-party database, web, and software applications. | no pricing information is available | Screenshot;Demo Available |
| Pathways 4 | Invitrogen | Windows, Linux/UNIX, MacOS X | Visualization, filtering, statistical inference of differential expression, various clustering algorithms, extensible by plugins | no pricing information is available | - |
| ROSETTA | Knowledge Systems Group, Dept. of Computer and Information Science, Norwegian University of Science and Technology | Windows 2000/XP, Linux | ROSETTA is a toolkit for analyzing tabular data within the framework of rough set theory. ROSETTA is designed to support the overall data mining and knowledge discovery process: from initial browsing and preprocessing of the data, via computation of minimal attribute sets and generation of if-then rules or descriptive patterns, to validation and analysis of the induced rules or patterns. | Free? | Source code is also available. Features; Download; Resources |
| SilicoCyte | CytoGenomics Inc | Windows 2000/XP | Integrated microarray analysis software that includes modules for Image Analysis, Data Annotation, Data Analysis, and Statistical Analysis | US$999 for academic users | - |
| TIGR Multiple Experiment Viewer (MEV) | The Institute for Genomic Research (TIGR) | Windows, Linux, UNIX, MacOS X | Java application designed to allow the analysis of microarray data to identify patterns of gene expression and differentially expressed genes. Numerous normalization, clustering and distance algorithms have been implemented, along with a variety of graphical displays to best present the results. MEV was written to be flexible and expandable, and supports a variety of input and output formats. | Free (Open Source) | User guide; Download Program; Download Program and Source; Reference [PubMed] |
| Vector Xpression 3.0 | Informax Inc. | Windows | import, normalize, and merge primary expression run results obtained under a variety of experimental conditions. Filtering, sorting, clustering algorithms, profiling and plotting methods; automation through recordable macros; internal editor and functionality for R and Matlab; Multiple Correction Techniques; Two Group Comparison; 3 or More Group Comparison | follow this link for pricing information | Free trial after registration; demo video; brochure |
| XCluster | Stanford University | Unix/Linux/Mac/Windows | A program similar to Cluster; performs hierarchical clustering and self-organizing maps | Free for academic and nonprofit users | - |
| X-Miner | X-MINE | client-based server (Sun Solaris, Linux) or Web | an integrated suite of supervised (the tree harvesting series (Classifier, Isolator, Quantifier, Survivor) and the Sam series (X-Sam Q., X-SamIso., X-SamSurv., X-SamPair)) and unsupervised (hierarchical clustering, SOM, K-Means, K-Medoid, PCA, and X-Shaving) analytics for microarray data. The platform has four filtering options available for cDNA and Affymetrix data, three normalization options, and three data imputation options. | no pricing information is available | Demo available |