APCluster - An R Package for Affinity Propagation Clustering

To make Affinity Propagation Clustering [Frey & Dueck, 2007] accessible to a wider audience in bioinformatics, we ported the Matlab code published by Frey and Dueck (cf. Affinity Propagation Website) to R.

Installation

The package is available through CRAN, the Comprehensive R Archive Network. Therefore, the simplest way to install the package is to enter
install.packages("apcluster")
into your R session. If, for whatever reason, you prefer to install the package manually, follow the instructions in the user manual.

The current version of the package is 1.4.1 (released 2014-12-09).

Documentation

  1. User Manual: PDF (1.2MB; last updated on 2014-12-09)
  2. Reference Manual: PDF (253KB; last updated on 2014-12-09; generated automatically from package help pages)

Webinar "Introduction to apcluster"

On June 13, 2013, the maintainer of the package, Ulrich Bodenhofer, gave a webinar on the apcluster package. The webinar was hosted by the Orange County User Group and moderated by its president, Ray DiGiacomo, Jr. The demo uses Version 1.3.2 of the package (released June 11, 2013).
  1. Complete recording of webinar: YouTube Video (length: 59:17)
  2. Slides of webinar: PDF (6.6MB)
  3. R code used in webinar: R code (4KB)

Getting started

  1. To load the package, enter library(apcluster) in your R session.
  2. To view the user manual, enter vignette("apcluster").
  3. To run a first example, enter example(apcluster).
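
Beyond example(apcluster), a first session might look like the following minimal sketch. The toy data set (two Gaussian point clouds), the seed, and all variable names are assumptions made purely for illustration; negDistMat() and apcluster() are the package's functions for computing negative-distance similarities and for running affinity propagation.

```r
library(apcluster)

# Toy data (an assumption for illustration): two separated 2D point clouds
set.seed(42)
cl1 <- cbind(rnorm(30, mean = 0.2, sd = 0.05), rnorm(30, mean = 0.8, sd = 0.06))
cl2 <- cbind(rnorm(30, mean = 0.7, sd = 0.08), rnorm(30, mean = 0.3, sd = 0.07))
x <- rbind(cl1, cl2)

# Similarity matrix: negative squared Euclidean distances between all samples
s <- negDistMat(x, r = 2)

# Run affinity propagation with the default input preference
apres <- apcluster(s)

# Inspect the result
show(apres)        # summary of the clustering
length(apres)      # number of clusters found
apres@exemplars    # row indices of the exemplars in x
```

For two-dimensional data such as this, plot(apres, x) should draw the clusters on top of the original points, and heatmap(apres, s) should show the similarity matrix ordered by cluster.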

Citing this package

If you use this package for research that is published later, you are kindly asked to cite it as follows:

U. Bodenhofer, A. Kothmeier, and S. Hochreiter (2011). APCluster: an R package for affinity propagation clustering. Bioinformatics 27:2463-2464. DOI: 10.1093/bioinformatics/btr406.

Moreover, we insist that, whenever you use or cite the package, you also cite the original paper in which affinity propagation was introduced:

B. J. Frey and D. Dueck (2007). Clustering by passing messages between data points. Science, 315:972-976. DOI: 10.1126/science.1136800.

Change log

Version 1.4.1, released 2014-12-09:
  • fixes in C++ code of sparse affinity propagation
Version 1.4.0, released 2014-12-01:
  • added apcluster() method for sparse similarity matrices; as a consequence, the package now imports the Matrix package and is now also able to handle non-sparse matrix classes defined by the Matrix package. Moreover, similarity functions supplied to the apcluster() method may now also return any matrix type defined by the Matrix package.
  • fix of apcluster() for dense matrices to better support -Inf similarities
  • added apclusterK() method for sparse similarity matrices
  • preferenceRange() is now an S4 generic; re-implementation in C++ to speed up function; changed handling of -Inf similarities for consistency with sparse version
  • added preferenceRange() methods for sparse matrices and dense matrix objects from the Matrix package
  • new conversion methods implemented for converting dense similarity matrices to sparse ones and vice versa; consequently, sparseToFull() is marked as deprecated.
  • adaptation of heatmap() function for improved handling of -Inf similarities
  • adaptations of signatures of '[' and '[[' accessor methods
  • renamed help page of methods for computing similarity matrices to 'similarities' in order to avoid confusion with the accessor method similarity()
  • corresponding updates of help pages and vignette
Version 1.3.5, released 2014-06-26:
  • memory access fixes in C++ code called from apclusterL()
  • minor updates of vignette
Version 1.3.4, released 2014-06-25:
  • added sort() function to rearrange clusters according to sort criterion; note that this is an S3 method (see help page for explanation)
  • improvements and bug fixes of apclusterL() method for signature 'matrix,missing'
  • performance optimizations of apcluster() and apclusterL()
  • plotting of clustering results superimposed in scatter plot matrices now also works for AggExResult objects
  • improvements of consistency of error and warning messages
  • corresponding adaptations of documentation and vignette
  • adapted dependency and linking to Rcpp version 0.11.1 (to avoid issues on Mac OS)
  • minor correction of package namespace
Version 1.3.3, released 2014-02-21:
  • adapted dependencies and linking to Rcpp version 0.11.0
  • cleared up package dependencies
Version 1.3.2, released 2013-06-11:
  • plotting of clustering results extended to data sets with more than two dimensions (the clustering result is then superimposed on a scatterplot matrix); the option of using plot() to draw a heatmap has been removed, so heatmap() must always be used from now on
  • improved NA handling
  • correction of input check in apcluster() and apclusterL() (previously, both functions issued a warning whenever argument p had length > 1)
  • corresponding updates and further improvements of help pages and vignette
Version 1.3.1, released 2013-04-22:
  • re-implementation of heatmap() method: dendrograms can now be plotted even for APResult and ExClust objects as well as for cluster hierarchies based on prior clusterings; color bars can now be switched off and colors can be changed by the user (via the new sideColor argument); dendrograms can be switched on and off (via the Rowv and Colv arguments)
  • added as.hclust() and as.dendrogram() methods
  • added new arguments base, showSamples, and horiz to the plot() method with signature (x="AggExResult", y="missing"); moreover, parameters for changing the appearance of the height axis are now respected as well
  • streamlining of methods (redundant definition of inherited methods removed)
  • various minor improvements of code and documentation
Version 1.3.0, released 2013-01-07:
  • added Leveraged Affinity Propagation Clustering
  • re-implementation of main functions as S4 generic methods in order to facilitate the convenient internal computation of similarity matrices
  • for convenience, similarity matrices can be stored as part of clustering results
  • heatmap plotting now done by heatmap() which has been defined as S4 generic
  • extended interface to functions for computing similarity matrices
  • added function corSimMat()
  • implementation of length() method for classes APResult, AggExResult, and ExClust
  • added accessor function to extract clustering levels from AggExResult objects
  • correction of exemplars returned by apcluster() for details=TRUE in slot idxAll of returned APResult object
  • when using data stored in a data frame, categorical columns are now explicitly omitted, thereby avoiding warnings
  • plotting of clustering results along with original data (2D only) has been accelerated
  • all clustering methods now store their calls into the result objects
  • updates and extensions of help pages and vignette
Version 1.2.1, released 2012-06-12:
  • added convenient accessor functions for extracting cluster indices from APResult and ExClust objects
  • added a function for coercing an APResult object into an ExClust object
  • correction of color bar on the left side of heatmaps (default behavior of RowSideColors parameter changed with R 2.15)
Version 1.2.0, released 2012-03-26:
  • reimplementation of apcluster() in C++ using the Rcpp package which reduces computation times by a factor of 9-10
  • obsolete function apclusterLM() removed
  • updates of help pages and vignette
Version 1.1.1, released 2011-09-08:
  • updated citation
  • minor corrections in help pages and vignette
Version 1.1.0, released 2011-06-15:
  • exemplar-based agglomerative clustering (function aggExCluster()) added
  • added various plotting functions, e.g. for dendrograms and heatmaps
  • added sequence analysis example to vignette
  • extension of vignette according to new functionality
  • re-organization of variable names in vignette
  • added option 'verbose' to apclusterK()
  • numerous minor corrections in help pages and vignette
Version 1.0.3, released 2011-03-01:
  • Makefile in inst/doc eliminated to avoid installation problems
  • renamed vignette to "apcluster"
Version 1.0.2, released 2010-03-19:
  • replacement of computation of responsibilities and availabilities in apcluster() by pure matrix operations; the traditional implementation according to Frey and Dueck is still available as function apclusterLM()
  • improved support for named objects
  • new function for computing label vectors
  • re-organization of package source files and help pages
Version 1.0.1, released 2010-03-02:
  • first public release

Contact

For suggestions, bug reports, and other matters regarding the package, please contact apcluster@bioinf.jku.at.