Canopy
|
Simple, fast, accurate function for grouping objects into preliminary clusters. Often used as an initial step in more rigorous clustering techniques, such as k-means. |
Gaussian Mixture Model Functions
|
Fit a Gaussian mixture model (GMM) to input data, using either a basic GMM algorithm with a fixed number of clusters or a Dirichlet Process GMM (DP-GMM) algorithm with a variable number of clusters. The GMM functions are GMMFit, GMMPredict, and GMMProfile. |
KMeans
|
Takes a data set and outputs the centroids of its clusters and, optionally, the clusters themselves. |
KMeansPlot
|
Takes a model (the table of cluster centroids output by the KMeans function) and an input table of test data, and uses the model to assign each test data point to a cluster centroid. |
KModes
|
Extends KMeans to support categorical data. The core algorithm is an expectation-maximization algorithm that finds a locally optimal solution. |
KModesPredict
|
Prediction function that corresponds to KModes. |
Minhash
|
Probabilistic clustering method that assigns a pair of users to the same cluster with probability proportional to the overlap between the sets of items that these users have bought. |
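
The Minhash description above can be illustrated with a minimal Python sketch. It is not the implementation of the Minhash function listed here, only the general min-hash idea: for a randomly chosen hash function, two sets share the same minimum hash value with probability equal to their Jaccard overlap (intersection size divided by union size), so the fraction of matching signature positions estimates that overlap. The helper names (`jaccard`, `item_hash`, `minhash_signature`, `estimated_overlap`), the sample users, and the use of salted BLAKE2b as the hash family are all assumptions made for illustration.

```python
import hashlib


def jaccard(a, b):
    """Exact overlap between two item sets: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def item_hash(item, seed):
    """Hash a string item with a per-seed salt (illustrative hash family)."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(
        item.encode("utf-8"), digest_size=8, salt=salt
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(items, num_hashes=512):
    """For each seeded hash function, keep the smallest hash in the set.

    Assumes `items` is non-empty; min() raises ValueError otherwise.
    """
    return [
        min(item_hash(item, seed) for item in items)
        for seed in range(num_hashes)
    ]


def estimated_overlap(sig_a, sig_b):
    """Fraction of signature positions where the two signatures agree."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


# Hypothetical purchase histories for three users.
alice = {"tent", "stove", "lantern", "boots"}
bob = {"lantern", "boots", "kayak", "paddle"}
carol = {"novel", "bookmark"}

sig_alice = minhash_signature(alice)
sig_bob = minhash_signature(bob)
sig_carol = minhash_signature(carol)

print("exact overlap (alice, bob):    ", jaccard(alice, bob))
print("estimated overlap (alice, bob):", estimated_overlap(sig_alice, sig_bob))
print("estimated overlap (alice, carol):", estimated_overlap(sig_alice, sig_carol))
```

In a clustering setting, users whose signatures agree on a chosen group of positions would be placed in the same cluster, so users with heavily overlapping purchases are likely to be grouped together while users with nothing in common almost never are. More hash functions give a tighter estimate at the cost of more computation.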