These examples use the "iris" data set (gmm_iris_input). The data has values for four attributes—sepal_length, sepal_width, petal_length, and petal_width—which are the data dimensions. The input does not include the species column, because the goal is data clustering, not classification. Each example outputs three clusters.
The raw data is split into a training set and a test set.
The GMMFit function uses the training set to generate the model. The GMMPredict function then uses the model information to predict clusters for the test data.
id | sepal_length | sepal_width | petal_length | petal_width |
---|---|---|---|---|
1 | 5.1 | 3.5 | 1.4 | 0.2 |
2 | 4.9 | 3 | 1.4 | 0.2 |
3 | 4.7 | 3.2 | 1.3 | 0.2 |
4 | 4.6 | 3.1 | 1.5 | 0.2 |
5 | 5 | 3.6 | 1.4 | 0.2 |
6 | 5.4 | 3.9 | 1.7 | 0.4 |
7 | 4.6 | 3.4 | 1.4 | 0.3 |
8 | 5 | 3.4 | 1.5 | 0.2 |
9 | 4.4 | 2.9 | 1.4 | 0.2 |
10 | 4.9 | 3.1 | 1.5 | 0.1 |
... | ... | ... | ... | ... |
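
The workflow described above (split the raw data, fit a model on the training set, assign clusters to the test rows) can be sketched in plain Python. This is only an illustration under stated assumptions, not how GMMFit or GMMPredict work internally. The split rule (`split_rows`, holding out every fifth id) is hypothetical, since the source does not say how the two sets are created. The three-component `HYPOTHETICAL_MODEL` uses invented placeholder parameters with diagonal covariance and is not output from GMMFit. Only the ten rows shown in the table are used.

```python
import math

# First ten rows of gmm_iris_input, copied from the table above:
# (id, sepal_length, sepal_width, petal_length, petal_width)
ROWS = [
    (1, 5.1, 3.5, 1.4, 0.2),
    (2, 4.9, 3.0, 1.4, 0.2),
    (3, 4.7, 3.2, 1.3, 0.2),
    (4, 4.6, 3.1, 1.5, 0.2),
    (5, 5.0, 3.6, 1.4, 0.2),
    (6, 5.4, 3.9, 1.7, 0.4),
    (7, 4.6, 3.4, 1.4, 0.3),
    (8, 5.0, 3.4, 1.5, 0.2),
    (9, 4.4, 2.9, 1.4, 0.2),
    (10, 4.9, 3.1, 1.5, 0.1),
]


def split_rows(rows, holdout_every=5):
    """Hypothetical split rule: every `holdout_every`-th id goes to the test set."""
    train = [r for r in rows if r[0] % holdout_every != 0]
    test = [r for r in rows if r[0] % holdout_every == 0]
    return train, test


def log_density(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at point x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def predict_cluster(x, model):
    """Return the index of the component with the highest weighted log density."""
    scores = [math.log(w) + log_density(x, mean, var) for w, mean, var in model]
    return max(range(len(scores)), key=scores.__getitem__)


# Placeholder model: (weight, mean vector, per-dimension variance).
# These numbers are invented for illustration; a real model comes from fitting.
HYPOTHETICAL_MODEL = [
    (1 / 3, (5.0, 3.4, 1.5, 0.2), (0.1, 0.1, 0.1, 0.1)),
    (1 / 3, (5.9, 2.8, 4.3, 1.3), (0.1, 0.1, 0.1, 0.1)),
    (1 / 3, (6.6, 3.0, 5.6, 2.0), (0.1, 0.1, 0.1, 0.1)),
]

train_rows, test_rows = split_rows(ROWS)

# Map each test row id to its predicted cluster index (features exclude the id).
assignments = {row[0]: predict_cluster(row[1:], HYPOTHETICAL_MODEL) for row in test_rows}
```

With this split rule, rows 5 and 10 form the test set and the other eight rows form the training set. In the actual examples, the model parameters are produced by GMMFit from the training set rather than written by hand.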