This example uses a user-defined distance metric, implemented by the Java class com.example.MyDistance:
package com.example;

import com.asterdata.ncluster.sqlmr.data.RowView;
import com.asterdata.sqlmr.analytics.classification.knn.distance.Distance;

public class MyDistance implements Distance {

    /**
     * Calculates the distance between the test row and the training row.
     * Notes:
     * 1. Do not reverse the order of the parameters.
     * 2. The columns of trainingRowView are 'responseColumn, f1, f2, ..., fn'.
     * 3. The columns of testRowView are the same as those of TEST_TABLE.
     * 4. Column indexes in both trainingRowView and testRowView are zero-based
     *    (0 <= index && index < getColumnCount()).
     *
     * @param testRowView
     *            a point in the test data set
     * @param trainingRowView
     *            a point in the training data set; its columns are the
     *            columns in the distanceFeatures argument
     * @return the distance as a double
     */
    @Override
    public double calculate(RowView testRowView, RowView trainingRowView) {
        // Absolute difference of the integer values at column index 1 of each row.
        return Math.abs(testRowView.getIntAt(1) - trainingRowView.getIntAt(1));
    }
}
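To see how the column indexes line up, the following sketch exercises the same calculation outside the database. The interfaces and classes here (the local RowView and Distance, ArrayRow, MyDistanceSketch, DistanceDemo) are hypothetical stand-ins written only so the example compiles on its own; they are not the Aster classes, and the test row layout (an id column followed by f1) is an assumption made for illustration.

```java
// Hypothetical stand-ins for the Aster types, reduced to the methods used here.
interface RowView {
    int getColumnCount();
    int getIntAt(int index);
}

interface Distance {
    double calculate(RowView testRowView, RowView trainingRowView);
}

// Array-backed row used only for this illustration.
final class ArrayRow implements RowView {
    private final int[] values;

    ArrayRow(int... values) {
        this.values = values.clone();
    }

    @Override
    public int getColumnCount() {
        return values.length;
    }

    @Override
    public int getIntAt(int index) {
        return values[index]; // zero-based, as in the notes above
    }
}

// Same logic as MyDistance: absolute difference of the values at index 1.
final class MyDistanceSketch implements Distance {
    @Override
    public double calculate(RowView testRowView, RowView trainingRowView) {
        return Math.abs(testRowView.getIntAt(1) - trainingRowView.getIntAt(1));
    }
}

public class DistanceDemo {
    public static void main(String[] args) {
        RowView test = new ArrayRow(101, 7);   // assumed test layout: id, f1
        RowView training = new ArrayRow(1, 4); // training layout: responseColumn, f1
        double d = new MyDistanceSketch().calculate(test, training);
        System.out.println(d); // prints 3.0
    }
}
```

In the training row, index 0 holds the response column, so index 1 is the first feature. In the test row, index 1 is whatever the second column of the test table is, so the metric is only meaningful when that column holds the matching feature.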