The Apache Ignite Machine Learning component provides two versions of the widely used k-NN (k-nearest neighbors) algorithm - one for classification tasks and the other for regression tasks.
This documentation reviews k-NN as a solution for classification tasks.
The k-NN algorithm is a non-parametric method whose input consists of the k closest training examples in the feature space.
The output of k-NN classification is a class membership. An object is classified by a majority vote of its neighbors: it is assigned to the class that is most common among its k nearest neighbors.
k is a positive integer, typically small. In the special case k = 1, the object is simply assigned to the class of its single nearest neighbor.
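The majority-vote rule, including the k = 1 special case, can be sketched in plain Java as follows. This is an illustrative sketch only and does not use Ignite's API: the class name `KnnVoteSketch`, the `classify` and `euclidean` helpers, and the toy data set are all hypothetical, Euclidean distance is assumed, and ties are broken arbitrarily.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of Apache Ignite.
class KnnVoteSketch {
    /** Euclidean distance between two points of equal dimension. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Classifies the query point by majority vote among its k nearest training points. */
    static double classify(double[][] features, double[] labels, double[] query, int k) {
        // Sort training indices by their distance to the query point.
        Integer[] idx = new Integer[features.length];
        for (int i = 0; i < idx.length; i++)
            idx[i] = i;
        Arrays.sort(idx, Comparator.comparingDouble(i -> euclidean(features[i], query)));

        // Count the labels among the k closest points.
        Map<Double, Integer> votes = new HashMap<>();
        for (int i = 0; i < Math.min(k, idx.length); i++)
            votes.merge(labels[idx[i]], 1, Integer::sum);

        // Pick the label with the most votes (ties are resolved arbitrarily in this sketch).
        double best = Double.NaN;
        int bestCnt = -1;
        for (Map.Entry<Double, Integer> e : votes.entrySet()) {
            if (e.getValue() > bestCnt) {
                bestCnt = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy data: three points near (1, 1) with label 0.0, two points near (5, 5) with label 1.0.
        double[][] features = {{1.0, 1.0}, {1.2, 0.8}, {5.0, 5.0}, {5.2, 4.9}, {0.9, 1.1}};
        double[] labels = {0.0, 0.0, 1.0, 1.0, 0.0};

        System.out.println(classify(features, labels, new double[] {1.5, 1.5}, 3));
        // With k = 1 the label of the single nearest neighbor is returned.
        System.out.println(classify(features, labels, new double[] {5.1, 5.0}, 1));
    }
}
```

Sorting all training points keeps the sketch short. For a large training set, a bounded priority queue of size k would avoid the full sort.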
Presently, Ignite supports the following parameters for the k-NN classification algorithm:
k - the number of nearest neighbors
distanceMeasure - one of the distance metrics provided by the ML framework, such as Euclidean, Hamming or Manhattan
KNNStrategy - either SIMPLE or WEIGHTED (the latter enables the weighted k-NN algorithm)
dataCache - holds the training set of objects for which the class is already known
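    import java.util.Arrays;

    // Assumes `ignite`, `dataCache` and `vectorizedData` are already defined.

    // Create the trainer.
    KNNClassificationTrainer trainer = new KNNClassificationTrainer();

    // Train the model.
    KNNClassificationModel knnMdl = trainer.fit(
        ignite,
        dataCache,
        // Features: every element of the row except the last one.
        (k, v) -> Arrays.copyOfRange(v, 0, v.length - 1),
        // Label: the last element of the row.
        (k, v) -> v[v.length - 1]
    ).withK(3)
        .withDistanceMeasure(new EuclideanDistance())
        .withStrategy(KNNStrategy.SIMPLE);

    // Make a prediction.
    double prediction = knnMdl.apply(vectorizedData);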
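To make the distanceMeasure and KNNStrategy parameters more concrete, the following plain-Java sketch shows textbook definitions of the three named distances and one possible weighted vote. It is not Ignite's implementation: the class `KnnParamsSketch` and its methods are hypothetical. The inverse-distance weighting is also an assumption (a common choice for weighted k-NN), because the text above does not state which weighting the WEIGHTED strategy applies.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of Apache Ignite.
class KnnParamsSketch {
    /** Euclidean distance: square root of the sum of squared coordinate differences. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++)
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(sum);
    }

    /** Manhattan distance: sum of absolute coordinate differences. */
    static double manhattan(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++)
            sum += Math.abs(a[i] - b[i]);
        return sum;
    }

    /** Hamming distance: number of coordinates in which the two points differ. */
    static double hamming(double[] a, double[] b) {
        double cnt = 0.0;
        for (int i = 0; i < a.length; i++)
            if (a[i] != b[i])
                cnt++;
        return cnt;
    }

    /**
     * Weighted vote over the k nearest neighbors (assumed inverse-distance weights).
     * neighborLabels[i] and neighborDistances[i] describe the i-th neighbor.
     */
    static double weightedVote(double[] neighborLabels, double[] neighborDistances) {
        Map<Double, Double> scores = new HashMap<>();
        for (int i = 0; i < neighborLabels.length; i++) {
            // A small constant avoids division by zero for exact matches.
            double w = 1.0 / (neighborDistances[i] + 1e-9);
            scores.merge(neighborLabels[i], w, Double::sum);
        }

        // Return the label with the largest accumulated weight.
        double best = Double.NaN;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<Double, Double> e : scores.entrySet()) {
            if (e.getValue() > bestScore) {
                bestScore = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }
}
```

In this sketch, a single very close neighbor can outweigh several distant ones. For example, neighbor labels `{0.0, 1.0, 1.0}` at distances `{0.1, 1.0, 2.0}` yield class 0.0, whereas a simple majority vote over the same neighbors would yield 1.0.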