Apache Ignite Documentation



k-NN Regression

The Apache Ignite Machine Learning component provides two versions of the widely used k-NN (k-nearest neighbors) algorithm - one for classification tasks and the other for regression tasks.

This page covers k-NN as a solution for regression tasks.

Model description

The k-NN algorithm is a non-parametric method. To make a prediction, it uses the k training examples that are closest to the query point in the feature space. Each training example has an associated property value in numerical form (the target value).

The k-NN algorithm searches the entire training set to predict a property value for a given test sample.
The predicted property value is the average of the values of the sample's k nearest neighbors. If k is 1, the test sample is simply assigned the property value of its single nearest neighbor.
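
The averaging rule described above can be sketched in plain Java. This is a minimal illustration, not Ignite's implementation: the class name `SimpleKnnRegression`, the method names, the choice of Euclidean distance, and the toy data are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Comparator;

public class SimpleKnnRegression {
    /** Euclidean distance between two feature vectors of equal length. */
    public static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Predicts the target for the query as the mean target of its k nearest training rows. */
    public static double predict(double[][] features, double[] targets, double[] query, int k) {
        Integer[] idx = new Integer[features.length];
        for (int i = 0; i < idx.length; i++)
            idx[i] = i;

        // Sort training row indices by their distance to the query point.
        Arrays.sort(idx, Comparator.comparingDouble(i -> euclidean(features[i], query)));

        // Average the targets of the k closest rows (or of all rows if there are fewer than k).
        int n = Math.min(k, idx.length);
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += targets[idx[i]];
        return sum / n;
    }

    public static void main(String[] args) {
        // Hypothetical one-dimensional training data, made up for illustration.
        double[][] features = {{1.0}, {2.0}, {3.0}, {10.0}};
        double[] targets = {10.0, 20.0, 30.0, 100.0};

        // k = 1: the query takes the value of its single nearest neighbor.
        System.out.println(predict(features, targets, new double[] {2.1}, 1)); // 20.0
        // k = 3: the mean of the three nearest targets (20, 30, 10).
        System.out.println(predict(features, targets, new double[] {2.1}, 3)); // 20.0
    }
}
```

Sorting every training row for each query keeps the sketch short; it says nothing about how a distributed implementation would organize the search.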

Ignite currently supports the following parameters for the k-NN regression algorithm:

  • k - the number of nearest neighbors
  • distanceMeasure - one of the distance metrics provided by the ML framework, such as Euclidean, Hamming, or Manhattan
  • KNNStrategy - either SIMPLE or WEIGHTED (the latter enables the weighted k-NN algorithm)
  • datasetBuilder - provides access to the training set of objects for which the target value is already known
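
// Package names are as in the Ignite 2.5 ML module; verify them against your release.
import java.util.Arrays;
import org.apache.ignite.ml.knn.classification.KNNStrategy;
import org.apache.ignite.ml.knn.regression.KNNRegressionModel;
import org.apache.ignite.ml.knn.regression.KNNRegressionTrainer;
import org.apache.ignite.ml.math.distances.ManhattanDistance;

// Create the trainer.
KNNRegressionTrainer trainer = new KNNRegressionTrainer();

// Train the model. Each row v is a double[] with the target value at index 0
// and the features in the remaining positions.
KNNRegressionModel knnMdl = (KNNRegressionModel) trainer.fit(
      datasetBuilder,
      (k, v) -> Arrays.copyOfRange(v, 1, v.length), // feature extractor
      (k, v) -> v[0])                               // label (target) extractor
  .withK(5)
  .withDistanceMeasure(new ManhattanDistance())
  .withStrategy(KNNStrategy.WEIGHTED);

// Make a prediction for a feature vector.
double prediction = knnMdl.apply(vectorizedData);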
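
To make the distanceMeasure and KNNStrategy parameters more concrete, here is a hedged plain-Java sketch of a weighted prediction with a pluggable distance function. It does not use or reproduce Ignite's code. The class `WeightedKnnRegression` and its methods are hypothetical, and the inverse-distance weighting shown is only one common scheme, assumed here because this page does not state which weighting Ignite applies.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.function.ToDoubleBiFunction;

public class WeightedKnnRegression {
    /** Manhattan (L1) distance: sum of absolute coordinate differences. */
    public static double manhattan(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++)
            sum += Math.abs(a[i] - b[i]);
        return sum;
    }

    /** Hamming distance: number of coordinates in which the two vectors differ. */
    public static double hamming(double[] a, double[] b) {
        int diff = 0;
        for (int i = 0; i < a.length; i++)
            if (a[i] != b[i])
                diff++;
        return diff;
    }

    /** Inverse-distance-weighted mean (an assumed scheme) of the k nearest targets. */
    public static double predictWeighted(double[][] features, double[] targets, double[] query, int k,
        ToDoubleBiFunction<double[], double[]> distance) {
        Integer[] idx = new Integer[features.length];
        for (int i = 0; i < idx.length; i++)
            idx[i] = i;

        // Order training rows by the chosen distance to the query point.
        Arrays.sort(idx, Comparator.comparingDouble(i -> distance.applyAsDouble(features[i], query)));

        double weightedSum = 0.0;
        double weightTotal = 0.0;
        int n = Math.min(k, idx.length);
        for (int i = 0; i < n; i++) {
            double d = distance.applyAsDouble(features[idx[i]], query);
            if (d == 0.0)
                return targets[idx[i]]; // Exact match: return its target and avoid dividing by zero.
            double w = 1.0 / d; // Closer neighbors get larger weights.
            weightedSum += w * targets[idx[i]];
            weightTotal += w;
        }
        return weightedSum / weightTotal;
    }

    public static void main(String[] args) {
        // Hypothetical data: the query is at Manhattan distance 1 from row 0 and 4 from row 1.
        double[][] features = {{0.0, 0.0}, {5.0, 0.0}};
        double[] targets = {10.0, 50.0};

        double p = predictWeighted(features, targets, new double[] {1.0, 0.0}, 2,
            WeightedKnnRegression::manhattan);
        System.out.println(p); // 18.0, i.e. (1 * 10 + 0.25 * 50) / 1.25
    }
}
```

With the simple (unweighted) rule the same query would yield 30.0, the plain mean of the two targets, so the weighting pulls the prediction toward the closer neighbor.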

Example

To see how k-NN regression can be used in practice, try the example that is available on GitHub and ships with every Apache Ignite distribution.

The training dataset is the Computer Hardware Data Set, which can be downloaded from the UCI Machine Learning Repository.
