Apache Ignite Documentation

SVM Binary Classification

Support Vector Machines (SVMs) are supervised learning models with associated learning algorithms that analyze data for classification and regression.

Given a set of training examples, each marked as belonging to one or the other of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.

The Apache Ignite Machine Learning module supports only linear SVM. For more information, see the Wikipedia article on support vector machines.

Model

An SVM model is represented by the class SVMLinearBinaryClassificationModel. It makes a prediction for a given feature vector as follows:

SVMLinearBinaryClassificationModel model = ...;

double prediction = model.predict(observation);

Currently, Ignite supports the following parameters for SVMLinearBinaryClassificationModel:

  • isKeepingRawLabels - controls the output format: -1 and +1 labels when false, raw distances from the separating hyperplane when true (default value: false)
  • threshold - the observation is assigned the +1 label if its raw value is greater than this threshold (default value: 0.0)

SVMLinearBinaryClassificationModel model = ...;

double prediction = model
  .withRawLabels(true)
  .withThreshold(5)
  .predict(observation);
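
The effect of the two parameters above can be illustrated with a minimal, self-contained sketch. It does not use the Ignite API: LinearDecisionSketch, its fields, and the sample weights are hypothetical names chosen for illustration, and the assumption is that the raw value is the linear score w·x + b, which is proportional to the signed distance from the separating hyperplane.

```java
/** Illustrative stand-in for a linear binary classifier; NOT the Ignite class. */
final class LinearDecisionSketch {
    private final double[] weights;      // hypothetical weight vector w
    private final double intercept;      // hypothetical bias term b
    private final boolean keepRawLabels; // mirrors the idea of isKeepingRawLabels
    private final double threshold;      // mirrors the idea of threshold

    LinearDecisionSketch(double[] weights, double intercept,
                         boolean keepRawLabels, double threshold) {
        this.weights = weights.clone();
        this.intercept = intercept;
        this.keepRawLabels = keepRawLabels;
        this.threshold = threshold;
    }

    /** Linear score w·x + b (assumed meaning of the "raw value"). */
    double rawScore(double[] x) {
        double score = intercept;
        for (int i = 0; i < weights.length; i++)
            score += weights[i] * x[i];
        return score;
    }

    /** Returns the raw score, or a -1/+1 label obtained by thresholding it. */
    double predict(double[] x) {
        double raw = rawScore(x);
        if (keepRawLabels)
            return raw;
        return raw > threshold ? 1.0 : -1.0;
    }

    public static void main(String[] args) {
        double[] w = {1.0, -2.0};
        double[] observation = {3.0, 1.0}; // raw score: 0.5 + 3.0 - 2.0 = 1.5

        System.out.println(new LinearDecisionSketch(w, 0.5, true, 0.0).predict(observation));  // 1.5
        System.out.println(new LinearDecisionSketch(w, 0.5, false, 0.0).predict(observation)); // 1.0
        System.out.println(new LinearDecisionSketch(w, 0.5, false, 5.0).predict(observation)); // -1.0
    }
}
```

In this sketch, raising the threshold to 5 flips the label of the same observation from +1 to -1, because its raw score of 1.5 no longer exceeds the threshold.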

Trainer

The trainer is the base class for soft-margin linear SVM classification. It is based on the communication-efficient distributed dual coordinate ascent algorithm (CoCoA) with a hinge-loss function. The trainer takes a labeled dataset with -1 and +1 labels for the two classes and performs binary classification.

The paper describing this algorithm is available at https://arxiv.org/abs/1409.1458.
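
For orientation, the soft-margin objective with hinge loss is commonly written as below. This is the standard textbook form used in the dual coordinate ascent literature, given here as an assumption; this page does not confirm the exact scaling of the terms inside Ignite. Here $x_i$ are feature vectors, $y_i \in \{-1, +1\}$ are labels, $n$ is the number of training examples, and $\lambda$ is the regularization parameter.

```latex
\min_{w} \; \frac{\lambda}{2}\,\lVert w \rVert^{2}
  + \frac{1}{n} \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i\, w^{\top} x_i\bigr)
```

A larger $\lambda$ penalizes large weights more strongly and so tolerates more margin violations, while a smaller $\lambda$ fits the training data more tightly.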

Currently, Ignite supports the following parameters for SVMLinearBinaryClassificationTrainer:

  • amountOfIterations - number of outer iterations of the SDCA algorithm (default value: 200)
  • amountOfLocIterations - number of local iterations of the SDCA algorithm (default value: 100)
  • lambda - regularization parameter (default value: 0.4)

// Set up the trainer
SVMLinearBinaryClassificationTrainer trainer = new SVMLinearBinaryClassificationTrainer()
  .withAmountOfIterations(AMOUNT_OF_ITERATIONS)
  .withAmountOfLocIterations(AMOUNT_OF_LOC_ITERATIONS)
  .withLambda(LAMBDA);

// Build the model
SVMLinearBinaryClassificationModel mdl = trainer.fit(
  datasetBuilder,
  featureExtractor,
  labelExtractor
);
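
To give a feel for what a single SDCA pass does, here is a minimal single-machine sketch of stochastic dual coordinate ascent for hinge loss. It is not Ignite's implementation and does not model the distributed CoCoA aggregation across partitions. The class SdcaHingeSketch, its method names, the toy dataset, and the absence of a bias term are all assumptions made for illustration. The update uses the well-known closed-form step for hinge loss from the SDCA literature.

```java
import java.util.Random;

/** Single-node SDCA sketch for a linear SVM with hinge loss; NOT the Ignite trainer. */
final class SdcaHingeSketch {
    /**
     * Trains a weight vector w (no bias term) on labels in {-1, +1}.
     * Each epoch performs n randomly chosen coordinate updates.
     */
    static double[] train(double[][] x, double[] y, double lambda, int epochs, long seed) {
        int n = x.length;
        int d = x[0].length;
        double[] w = new double[d];     // primal weights, kept equal to sum(alpha_i * x_i) / (lambda * n)
        double[] alpha = new double[n]; // dual variables, with alpha_i * y_i kept in [0, 1]
        Random rnd = new Random(seed);

        for (int e = 0; e < epochs; e++) {
            for (int step = 0; step < n; step++) {
                int i = rnd.nextInt(n);
                double sqNorm = dot(x[i], x[i]);
                if (sqNorm == 0.0)
                    continue; // a zero vector carries no information for w

                double margin = y[i] * dot(w, x[i]);
                // Closed-form maximizer of the dual along coordinate i, clipped to [0, 1].
                double candidate = (1.0 - margin) * lambda * n / sqNorm + alpha[i] * y[i];
                double delta = y[i] * Math.max(0.0, Math.min(1.0, candidate)) - alpha[i];

                alpha[i] += delta;
                for (int j = 0; j < d; j++)
                    w[j] += delta * x[i][j] / (lambda * n);
            }
        }
        return w;
    }

    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++)
            s += a[i] * b[i];
        return s;
    }

    /** Returns +1 if w·x is positive, otherwise -1. */
    static double predict(double[] w, double[] x) {
        return dot(w, x) > 0.0 ? 1.0 : -1.0;
    }

    public static void main(String[] args) {
        // Hypothetical toy data: linearly separable through the origin.
        double[][] x = {{2, 2}, {3, 3}, {2, 3}, {-2, -2}, {-3, -1}, {-1, -3}};
        double[] y = {1, 1, 1, -1, -1, -1};

        double[] w = train(x, y, 0.4, 100, 42L);
        for (int i = 0; i < x.length; i++)
            System.out.println("label=" + y[i] + " predicted=" + predict(w, x[i]));
    }
}
```

In terms of the trainer parameters above, the inner loop loosely corresponds to local SDCA iterations and lambda plays the same regularizing role; how Ignite combines local results across outer iterations is not shown here.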

Example

To see how the SVM linear classifier can be used in practice, try this example, which is available on GitHub and delivered with every Apache Ignite distribution.

The training dataset is a subset of the Iris dataset: the classes with labels 1 and 2, which together form a linearly separable two-class dataset. The Iris dataset can be loaded from the UCI Machine Learning Repository.
