Apache Ignite Documentation

GridGain Developer Hub - Apache Ignite™

Welcome to the Apache Ignite developer hub run by GridGain. Here you'll find comprehensive guides and documentation to help you start working with Apache Ignite as quickly as possible, as well as support if you get stuck.

 

GridGain also provides the Community Edition, a distribution of Apache Ignite made available by GridGain. It is the fastest and easiest way to get started with Apache Ignite. The Community Edition is generally more stable than the Apache Ignite release available from the Apache Ignite website and may contain extra bug fixes and features that have not yet made it into the release on the Apache website.

 


Machine Learning

Overview

The Apache Ignite 2.0 release introduced the first version of ML Grid, Ignite's own distributed Machine Learning (ML) library.

The rationale for building ML Grid is quite simple. Many users employ Ignite as the central high-performance storage and processing system for various data sets. To perform ML or Deep Learning (DL) on these data sets (i.e., model training or inference), they first had to ETL the data into another system such as Apache Mahout or Apache Spark.

That resulted in two significant drawbacks:

  • It introduced a costly ETL step that was extremely time-consuming and meant that ML/DL could only be run on outdated data sets.
  • It bypassed Ignite's core co-located distributed processing, resulting in much slower overall processing.

ML Grid is the first step in resolving these two issues. It will allow users to run ML/DL training and inference directly on the data stored in the Ignite Data Grid, and will provide ML and DL algorithms specifically optimized for Ignite's co-located distributed processing, resulting in high-performance ML/DL on live, up-to-date data.

The roadmap for ML Grid starts with a core algebra implementation based on Ignite's co-located distributed processing; its initial version was released with Ignite 2.0. Future releases will introduce custom DSLs for Python, R and Scala; a growing collection of optimized ML algorithms such as Linear and Logistic Regression, Decision Tree/Random Forest, SVM and Naive Bayes; as well as support for Ignite-optimized Neural Networks and integration with TensorFlow.

The current beta version of Apache Ignite Machine Learning Grid (ML Grid) provides a distributed machine learning library built on top of the highly optimized and scalable Apache Ignite platform. It implements local and distributed vector and matrix algebra operations as well as distributed versions of widely used algorithms.

Getting Started

The fastest way to get started with ML Grid is to build and run the examples, then study their output and code. The ML examples are located in the examples folder of the Apache Ignite distribution and are also available on GitHub.

Follow the steps below to try out the examples:

  1. Make sure you're using Java 8 or later.
  2. Download Apache Ignite version 2.0 or later.
  3. Open the examples project in an IDE such as IntelliJ IDEA or Eclipse.
  4. Activate the ml Maven profile when setting up the project.
  5. Go to the src\main\ml folder in the IDE and run an ML Grid example.

The examples do not require any special configuration. All ML Grid examples should launch, run and stop successfully without any user intervention and print meaningful output to the console. Additionally, the Tracer API example should launch a web browser and display some HTML output.
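The Java requirement from step 1 can also be checked from a terminal before opening the project. The following is a minimal sketch, not part of the Ignite distribution: the function names java_major, detect_java_version and check_java are hypothetical helpers, and the script assumes the java launcher on the PATH is the one your IDE will use.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with Apache Ignite).

# Extract the major Java version from a version string.
# Handles the legacy "1.8.0_292" scheme and the newer "11.0.2" scheme.
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;   # legacy scheme: drop the leading "1."
  esac
  # Keep everything before the first ".", "_" or "-".
  echo "${v%%[._-]*}"
}

# Read the quoted version string printed by `java -version`.
# The launcher writes this information to stderr, hence the redirect.
detect_java_version() {
  java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'
}

# Succeed only if the detected major version is 8 or later.
check_java() {
  major="$(java_major "$(detect_java_version)")"
  if [ "${major:-0}" -ge 8 ] 2>/dev/null; then
    echo "Java $major detected: OK for the ML Grid examples"
  else
    echo "Java 8 or later is required" >&2
    return 1
  fi
}
```

After sourcing the script, calling check_java reports whether the installed Java meets the requirement. The parsing is kept in a separate function so it can be exercised without a JDK installed.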

Build From Sources

The latest Apache Ignite ML Grid jar is uploaded to the Maven repository. If you need to deploy the jar in a custom environment, you can either download it from Maven or build it from scratch. To build ML Grid from sources:

  1. Download the latest Apache Ignite source release.
  2. Clean the local Maven repository (this ensures that older Maven builds don't affect the result).
  3. Make sure you're using Java 8 or later.
  4. Build and install Apache Ignite Data Fabric from the project's root directory:

     mvn clean install -DskipTests -Dmaven.javadoc.skip=true -P java8

  5. Build and install ML Grid from the project's root directory:

     mvn install -Pml -DskipTests -U -pl modules/ml -am

  6. Locate the ML Grid jar in your local Maven repository under the path {user_dir}/.m2/repository/org/apache/ignite/ignite-ml/{ignite-version}/ignite-ml-{ignite-version}.jar.
  7. If you need to build the ML Grid examples from sources, execute the commands below:

     cd examples
     mvn clean package -DskipTests -Pml
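To confirm that the build produced the ML Grid jar, the path pattern given in the steps above can be expanded for a concrete version. This is a small sketch under stated assumptions: ml_jar_path is a hypothetical helper, the default repository location is taken to be $HOME/.m2/repository, and the version 2.0.0 in the usage line is only an example value.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Apache Ignite).
# Prints the expected location of the ML Grid jar in a local Maven repository.
#   $1  Ignite version, e.g. 2.0.0 (example value)
#   $2  optional repository root; defaults to $HOME/.m2/repository
ml_jar_path() {
  version="$1"
  repo="${2:-$HOME/.m2/repository}"
  echo "$repo/org/apache/ignite/ignite-ml/$version/ignite-ml-$version.jar"
}
```

For example, ls -l "$(ml_jar_path 2.0.0)" lists the jar if the install step succeeded. If your Maven settings point the local repository somewhere else, pass that directory as the second argument.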

If needed, refer to DEVNOTES.txt in the project's root folder and README files in the ignite-ml component for more details.
