Apache Ignite Documentation

Deep Learning With TensorFlow

TensorFlow is an open source software library for high-performance numerical computation, used mostly for deep learning and other computationally intensive machine learning tasks. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs). It was originally developed by researchers and engineers from the Google Brain team within Google’s AI organization. It comes with strong support for machine learning and deep learning, and its flexible numerical computation core is used across many other scientific domains.

Apache Ignite is a memory-centric distributed database, caching, and processing platform for transactional, analytical, and streaming workloads delivering in-memory speeds at petabyte scale.

Together, TensorFlow and Apache Ignite provide the full toolset needed to work with operational and historical data, perform data analysis, and build complex mathematical models based on neural networks.

Ignite Dataset

Ignite Dataset is an integration between Apache Ignite and TensorFlow that lets Apache Ignite serve as a data source for neural network training, inference, and all other computations supported by TensorFlow. Its advantages include:

  • TensorFlow obtains fast access to a distributed database that can contain training data and data for inference.
  • Objects fed by Ignite Dataset can have any structure, thus all preprocessing can be done in the TensorFlow pipeline.
  • SSL, Windows and distributed training are also supported.

Ignite Dataset currently ships as part of TensorFlow, so you can use it out of the box without installing any third-party packages. The integration is built on tf.data on the TensorFlow side and on the Binary Client Protocol on the Apache Ignite side.
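
To give a feel for the Apache Ignite side of this integration, here is a minimal sketch of the handshake that opens a Binary Client Protocol session. It assumes protocol version 1.0.0 and the little-endian layout given in the Ignite thin client documentation (payload length, handshake code 1, three version shorts, client code 2). The function names build_handshake and handshake_ok are illustrative only and are not part of TensorFlow or Apache Ignite.

```python
import struct

HANDSHAKE_CODE = 1  # handshake operation code (assumed per the thin client docs)
CLIENT_CODE = 2     # client code for thin clients (assumed per the thin client docs)


def build_handshake(major=1, minor=0, patch=0):
    """Encode a handshake request as little-endian bytes.

    Layout: int32 payload length, then the payload itself:
    byte handshake code, three int16 version parts, byte client code.
    """
    payload = struct.pack("<bhhhb", HANDSHAKE_CODE, major, minor, patch, CLIENT_CODE)
    return struct.pack("<i", len(payload)) + payload


def handshake_ok(response):
    """Return True if the server response carries success flag 1.

    The response starts with an int32 length followed by a one-byte flag.
    """
    (_length,) = struct.unpack_from("<i", response, 0)
    (success,) = struct.unpack_from("<b", response, 4)
    return success == 1


if __name__ == "__main__":
    # Print the 12 request bytes in hex; no network connection is made here.
    print(build_handshake().hex())
```

In a real session these bytes would be written to a TCP socket connected to an Ignite node's thin client port (10800 by default), and the reply would be passed to handshake_ok. Ignite Dataset handles all of this internally, so the sketch is only meant to show what "based on the Binary Client Protocol" involves.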

IGFS plugin

In addition to its database functionality, Apache Ignite provides a distributed file system called IGFS. IGFS delivers functionality similar to Hadoop HDFS, but it operates only in memory.

The integration is based on a custom filesystem plugin on the TensorFlow side and the IGFS Native API on the Apache Ignite side. It has many uses, for example:

  • Checkpoints of model state can be saved to IGFS for reliability and fault tolerance.
  • Training processes communicate with TensorBoard by writing event files to a directory that TensorBoard watches. IGFS allows this communication to work even when TensorBoard runs in a different process or on a different machine.
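
The directory-based communication pattern described above can be sketched in plain Python. This is a simplified illustration, not TensorBoard's actual event-file format: it writes one small text file per step, and a local temporary directory stands in for a shared IGFS-backed path. The names write_event and poll_new_events are hypothetical.

```python
import os
import tempfile


def write_event(logdir, step, payload):
    """Training side: publish one event file for the given step."""
    path = os.path.join(logdir, "events.step-%06d" % step)
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(payload)
    # Rename into place so a watcher never reads a half-written file.
    os.replace(tmp, path)
    return path


def poll_new_events(logdir, seen):
    """Watcher side: return (name, content) for files not seen before."""
    new = []
    for name in sorted(os.listdir(logdir)):
        if name.endswith(".tmp") or name in seen:
            continue
        with open(os.path.join(logdir, name)) as f:
            new.append((name, f.read()))
        seen.add(name)
    return new


if __name__ == "__main__":
    logdir = tempfile.mkdtemp()  # stand-in for a directory on a shared filesystem
    seen = set()
    write_event(logdir, 1, "loss=0.9")
    write_event(logdir, 2, "loss=0.7")
    for name, content in poll_new_events(logdir, seen):
        print(name, content)
```

The writer and the watcher share nothing except the directory, so any filesystem visible to both sides can carry the communication. That is the role IGFS plays when the training process and TensorBoard run on different machines.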

IGFS plugin state

At present, the IGFS plugin is not part of TensorFlow. To track its current state, please follow this pull request.