Apache Ignite

GridGain Developer Hub - Apache Ignitetm

Welcome to the Apache Ignite developer hub run by GridGain. Here you'll find comprehensive guides and documentation to help you start working with Apache Ignite as quickly as possible, as well as support if you get stuck.

 

GridGain also provides Community Edition which is a distribution of Apache Ignite made available by GridGain. It is the fastest and easiest way to get started with Apache Ignite. The Community Edition is generally more stable than the Apache Ignite release available from the Apache Ignite website and may contain extra bug fixes and features that have not made it yet into the release on the Apache website.

 

Let's jump right in!

 

Documentation     Ask a Question     Download

 

Javadoc     Scaladoc     Examples

Fault Tolerance

Automatically fail-over jobs to other nodes in case of a crash.

Overview

Ignite supports automatic job failover. In case of a node crash, jobs are automatically transferred to other available nodes for re-execution. However, in Ignite you can also treat any job result as a failure as well. The worker node can still be alive, but it may be running low on CPU, I/O, disk space, etc. There are many conditions that may result in a failure within your application and you can trigger a failover. Moreover, you have the ability to choose the node a job should be failed over to, as it could be different for different applications or different computations within the same application.

The FailoverSpi is responsible for handling the selection of a new node for the execution of a failed job. FailoverSpi inspects the failed job and the list of all available grid nodes on which the job execution can be retried. It ensures that the job is not re-mapped to the same node it had failed on. Failover is triggered when the method ComputeTask.result(...) returns the ComputeJobResultPolicy.FAILOVER policy. Ignite comes with a number of built-in customizable Failover SPI implementations.

At Least Once Guarantee

As long as there is at least one node standing, no job will ever be lost.

By default, Ignite will failover all jobs from stopped or crashed nodes automatically. For custom failover behavior, you should implement ComputeTask.result() method. The example below triggers failover whenever a job throws any IgniteException (or its subclasses):

public class MyComputeTask extends ComputeTaskSplitAdapter<String, String> {
    ...
      
    @Override 
    public ComputeJobResultPolicy result(ComputeJobResult res, List<ComputeJobResult> rcvd) {
        IgniteException err = res.getException();
     
        if (err != null)
            return ComputeJobResultPolicy.FAILOVER;
    
        // If there is no exception, wait for all job results.
        return ComputeJobResultPolicy.WAIT;
    }
  
    ...
}

Closure Failover

Closure failover is by default governed by ComputeTaskAdapter, which is triggered if a remote node either crashes or rejects closure execution. This default behavior may be overridden by using IgniteCompute.withNoFailover() method, which creates an instance of IgniteCompute with a no-failover flag set on it . Here is an example:

IgniteCompute compute = ignite.compute().withNoFailover();

compute.apply(() -> {
    // Do something
    ...
}, "Some argument");

AlwaysFailOverSpi

Ignite splits a task into jobs and assigns them to multiple nodes for faster processing. In case of a node failure, AlwaysFailoverSpi always reroutes a failed job to another node. First, an attempt will be made to reroute the failed job to a node that has not executed any other job from the same task. If no such node is available, then an attempt will be made to reroute the failed job to one of the nodes that may be running other jobs from the same task. If none of the above attempts succeed, then the job will not be failed over and null will be returned.

The following configuration parameters can be used to configure AlwaysFailoverSpi.

Setter Method
Description
Default

setMaximumFailoverAttempts(int)

Sets the maximum number of attempts to fail-over a failed job to other nodes.

5

<bean id="grid.custom.cfg" class="org.apache.ignite.IgniteConfiguration" singleton="true">
  ...
  <bean class="org.apache.ignite.spi.failover.always.AlwaysFailoverSpi">
    <property name="maximumFailoverAttempts" value="5"/>
  </bean>
  ...
</bean>
AlwaysFailoverSpi failSpi = new AlwaysFailoverSpi();
 
IgniteConfiguration cfg = new IgniteConfiguration();
 
// Override maximum failover attempts.
failSpi.setMaximumFailoverAttempts(5);
 
// Override the default failover SPI.
cfg.setFailoverSpi(failSpi);
 
// Start Ignite node.
Ignition.start(cfg);

Fault Tolerance

Automatically fail-over jobs to other nodes in case of a crash.