
Baseline Topology

Overview

If Ignite persistence is enabled, Ignite enforces the baseline topology concept which represents a set of server nodes in the cluster that will persist data on disk.

When a cluster with Ignite persistence enabled is started for the first time, it is considered inactive, and all CRUD operations are disallowed. For example, if you try to execute a SQL or key-value operation, an exception will be thrown, as the sketch below illustrates:
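
A minimal sketch of what such a failure looks like in code, assuming a freshly started, not-yet-activated cluster (the exact exception message varies between Ignite versions):

// Connect to the inactive cluster.
Ignite ignite = Ignition.start();

try {
    // Any data-related operation fails while the cluster is inactive.
    IgniteCache<Integer, String> cache = ignite.getOrCreateCache("myCache");

    cache.put(1, "value");
}
catch (IgniteException e) {
    // e.g. "Can not perform the operation because the cluster is inactive..."
    System.err.println(e.getMessage());
}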

This is done to avoid performance, usability, and consistency issues that could occur if applications started modifying data while the cluster was restarting and some of the data was persisted on nodes that had not yet rejoined. Therefore, you must define the baseline topology for a cluster with Ignite persistence enabled; after that, the topology can be maintained either manually or automatically by Ignite. Let's dive into the details of this concept and see how to work with it.

What is Baseline Topology

A baseline topology is a set of Ignite server nodes intended to store data both in memory and in Ignite persistence. The nodes from the baseline topology are not limited in terms of functionality, and behave as regular server nodes that act as containers for data and computations in Ignite.

Moreover, the cluster can include nodes that are not part of the baseline topology, such as:

  • Server nodes that either store data in memory only or persist it to a 3rd party database like an RDBMS or NoSQL store.
  • Client nodes, which are never included in the baseline topology because they don't store shared data.

The goals of the baseline topology are to:

  • Avoid unnecessary data rebalancing when a node is rebooted. Without the baseline topology, every node restart would trigger two rebalancing events: the first when the node goes down and the second when the node is added back to the cluster. This is an inefficient use of cluster resources.
  • Activate a cluster automatically once all the nodes of the baseline topology have joined after a cluster restart.

Baseline Topology Deep-Dive

Refer to this Ignite wiki page if you want to learn more details about the baseline topology and automatic cluster activation.

Note that the baseline topology is not set when a cluster with Ignite persistence enabled is started for the first time; that's the only time manual intervention is needed. Once all the nodes that should belong to the baseline topology are up and running, you can set the topology from code or from the command line tool, and let Ignite handle the automatic cluster activation routines going forward.

The same tools and APIs can be used to adjust the baseline topology throughout the cluster lifetime. This is required if you decide to scale an existing topology out or in by designating more or fewer nodes to store the data. The sections below show how to use the APIs and tools.

Setting Baseline Topology

Baseline topology can only be set on an active cluster. You must first activate the cluster, and then set the baseline topology.

Activating the Cluster

To enable automatic activation for a cluster with Ignite persistence, you need to manually activate the cluster the first time. This can be done in four ways (via code, the command line, the REST API, or 3rd party tools), as explained in the sections below.

When the cluster is activated for the first time, the baseline topology is automatically established from the current set of server nodes. Once this is done, information about the nodes that constitute the baseline topology is persisted to disk. Then, even if you shut down and restart the cluster, the cluster will be activated automatically once all the nodes set in the baseline topology are up and running.

Java:

// Connect to the cluster.
Ignite ignite = Ignition.start();

// Activate the cluster. Automatic topology initialization occurs
// only if you manually activate the cluster for the very first time.
ignite.cluster().active(true);

Unix:

## Run this command from your $IGNITE_HOME/bin folder
bin/control.sh --activate

Windows:

## Run this command from your $IGNITE_HOME/bin folder
bin\control.bat --activate

REST:

## Replace [host] and [port] with actual values.
https://[host]:[port]/ignite?cmd=activate

For more information on how to activate/deactivate a cluster using the REST API, refer to this documentation.
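
For example, a minimal sketch using curl, assuming the ignite-rest-http module is enabled and the REST server listens on the default port 8080 (cmd=currentstate reports whether the cluster is active):

## Activate the cluster.
curl "http://localhost:8080/ignite?cmd=activate"

## Check whether the cluster is currently active.
curl "http://localhost:8080/ignite?cmd=currentstate"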

Setting the Topology From Code

As explained above, the baseline topology is automatically initialized when you activate the cluster manually. Use the IgniteCluster.active(true) method to activate the cluster from code. Then, you can use the IgniteCluster.setBaselineTopology() method to adjust an existing baseline topology. Note that the cluster must be active for this method to be called.

Setting the baseline from a collection of nodes:

// Connect to the cluster.
Ignite ignite = Ignition.start();

// Activate the cluster.
// This is required only if the cluster is still inactive.
ignite.cluster().active(true);

// Get all server nodes that are already up and running.
Collection<ClusterNode> nodes = ignite.cluster().forServers().nodes();

// Set the baseline topology that is represented by these nodes.
ignite.cluster().setBaselineTopology(nodes);

Setting the baseline from a topology version:

// Connect to the cluster.
Ignite ignite = Ignition.start();

// Activate the cluster.
// This is required only if the cluster is still inactive.
ignite.cluster().active(true);

// Set the baseline topology to a specific Ignite cluster topology version.
ignite.cluster().setBaselineTopology(2);

If you update the baseline topology later, let's say by adding new nodes to it, then Ignite will rebalance the data automatically across all the baseline nodes.
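
To inspect the current baseline from code, here is a minimal sketch (IgniteCluster.currentBaselineTopology() returns null if the baseline has not been set yet):

// Print the consistent IDs of the nodes in the current baseline topology.
Collection<BaselineNode> baseline = ignite.cluster().currentBaselineTopology();

if (baseline != null) {
    for (BaselineNode node : baseline)
        System.out.println("Baseline node: " + node.consistentId());
}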

Setting the Topology From Command Line

You can use the cluster activation tool to set the baseline topology as well as activate or deactivate the cluster from the command line.

Getting Node Consistent ID

The commands that define and adjust the baseline topology require a node consistent ID, which is a unique ID assigned to the node when it is first launched and reused across the node's restarts. To get the consistent IDs of the presently running nodes, run the ./control.sh --baseline command from your $IGNITE_HOME/bin folder. For example:

bin/control.sh --baseline
bin\control.bat --baseline

The output will look something like this:

Cluster state: active
Current topology version: 4

Baseline nodes:
    ConsistentID=cde228a8-da90-4b74-87c0-882d9c5729b7, STATE=ONLINE
    ConsistentID=dea120d9-3157-464f-a1ea-c8702de45c7b, STATE=ONLINE
    ConsistentID=fdf68f13-8f1c-4524-9102-ac2f5937c62c, STATE=ONLINE
--------------------------------------------------------------------------------
Number of baseline nodes: 3

Other nodes:
    ConsistentID=5d782f5e-0d47-4f42-aed9-3b7edeb527c0

The output shows the cluster state, the current topology version, and the consistent IDs of the nodes that are part of the baseline topology, as well as those of the nodes that are not.
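
You can also obtain a node's consistent ID from code. A minimal sketch, assuming a running node (IgniteConfiguration.setConsistentId() can be used to assign a predictable value instead of the auto-generated one):

// Print the consistent ID of the local node.
Ignite ignite = Ignition.start();

System.out.println("Consistent ID: " + ignite.cluster().localNode().consistentId());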

Setting the topology

To form the baseline topology from a set of nodes, use the ./control.sh --baseline set command along with a list of the nodes' consistent IDs:

bin/control.sh --baseline set consistentId1[,consistentId2,....,consistentIdN]
bin\control.bat --baseline set {consistentId1[,consistentId2,....,consistentIdN]}

Alternatively, you can use the numerical cluster topology version to set the baseline:

bin/control.sh --baseline version topologyVersion
bin\control.bat --baseline version {topologyVersion}

In the above command, replace topologyVersion with the actual topology version.
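
For example, to set the baseline to the topology version reported in the sample output above (Current topology version: 4):

bin/control.sh --baseline version 4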

Adding nodes to the topology

To add nodes to the existing baseline topology, use the ./control.sh --baseline add command that accepts a comma-separated list of consistent IDs of the nodes you would like to have in the topology:

bin/control.sh --baseline add consistentId1[,consistentId2,....,consistentIdN]
bin\control.bat --baseline add {consistentId1[,consistentId2,....,consistentIdN]}

For instance, the command below adds the node with consistent ID 5d782f5e-0d47-4f42-aed9-3b7edeb527c0 to the topology:

bin/control.sh --baseline add 5d782f5e-0d47-4f42-aed9-3b7edeb527c0
bin\control.bat --baseline add 5d782f5e-0d47-4f42-aed9-3b7edeb527c0

Removing nodes from the topology

To remove nodes from the baseline topology, use the ./control.sh --baseline remove command, which has the following syntax:

bin/control.sh --baseline remove consistentId1[,consistentId2,....,consistentIdN]
bin\control.bat --baseline remove {consistentId1[,consistentId2,....,consistentIdN]}

Note that the nodes that are planned for removal have to be stopped first. Otherwise, you will get an exception like Failed to remove nodes from baseline. The example below shows how to remove the node with the consistent ID fdf68f13-8f1c-4524-9102-ac2f5937c62c (assuming that the node has already been stopped):

bin/control.sh --baseline remove fdf68f13-8f1c-4524-9102-ac2f5937c62c
bin\control.bat --baseline remove fdf68f13-8f1c-4524-9102-ac2f5937c62c

Cluster Activation Command Line Tool

Ignite ships a control.sh|bat script, located in the $IGNITE_HOME/bin folder, that acts like a tool to activate or deactivate the cluster from the command line. This tool can also be used to set the baseline topology. The following commands can be used with control.sh|bat:

Command              Description
-------              -----------
--activate           Activate the cluster.
--deactivate         Deactivate the cluster.
--host {ip}          IP address of the cluster. The default value is 127.0.0.1.
--port {port}        Port number to connect to. The default value is 11211.
--state              Print the current cluster state.
--baseline           When used without parameters, print the cluster baseline
                     topology information. The following parameters can be used
                     with this command: add, remove, set, and version.
--baseline add       Add nodes to the baseline topology.
--baseline remove    Remove nodes from the baseline topology.
--baseline set       Set the baseline topology.
--baseline version   Set the baseline topology based on the version.

For help, use ./control.sh --help.

Activate Using 3rd Party Tools

To activate the cluster and set the baseline topology, you can also use 3rd party tools such as this one.

Usage Scenarios

One of the goals of the baseline topology is to avoid unnecessary partition reassignments and data rebalancing. This means that it becomes the responsibility of IT administrators to decide when to adjust the baseline topology (and thus trigger data rebalancing) in response to a node being added to or removed from the cluster. As described above, this can be done manually via the control.sh|bat command or with some external automation (e.g., a system that does a health check of the hosts running Ignite nodes and removes unstable or failed nodes from the baseline topology).

Rebalancing and Non-Baseline Server Nodes

The rebalancing of data stored solely in memory (or both in memory and in a 3rd party database) is triggered automatically and doesn't require any manual intervention.

Let's review several common scenarios of baseline topology administration when Ignite persistence is used.

First Cluster Startup

Scenario: a cluster is started for the first time, and there is no data preloaded there.

Steps:

  1. Start all the nodes in any way you like, e.g., via ignite.sh. At this point:
    • The cluster is inactive and can't handle any client data-related requests (key-value, SQL, scan queries, etc).
    • The baseline topology is not set.
  2. Activate the cluster by calling control.sh --activate or use another approach discussed earlier in the documentation. This specific approach does two things:
    • Adds all currently running server nodes to the baseline topology.
    • Moves the cluster to the active state so it can be interacted with.

The control.sh --activate command only sets the baseline topology during the first-time activation. If the cluster has the baseline topology already set, then this command has no effect.

After the cluster is activated, it will remain active until it is manually deactivated or until all the nodes are stopped. The cluster does not get deactivated when a single node is removed.
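
A minimal shell sketch of this scenario, assuming a hypothetical configuration file my-persistence-config.xml available on every host:

## Step 1: start a node on every host (repeat on each one).
bin/ignite.sh my-persistence-config.xml

## Step 2: once all the nodes are up, activate the cluster.
## This also sets the baseline topology on first-time activation.
bin/control.sh --activate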

Restarting a Cluster

Scenario: all cluster nodes have to be restarted (e.g., to perform a system or hardware upgrade).

Steps:

  1. Stop all cluster nodes gracefully.
  2. Do an upgrade, update, etc.
  3. Start all the nodes. Once all the cluster nodes from the baseline topology are booted, the cluster will activate automatically.

Upgrading a Cluster to a New Ignite Version When Using Persistence

Scenario: all cluster nodes have to be restarted on a new Ignite version, and the data must be loaded from Ignite persistence.

Steps:

  1. Stop all cluster nodes gracefully.
  2. Configure the storagePath, walPath, and walArchivePath properties (see the configuration sketch after this list). If these properties were explicitly configured in the Ignite version you are currently using, provide the same values in the configuration of the new Ignite version. If they were not set, use the default locations of these properties.
  3. Copy the binary_meta and marshaller folders from the {current_Ignite_version}/work directory to the {new_Ignite_version}/work directory. Alternatively, if you configured IgniteConfiguration.workDirectory with a custom path in the current Ignite version, configure the same path in the new version.
  4. Start all the nodes. Once all the cluster nodes from the baseline topology are booted, the cluster will activate automatically.
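
A minimal configuration sketch for step 2, assuming hypothetical storage locations under /opt/ignite (adjust the paths to match your previous installation):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class PersistentNodeStartup {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Keep the work directory stable across upgrades (step 3).
        cfg.setWorkDirectory("/opt/ignite/work"); // hypothetical path

        DataStorageConfiguration storageCfg = new DataStorageConfiguration();

        // Reuse the same locations that the previous Ignite version used (step 2).
        storageCfg.setStoragePath("/opt/ignite/persistence");    // hypothetical path
        storageCfg.setWalPath("/opt/ignite/wal");                // hypothetical path
        storageCfg.setWalArchivePath("/opt/ignite/wal-archive"); // hypothetical path

        // Enable Ignite persistence for the default data region.
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        cfg.setDataStorageConfiguration(storageCfg);

        Ignite ignite = Ignition.start(cfg);
    }
}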

Adding a New Node

Scenario: a new and empty node (no data in Ignite persistence) is being added.

Steps:

  1. Start the node normally. At this point:
    • The cluster remains active.
    • The new node joins the cluster, but it is not added to the baseline topology.
    • The new node can be used to store data of caches/tables that do not use Ignite persistence.
    • The new node cannot hold data of caches/tables that persist data in Ignite persistence.
    • The new node can be used from a computation standpoint (Ignite compute grid).
  2. If you want the node to hold data that will be stored in Ignite persistence, add it to the baseline topology by calling control.sh --baseline add <node's consistentId> or by other means (a Java alternative is sketched after this list). After that:
    • The baseline topology will be adjusted to include the new node.
    • The data will be rebalanced between the nodes within the baseline topology.
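
A hedged Java alternative to the control.sh call in step 2: once the new node has joined, reset the baseline to the full set of running server nodes, which adds the newcomer and triggers rebalancing.

// Reset the baseline to all currently running server nodes.
// This includes the newly joined node and triggers rebalancing
// of persistent data onto it.
ignite.cluster().setBaselineTopology(ignite.cluster().forServers().nodes());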

Restarting a Node

Scenario: a node needs to be restarted. The downtime will be short.

Steps:

  1. Stop the node.
  2. Start the node again. After that:
    • No baseline topology-related changes are required. The node preserves its consistentId across the restart, so the cluster and the baseline topology simply take the node back.
    • If during the node's downtime some data was updated, the data from modified partitions will be copied to the restarted node from the others.

If a node is restarted with a short downtime, it is safe (and more efficient) not to touch the baseline topology, thus, not triggering the rebalancing.

However, if the downtime is expected to be long enough, then it's reasonable to remove the node from the baseline topology and trigger the rebalancing to avoid possible data loss that might occur if you lose both primary and backup copies of data stored on different nodes that went down.

Removing a Baseline Node

Scenario: a baseline node has to be stopped or has experienced a critical failure, and it either needs to be removed from the cluster permanently or is expected to have a long downtime.

Steps:

  1. Stop the node (if it isn't stopped/failed already). At this point:
    • The cluster remains active.
    • The stopped node remains in the baseline topology in the disconnected state.
    • Data stored in pure in-memory caches/tables (or caches/tables with a 3rd party persistence enabled) will be rebalanced.
    • Data stored in caches/tables with Ignite persistence on won't be rebalanced. The replication factor (the number of backup copies) will remain decreased until the stopped node comes back online or is removed from the baseline topology.

Having a decreased replication factor for a long time may be dangerous. E.g., consider a cache with one backup. If one node fails, no data will be lost yet (as one copy is still online) but the rebalancing needs to be triggered as soon as possible because if another node fails before the rebalancing is complete, it may lead to data loss.

  2. Remove the node from the baseline topology by calling control.sh --baseline remove <node's consistentId> or by other means. After that:
    • The baseline topology will be changed to exclude the stopped node.
    • Caches/tables with Ignite persistence enabled will be rebalanced to satisfy the configured replication factor (i.e., the number of backups).

After a node is removed from the baseline topology, it will no longer be able to join the cluster and keep using the data it stored in Ignite persistence before the removal. By removing a node from the baseline topology, you acknowledge that the data stored on that node will no longer be usable after its restart.

Removing Non-Baseline Node and Failure Handling

Server nodes that are not added to the baseline topology don't need extra intervention from IT administrators to trigger data rebalancing. Since such nodes store data only in memory (or in memory and a 3rd party database), the rebalancing will be triggered automatically between them.

Triggering Rebalancing Programmatically

As described in the node restart and removal sections, the baseline topology may stay with a decreased replication factor (i.e., the number of backups) for a long period of time. If the downtime is expected to be long, it's reasonable to remove the node from the baseline topology and trigger rebalancing, to avoid the data loss that could occur if both the primary and backup copies of some data are lost.

You can trigger rebalancing and recover the replication factor only after the node is manually removed from the baseline topology. It's possible to automate the removal by relying on the control.sh script and monitoring tools. Alternatively, the following template can be used to trigger the rebalancing programmatically:

import java.util.Set;
import java.util.stream.Collectors;
import org.apache.ignite.Ignite;
import org.apache.ignite.cluster.BaselineNode;
import org.apache.ignite.cluster.ClusterNode;
import org.apache.ignite.events.DiscoveryEvent;
import org.apache.ignite.events.EventType;
import org.apache.ignite.internal.IgniteEx;
import org.apache.ignite.internal.processors.timeout.GridTimeoutObjectAdapter;

public class BaselineWatcher {
    /** Ignite. */
    private final IgniteEx ignite;

    /** BLT change delay millis. */
    private final long bltChangeDelayMillis;

    /**
     * @param ignite Ignite instance to watch.
     * @param bltChangeDelayMillis Delay, in milliseconds, before the baseline is changed.
     */
    public BaselineWatcher(Ignite ignite, long bltChangeDelayMillis) {
        this.ignite = (IgniteEx)ignite;
        this.bltChangeDelayMillis = bltChangeDelayMillis;
    }

    /**
     * Starts listening to discovery events and schedules a baseline reset
     * whenever the set of alive server nodes differs from the current baseline.
     */
    public void start() {
        ignite.events().localListen(event -> {
            DiscoveryEvent e = (DiscoveryEvent)event;

            // Collect the consistent IDs of all alive server nodes.
            Set<Object> aliveSrvNodes = e.topologyNodes().stream()
                .filter(n -> !n.isClient())
                .map(ClusterNode::consistentId)
                .collect(Collectors.toSet());

            // Collect the consistent IDs of the current baseline nodes.
            Set<Object> baseline = ignite.cluster().currentBaselineTopology().stream()
                .map(BaselineNode::consistentId)
                .collect(Collectors.toSet());

            final long topVer = e.topologyVersion();

            // If the alive server nodes differ from the baseline, schedule a
            // baseline reset after the configured delay. The reset is skipped
            // if the topology version changed again in the meantime.
            if (!aliveSrvNodes.equals(baseline))
                ignite.context().timeout().addTimeoutObject(new GridTimeoutObjectAdapter(bltChangeDelayMillis) {
                    @Override public void onTimeout() {
                        if (ignite.cluster().topologyVersion() == topVer)
                            ignite.cluster().setBaselineTopology(topVer);
                    }
                });

            return true;
        }, EventType.EVT_NODE_FAILED, EventType.EVT_NODE_LEFT, EventType.EVT_NODE_JOINED);
    }
}
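
A hypothetical usage example: start the watcher with a delay longer than a typical node restart, so that short restarts don't trigger a baseline change.

// Start the node and watch the topology. The 30-second delay is a
// hypothetical value; tune it to exceed your typical restart time.
Ignite ignite = Ignition.start();

new BaselineWatcher(ignite, 30_000).start();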
