Apache Ignite Documentation


Affinity Collocation

Collocate compute with data or data with data to improve performance and scalability of your application.

Collocate Data with Data

In many cases it is beneficial to collocate different cache keys if they are accessed together. Quite often your business logic requires access to more than one cache key. By collocating them, you can make sure that all keys with the same affinityKey are cached on the same processing node, hence avoiding costly network trips to fetch data from remote nodes.

For example, let's say you have Person and Company objects and you want to collocate Person objects with Company objects for which the person works.

To achieve that, the cache key used to cache Person objects should have a field annotated with @AffinityKeyMapped, which provides the value of the company key for collocation. For convenience, you can also optionally use the AffinityKey class.

When using SQL for caches and creating SQL tables, define the affinity key via the AFFINITY_KEY parameter of the CREATE TABLE command.
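For example, here is a minimal sketch of such a statement executed through the Java API (the table definition and column names are illustrative, and the statement can be run through any cache instance, here one named "default"):

// Rows are distributed by companyId rather than by the full primary key,
// so all persons of one company are stored on the same node.
IgniteCache<?, ?> cache = ignite.getOrCreateCache("default");

cache.query(new SqlFieldsQuery(
    "CREATE TABLE Person (" +
    "  personId VARCHAR," +
    "  companyId VARCHAR," +
    "  name VARCHAR," +
    "  PRIMARY KEY (personId, companyId)" +
    ") WITH \"AFFINITY_KEY=companyId\"")).getAll();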

Annotations in Scala

Note that if a Scala case class is used as a key class and one of its constructor parameters is annotated with @AffinityKeyMapped, by default the annotation will not be applied to the generated field and therefore will not be recognized by Ignite. To override this behavior, use the @field meta annotation in addition to @AffinityKeyMapped (see the Scala example below).

Java:

public class PersonKey {
    // Person ID used to identify a person.
    private String personId;
 
    // Company ID which will be used for affinity.
    @AffinityKeyMapped
    private String companyId;
    ...
}

// Instantiate person keys with the same company ID which is used as affinity key.
Object personKey1 = new PersonKey("myPersonId1", "myCompanyId");
Object personKey2 = new PersonKey("myPersonId2", "myCompanyId");
 
Person p1 = new Person(personKey1, ...);
Person p2 = new Person(personKey2, ...);
 
// Both the company and the person objects will be cached on the same node.
compCache.put("myCompanyId", new Company(...));
perCache.put(personKey1, p1);
perCache.put(personKey2, p2);

Scala:

case class PersonKey (
    // Person ID used to identify a person.
    personId: String,
 
    // Company ID which will be used for affinity.
    @(AffinityKeyMapped @field)
    companyId: String
)

// Instantiate person keys with the same company ID which is used as affinity key.
val personKey1 = PersonKey("myPersonId1", "myCompanyId");
val personKey2 = PersonKey("myPersonId2", "myCompanyId");
 
val p1 = new Person(personKey1, ...);
val p2 = new Person(personKey2, ...);
 
// Both the company and the person objects will be cached on the same node.
compCache.put("myCompanyId", Company(...));
perCache.put(personKey1, p1);
perCache.put(personKey2, p2);

Java (using the AffinityKey class):

Object personKey1 = new AffinityKey("myPersonId1", "myCompanyId");
Object personKey2 = new AffinityKey("myPersonId2", "myCompanyId");
 
Person p1 = new Person(personKey1, ...);
Person p2 = new Person(personKey2, ...);
 
// Both the company and the person objects will be cached on the same node.
compCache.put("myCompanyId", new Company(...));
perCache.put(personKey1, p1);
perCache.put(personKey2, p2);

SQL Joins

When performing SQL distributed joins over data residing in partitioned caches, you must make sure that the join-keys are collocated.
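For example, with Person and Company tables that both use companyId as the affinity key (as in the sketch above), a join on companyId only touches collocated rows. The query below is a minimal illustration; the table and column names are assumptions carried over from that sketch:

IgniteCache<?, ?> cache = ignite.cache("default");

// Tables created via CREATE TABLE live in the PUBLIC schema.
SqlFieldsQuery qry = new SqlFieldsQuery(
    "SELECT p.name, c.name " +
    "FROM Person p JOIN Company c ON p.companyId = c.companyId")
    .setSchema("PUBLIC");

// Because rows with the same companyId reside on the same node,
// the join does not need to move rows between nodes.
for (List<?> row : cache.query(qry).getAll())
    System.out.println(row);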

Collocating Compute with Data

It is also possible to route computations to the nodes where the data is cached. This concept is known as collocation of computations and data. It allows you to route whole units of work to the specific nodes that hold the required data.

To collocate compute with data you should use the IgniteCompute.affinityRun(...) and IgniteCompute.affinityCall(...) methods.

Here is how you can run your computation on the cluster node where the company and person objects from the example above are cached.

String companyId = "myCompanyId";
 
// Execute a closure on the node where the company key is cached.
ignite.compute().affinityRun("myCache", companyId, () -> {
  // Get the cache on the node where the closure is executed.
  IgniteCache<Object, Object> cache = Ignition.localIgnite().cache("myCache");

  Company company = (Company)cache.get(companyId);

  // Since we collocated persons with the company in the above example,
  // access to the person objects is local.
  Person person1 = (Person)cache.get(personKey1);
  Person person2 = (Person)cache.get(personKey2);
  ...
});

The same example using an anonymous class instead of a lambda:

final String companyId = "myCompanyId";
 
// Execute Runnable on the node where the key is cached.
ignite.compute().affinityRun("myCache", companyId, new IgniteRunnable() {
  @Override public void run() {
    // Get the cache on the node where the closure is executed.
    IgniteCache<Object, Object> cache = Ignition.localIgnite().cache("myCache");

    Company company = (Company)cache.get(companyId);

    Person person1 = (Person)cache.get(personKey1);
    Person person2 = (Person)cache.get(personKey2);
    ...
  }
});

IgniteCompute vs EntryProcessor

Both the IgniteCompute.affinityRun(...) and IgniteCache.invoke(...) methods provide the ability to collocate compute and data. The main difference is that the invoke(...) method is atomic and executes while holding a lock on a key. You should not access other keys from within the EntryProcessor logic as it may cause a deadlock.

affinityRun(...) and affinityCall(...), on the other hand, do not hold any locks. For example, you can start multiple transactions or execute cache queries from these methods without worrying about deadlocks. In this case Ignite will automatically detect that the processing is collocated and will employ a light-weight 1-Phase-Commit optimization for transactions (instead of 2-Phase-Commit).

See JCache EntryProcessor documentation for more information about the IgniteCache.invoke(...) method.
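For illustration, here is a minimal sketch of an invoke(...) call; the "counters" cache and the increment logic are assumptions, not part of the example above:

IgniteCache<String, Integer> counters = ignite.cache("counters");

// The processor is executed on the node that owns the key, while the entry is locked.
// Only this single entry should be accessed inside process(...).
counters.invoke("myCounter", new CacheEntryProcessor<String, Integer, Void>() {
  @Override public Void process(MutableEntry<String, Integer> entry, Object... args) {
    Integer val = entry.getValue();

    entry.setValue(val == null ? 1 : val + 1);

    return null;
  }
});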

Affinity Function

The "affinity" of a partition controls which grid node or nodes a partition will be cached on. AffinityFunction is a pluggable API used to determine an ideal mapping of partitions to nodes in the grid. When cluster topology changes, the partition-to-node mapping may become sub-optimal (different from an ideal distribution provided by the affinity function) until rebalancing is completed.

Ignite is shipped with RendezvousAffinityFunction which allows a bit of discrepancy in partition-to-node mapping (i.e. some nodes may be responsible for a slightly larger number of partitions than others), however, it guarantees that when topology changes, partitions are migrated only to a joined node or only from a left node. No data exchange will happen between existing nodes in a cluster.

Note that the cache affinity function does not directly map keys to nodes. Rather, it maps keys to "partitions". A partition is simply a number from a limited set (0 to 1023, with 1024 partitions by default). After the keys are mapped to their partitions (i.e. they get their partition numbers), the existing partition-to-node mapping for the current topology version is used to find the nodes. A key's partition assignment must never change over time; only the partition-to-node mapping changes as the topology changes.
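Both mappings can be inspected at runtime through the Affinity API. A brief sketch, assuming the "myCache" cache and the personKey1 key from the examples above:

Affinity<Object> aff = ignite.affinity("myCache");

// Key-to-partition mapping: deterministic and stable over time.
int part = aff.partition(personKey1);

// Partition-to-node mapping: depends on the current cluster topology.
ClusterNode primary = aff.mapPartitionToNode(part);

System.out.println("Key " + personKey1 + " -> partition " + part +
    " -> primary node " + primary.id());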

The snippets below show how to customize and set the affinity function, first via Spring XML configuration and then programmatically:

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="cacheConfiguration">
        <list>
            <!-- Creating a cache configuration. -->
            <bean class="org.apache.ignite.configuration.CacheConfiguration">
                <property name="name" value="myCache"/>

                <!-- Creating the affinity function with custom setting. -->
                <property name="affinity">
                    <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
                        <property name="excludeNeighbors" value="true"/>
                        <property name="partitions" value="2048"/>
                    </bean>
                </property>
            </bean>
        </list>
    </property>
</bean>

// Preparing Apache Ignite node configuration.
IgniteConfiguration cfg = new IgniteConfiguration();
        
// Creating a cache configuration.
CacheConfiguration cacheCfg = new CacheConfiguration("myCache");

// Creating the affinity function with custom setting.
RendezvousAffinityFunction affFunc = new RendezvousAffinityFunction();
        
affFunc.setExcludeNeighbors(true);
        
affFunc.setPartitions(2048);

// Applying the affinity function configuration.
cacheCfg.setAffinity(affFunc);
        
// Setting the cache configuration.
cfg.setCacheConfiguration(cacheCfg);

AffinityFunction is a pluggable API, so you can provide your own implementation of the function. The three main methods of the AffinityFunction API are:

  • partitions() - Gets the total number of partitions for a cache. Cannot be changed while the cluster is up.
  • partition(...) - Given a key, this method determines which partition a key belongs to. The mapping must not change over time.
  • assignPartitions(...) - This method is called every time a cluster topology changes. This method returns a partition-to-node mapping for the given cluster topology.
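Below is a minimal sketch of a custom implementation. The hash-based partitioning and round-robin assignment are purely illustrative (this is not how RendezvousAffinityFunction works), and a production implementation should also strive to minimize partition migration on topology changes:

public class SimpleAffinityFunction implements AffinityFunction {
  private static final int PARTS = 1024;

  @Override public int partitions() {
    // Total number of partitions; cannot be changed while the cluster is up.
    return PARTS;
  }

  @Override public int partition(Object key) {
    // Key-to-partition mapping; must be deterministic and stable over time.
    return Math.abs(key.hashCode() % PARTS);
  }

  @Override public List<List<ClusterNode>> assignPartitions(AffinityFunctionContext ctx) {
    // Called on every topology change; returns the partition-to-node mapping.
    List<ClusterNode> nodes = ctx.currentTopologySnapshot();
    int copies = Math.min(ctx.backups(), nodes.size() - 1) + 1;

    List<List<ClusterNode>> assignment = new ArrayList<>(PARTS);

    for (int part = 0; part < PARTS; part++) {
      List<ClusterNode> owners = new ArrayList<>(copies);

      // Primary owner plus backups, chosen round-robin over the current topology.
      for (int i = 0; i < copies; i++)
        owners.add(nodes.get((part + i) % nodes.size()));

      assignment.add(owners);
    }

    return assignment;
  }

  @Override public void removeNode(UUID nodeId) {
    // Stateless function: nothing to clean up when a node leaves.
  }

  @Override public void reset() {
    // Stateless function: nothing to reset.
  }
}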

Crash-Safe Affinity

Try to arrange partitions in a cluster in such a way that primary and backup copies are not located on the same physical machine. To ensure that they are not, set the excludeNeighbors flag of RendezvousAffinityFunction to true.

Sometimes it is also useful to have primary and backup copies of a partition on different racks. In this case, you can assign a specific attribute to each node and then use the affinityBackupFilter property of RendezvousAffinityFunction to exclude nodes on the same rack from the candidates for backup copy assignment.

Additionally, ClusterNodeAttributeAffinityBackupFilter can be used as a ready-made AffinityBackupFilter that separates primary and backup copies based on the value of a node attribute, e.g. AVAILABILITY_ZONE (such an attribute is typically populated from an environment variable). For information on how to configure it, see the ClusterNodeAttributeAffinityBackupFilter Javadoc.
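A brief configuration sketch; the AVAILABILITY_ZONE attribute is an assumption and must be set on every node, for example via IgniteConfiguration.setUserAttributes(...):

// Place backups only on nodes whose AVAILABILITY_ZONE attribute differs
// from that of the already selected copies.
RendezvousAffinityFunction affFunc = new RendezvousAffinityFunction();

affFunc.setAffinityBackupFilter(
    new ClusterNodeAttributeAffinityBackupFilter("AVAILABILITY_ZONE"));

CacheConfiguration<Object, Object> cacheCfg = new CacheConfiguration<>("myCache");

cacheCfg.setBackups(1);
cacheCfg.setAffinity(affFunc);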

Affinity Key Mapper

AffinityKeyMapper is a pluggable API responsible for getting an affinity key for a cache key. Usually the cache key itself is used for affinity; however, sometimes it is important to change the affinity of a cache key in order to collocate it with other cache keys.

The main method of AffinityKeyMapper is affinityKey(key), which returns the affinity key for a cache key. By default, Ignite will look for any field or method annotated with the @AffinityKeyMapped annotation. If such a field or method is not found, then the cache key itself is used for affinity. If such a field or method is found, then its value will be returned from the AffinityKeyMapper.affinityKey(key) method. This allows you to specify an alternate affinity key, other than the cache key itself, whenever needed.
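A minimal sketch of a custom mapper is shown below; it assumes that the PersonKey class from the example above exposes a getCompanyId() getter, which is an assumption for illustration:

public class CompanyAffinityKeyMapper implements AffinityKeyMapper {
  @Override public Object affinityKey(Object key) {
    // Collocate every person with its company by using companyId for affinity.
    if (key instanceof PersonKey)
      return ((PersonKey)key).getCompanyId();

    // Fall back to using the cache key itself.
    return key;
  }

  @Override public void reset() {
    // Stateless mapper: nothing to reset.
  }
}

The mapper can then be set on the cache via CacheConfiguration.setAffinityMapper(...).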
