Apache Ignite

GridGain Developer Hub - Apache Ignitetm

Welcome to the Apache Ignite developer hub run by GridGain. Here you'll find comprehensive guides and documentation to help you start working with Apache Ignite as quickly as possible, as well as support if you get stuck.

 

GridGain also provides Community Edition which is a distribution of Apache Ignite made available by GridGain. It is the fastest and easiest way to get started with Apache Ignite. The Community Edition is generally more stable than the Apache Ignite release available from the Apache Ignite website and may contain extra bug fixes and features that have not made it yet into the release on the Apache website.

 

Let's jump right in!

 

Documentation     Ask a Question     Download

 

Javadoc     Scaladoc     Examples

Schema and Indexes

Setting Up Schema and Indexes in Apache Ignite

Overview

Presently, Apache Ignite allows defining a schema using the annotation based or QueryEntities based approaches. Every specific schema is bound to an Ignite cache which name is used as the schema name in SQL queries by default.

Data Definition Language Support

DDL statements are planned to be supported in the nearest Apache Ignite releases. Having this feature done it will be feasible to define schemas, caches, indexes and manage them by standard SQL commands like CREATE/ALTER/DROP TABLE or CREATE/DROP INDEX.

Apache Ignite supports advanced indexing capabilities allowing you to define a single field (aka. column) or group indexes with various parameters, to manage indexes location putting them either in Java heap or off-heap spaces and so on so forth.

Indexes in Ignite are kept in a distributed fashion the same way as cache data sets. Each node that stores a specific subset of data keeps and maintains indexes corresponding to this data as well.

From this documentation page, you'll learn how to define schemas including indexes as well as queryable fields using two available approaches and how to switch between specific indexing implementations supported by data fabric.

DDL Based Configuration

Use widely used DDL commands to create and alter indexes in runtime referring to this documentation.

Annotation Based Configuration

Indexes, as well as queryable fields, can be configured from code with the usage of @QuerySqlField annotation. As shown in the example below, desired fields should be marked with this annotation.

public class Person implements Serializable {
  /** Indexed field. Will be visible for SQL engine. */
	@QuerySqlField (index = true)
  private long id;
  
  /** Queryable field. Will be visible for SQL engine. */
  @QuerySqlField
  private String name;
  
  /** Will NOT be visible for SQL engine. */
  private int age;
  
  /**
   * Indexed field sorted in descending order. 
   * Will be visible for SQL engine.
   */
  @QuerySqlField(index = true, descending = true)
  private float salary;
}
case class Person (
  /** Indexed field. Will be visible for SQL engine. */
  @(QuerySqlField @field)(index = true) id: Long,

  /** Queryable field. Will be visible for SQL engine. */
  @(QuerySqlField @field) name: String,
  
  /** Will NOT be visisble for SQL engine. */
  age: Int
  
  /**
   * Indexed field sorted in descending order. 
   * Will be visible for SQL engine.
   */
  @(QuerySqlField @field)(index = true, descending = true) salary: Float
) extends Serializable {
  ...
}

Both id and salary are indexed fields. id field will be sorted in the ascending order (default) while salary in the descending order.

If you don't want to index a field but still need to use it in a SQL query, then the field has to be annotated as well omitting the index = true parameter. Such a field is called as a queryable field. As an example, name is defined as a queryable field above.

Finally, age is neither queryable nor indexed field and won't be accessible from SQL queries in Apache Ignite.

Scala Annotations

In Scala classes, the @QuerySqlField annotation must be accompanied by the @field annotation in order for a field to be visible for Ignite, like so: @(QuerySqlField @field).

Alternatively, you can also use the @ScalarCacheQuerySqlField annotation from the ignite-scalar module which is just a type alias for the @field annotation.

Registering Indexed Types

After indexed and queryable fields are defined, they have to be registered in the SQL engine along with the object types they belong to.

To tell Ignite which types should be indexed, key-value pairs can be passed into CacheConfiguration.setIndexedTypes method as it's shown in the example below.

// Preparing configuration.
CacheConfiguration<Long, Person> ccfg = new CacheConfiguration<>();

// Registering indexed type.
ccfg.setIndexedTypes(Long.class, Person.class);

Note that this method accepts only pairs of types - one for key class and another for value class. Primitives are passed as boxed types.

Predefined Fields

In addition to all the fields marked with @QuerySqlField annotation, each table will have two special predefined fields: _key and _val, which represent links to whole key and value objects. This is useful, for instance, when one of them is of a primitive type and you want to filter out by its value. To do this, execute a query like SELECT * FROM Person WHERE _key = 100.

Since Ignite supports Binary Marshaller, there is no need to add classes of indexed types to the classpath of cluster nodes. SQL query engine is able to pick up values of indexed and queryable fields avoiding object deserialization.

Group Indexes

To set up a multi-field index that will allow accelerating queries with complex conditions, you can use @QuerySqlField.Group annotation. It is possible to put multiple @QuerySqlField.Group annotations into orderedGroups if you want a field to be a part of more than one group.

For instance, in Person class below we have field age which belongs to an indexed group named "age_salary_idx" with group order 0 and descending sort order. Also, in the same group, we have field salary with group order 3 and ascending sort order. Furthermore, field salary itself is a single column index (there is index = true parameter specified in addition to orderedGroups declaration). Group order does not have to be a particular number. It is needed just to sort fields inside of a particular group.

public class Person implements Serializable {
  /** Indexed in a group index with "salary". */
  @QuerySqlField(orderedGroups={@QuerySqlField.Group(
    name = "age_salary_idx", order = 0, descending = true)})
  private int age;

  /** Indexed separately and in a group index with "age". */
  @QuerySqlField(index = true, orderedGroups={@QuerySqlField.Group(
    name = "age_salary_idx", order = 3)})
  private double salary;
}

Note that annotating a field with @QuerySqlField.Group outside of @QuerySqlField(orderedGroups={...}) will have no effect.

QueryEntity Based Configuration

Indexes and queryable fields can also be configured with org.apache.ignite.cache.QueryEntity class which is convenient for Spring XML based configuration.

All concepts that are discussed as a part of annotation based configuration above are valid for QueryEntity based approach. Furthermore, types whose fields are configured with @QuerySqlField and are registered with CacheConfiguration.setIndexedTypes method are internally turned into query entities.

The example below shows how you can define a single field and group indexes as well as queryable fields.

<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <property name="name" value="mycache"/>
    <!-- Configure query entities -->
    <property name="queryEntities">
        <list>
            <bean class="org.apache.ignite.cache.QueryEntity">
                <!-- Setting indexed type's key class -->
                <property name="keyType" value="java.lang.Long"/>
              
                <!-- Setting indexed type's value class -->
                <property name="valueType"
                          value="org.apache.ignite.examples.Person"/>

                <!-- Defining fields that will be either indexed or queryable.
                Indexed fields are added to 'indexes' list below.-->
                <property name="fields">
                    <map>
                        <entry key="id" value="java.lang.Long"/>
                        <entry key="name" value="java.lang.String"/>
                        <entry key="salary" value="java.lang.Long "/>
                    </map>
                </property>

                <!-- Defining indexed fields.-->
                <property name="indexes">
                    <list>
                        <!-- Single field (aka. column) index -->
                        <bean class="org.apache.ignite.cache.QueryIndex">
                            <constructor-arg value="id"/>
                        </bean>
                      
                        <!-- Group index. -->
                        <bean class="org.apache.ignite.cache.QueryIndex">
                            <constructor-arg>
                                <list>
                                    <value>id</value>
                                    <value>salary</value>
                                </list>
                            </constructor-arg>
                            <constructor-arg value="SORTED"/>
                        </bean>
                    </list>
                </property>
            </bean>
        </list>
    </property>
</bean>

Indexes Tradeoffs

There are multiple things you should consider when choosing indexes for your Ignite application.

  • Indexes are not free. They consume memory, also each index needs to be updated separately, thus your cache update performance can be poorer when you have more indexes set up. On top of that, the optimizer might do more mistakes by choosing a wrong index to run a query.

It is a bad strategy to index everything!

  • Indexes are just sorted data structures. If you define an index for the fields (a,b,c) then the records will be sorted first by a, then by b and only then by c.

Example of Sorted Index

| A | B | C |
| 1 | 2 | 3 |
| 1 | 4 | 2 |
| 1 | 4 | 4 |
| 2 | 3 | 5 |
| 2 | 4 | 4 |
| 2 | 4 | 5 |

Any condition like a = 1 and b > 3 can be viewed as a bounded range, both bounds can be quickly looked up in in log(N) time, the result will be everything between.

The following conditions will be able to use the index:

  • a = ?
  • a = ? and b = ?
  • a = ? and b = ? and c = ?

Condition a = ? and c = ? is no better than a = ? from the index point of view.
Obviously half-bounded ranges like a > ? can be used as well.

  • Indexes on single fields are no better than group indexes on multiple fields starting with the same field (index on (a) is no better than (a,b,c)). Thus it is preferable to use group indexes.

Schema and Indexes

Setting Up Schema and Indexes in Apache Ignite