Apache Ignite

Schema and Indexes

Setting Up Schema and Indexes in Apache Ignite

Overview

A SQL schema is a logical database object that contains tables and their associated indexes.

Presently, Apache Ignite allows defining the schema using DDL statements, the annotation-based approach, or the QueryEntity-based approach.

Moreover, Apache Ignite supports advanced indexing capabilities: you can define single-field (aka. column) or group indexes with various parameters, choose whether indexes are kept in the Java heap or in off-heap space, and so on.

Indexes in Ignite are kept in a distributed fashion the same way as cache data sets. Each node that stores a specific subset of data keeps and maintains indexes corresponding to this data as well.

From this documentation page, you'll learn how to define schemas including indexes using the approaches available.

Schema and Tables

If tables and indexes are configured using either the annotation-based or the QueryEntity-based approach, the schema they belong to is named after the cache defined by their CacheConfiguration object. To change the schema name, use the CacheConfiguration.setSqlSchema method.

The schema name is different when tables and indexes are set up with DDL statements. In this scenario, all tables and their indexes end up in the PUBLIC schema, which is the default. Presently, there is no way to replace the PUBLIC schema with a custom one for this type of configuration. That will be addressed in upcoming Apache Ignite releases.

If tables are configured with a mix of the approaches above, make sure to set the correct schema name for the tables in SQL queries. For instance, assuming that 80% of the tables are configured with DDL, it makes sense to make PUBLIC the query's default schema using the SqlFieldsQuery.setSchema("PUBLIC") method:

IgniteCache<?, ?> cache = ignite.cache("Person");

// Creating City table.
cache.query(new SqlFieldsQuery("CREATE TABLE City " +
    "(id int primary key, name varchar, region varchar)")).getAll();

// Creating Organization table.
cache.query(new SqlFieldsQuery("CREATE TABLE Organization " +
    "(id int primary key, name varchar, cityName varchar)")).getAll();

// Joining data between City, Organization and Person tables. The latter
// was created with either annotations or QueryEntity approach.
SqlFieldsQuery qry = new SqlFieldsQuery("SELECT o.name from Organization o " +
    "inner join \"Person\".Person p on o.id = p.orgId " +
    "inner join City c on c.name = o.cityName " +
    "where p.age > 25 and c.region <> 'Texas'");

// Setting the query's default schema to PUBLIC.
// Table names from the query without the schema set will be
// resolved against PUBLIC schema.
// Person table belongs to "Person" schema (person cache) and this is why
// that schema name is set explicitly.
qry.setSchema("PUBLIC");

// Executing the query.
cache.query(qry);

DDL Based Configuration

DDL usage is described in this documentation.

Annotation Based Configuration

Indexes, as well as queryable fields, can be configured from code with the @QuerySqlField annotation. As shown in the example below, the desired fields are marked with this annotation.

import java.io.Serializable;
import org.apache.ignite.cache.query.annotations.QuerySqlField;

public class Person implements Serializable {
  /** Indexed field. Will be visible for SQL engine. */
  @QuerySqlField(index = true)
  private long id;

  /** Queryable field. Will be visible for SQL engine. */
  @QuerySqlField
  private String name;

  /** Will NOT be visible for SQL engine. */
  private int age;

  /**
   * Indexed field sorted in descending order.
   * Will be visible for SQL engine.
   */
  @QuerySqlField(index = true, descending = true)
  private float salary;
}

The equivalent Scala case class:

import org.apache.ignite.cache.query.annotations.QuerySqlField
import scala.annotation.meta.field

case class Person (
  /** Indexed field. Will be visible for SQL engine. */
  @(QuerySqlField @field)(index = true) id: Long,

  /** Queryable field. Will be visible for SQL engine. */
  @(QuerySqlField @field) name: String,

  /** Will NOT be visible for SQL engine. */
  age: Int,

  /**
   * Indexed field sorted in descending order.
   * Will be visible for SQL engine.
   */
  @(QuerySqlField @field)(index = true, descending = true) salary: Float
) extends Serializable {
  ...
}

Both id and salary are indexed fields. The id field is sorted in ascending order (the default), while salary is sorted in descending order.

If you don't want to index a field but still need to use it in a SQL query, annotate the field as well, omitting the index = true parameter. Such a field is called a queryable field. As an example, name is defined as a queryable field above.

Finally, age is neither a queryable nor an indexed field and won't be accessible from SQL queries in Apache Ignite.

Scala Annotations

In Scala classes, the @QuerySqlField annotation must be accompanied by the @field annotation in order for a field to be visible for Ignite, like so: @(QuerySqlField @field).

Alternatively, you can use the @ScalarCacheQuerySqlField annotation from the ignite-scalar module, which is a type alias for the QuerySqlField @field combination.

Registering Indexed Types

After indexed and queryable fields are defined, they have to be registered in the SQL engine along with the object types they belong to.

To tell Ignite which types should be indexed, pass pairs of key and value classes to the CacheConfiguration.setIndexedTypes method, as shown in the example below.

// Preparing configuration.
CacheConfiguration<Long, Person> ccfg = new CacheConfiguration<>();

// Registering indexed type.
ccfg.setIndexedTypes(Long.class, Person.class);

Note that this method accepts only pairs of types - one for key class and another for value class. Primitives are passed as boxed types.

Predefined Fields

In addition to all the fields marked with the @QuerySqlField annotation, each table has two special predefined fields, _key and _val, which refer to the whole key and value objects. This is useful, for instance, when one of them is of a primitive type and you want to filter by its value. To do this, execute a query like SELECT * FROM Person WHERE _key = 100.

Since Ignite supports the Binary Marshaller, there is no need to add classes of indexed types to the classpath of cluster nodes. The SQL query engine is able to read the values of indexed and queryable fields without deserializing objects.

Group Indexes

To set up a multi-field index that accelerates queries with complex conditions, use the @QuerySqlField.Group annotation. You can put multiple @QuerySqlField.Group annotations into orderedGroups if you want a field to be part of more than one group.

For instance, in the Person class below, the age field belongs to an index group named "age_salary_idx" with group order 0 and descending sort order. The salary field is in the same group with group order 3 and ascending sort order. Furthermore, salary also has a single-column index of its own (the index = true parameter is specified in addition to the orderedGroups declaration). Group order does not have to be a particular number; it is only used to order the fields within a group.

public class Person implements Serializable {
  /** Indexed in a group index with "salary". */
  @QuerySqlField(orderedGroups={@QuerySqlField.Group(
    name = "age_salary_idx", order = 0, descending = true)})
  private int age;

  /** Indexed separately and in a group index with "age". */
  @QuerySqlField(index = true, orderedGroups={@QuerySqlField.Group(
    name = "age_salary_idx", order = 3)})
  private double salary;
}

Note that annotating a field with @QuerySqlField.Group outside of @QuerySqlField(orderedGroups={...}) will have no effect.
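To make the effect of group order and sort direction more concrete, here is a minimal sketch in plain Java that models the entry order of "age_salary_idx" with a comparator. It is a toy model, not Ignite's implementation: the Row class, the GroupIndexOrderSketch class and the indexOrder method are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Toy model of the entry order inside "age_salary_idx"; not Ignite's implementation. */
class GroupIndexOrderSketch {
    /** Hypothetical stand-in for the two indexed Person fields. */
    static final class Row {
        final int age;
        final double salary;

        Row(int age, double salary) {
            this.age = age;
            this.salary = salary;
        }
    }

    /**
     * age has the lowest group order (0), so it is compared first, in descending order.
     * salary has the next group order (3), so it breaks ties, in ascending order.
     */
    static final Comparator<Row> AGE_SALARY_IDX =
        Comparator.comparingInt((Row r) -> r.age).reversed()
            .thenComparingDouble(r -> r.salary);

    /** Returns a copy of the rows in the order the modeled group index would keep them. */
    static List<Row> indexOrder(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(AGE_SALARY_IDX);
        return sorted;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row(30, 1000.0), new Row(40, 500.0), new Row(30, 800.0), new Row(40, 900.0));

        for (Row r : indexOrder(rows))
            System.out.println(r.age + " " + r.salary);
    }
}
```

Replacing the group orders 0 and 3 with, say, 1 and 2 would produce the same comparator, since only their relative order matters.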

QueryEntity Based Configuration

Indexes and queryable fields can also be configured with the org.apache.ignite.cache.QueryEntity class, which is convenient for Spring XML based configuration.

All concepts discussed as part of the annotation based configuration above are valid for the QueryEntity based approach. Furthermore, types whose fields are configured with @QuerySqlField and registered with the CacheConfiguration.setIndexedTypes method are internally turned into query entities.

The example below shows how to define single-field and group indexes as well as queryable fields.

<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <property name="name" value="mycache"/>
    <!-- Configure query entities -->
    <property name="queryEntities">
        <list>
            <bean class="org.apache.ignite.cache.QueryEntity">
                <!-- Setting indexed type's key class -->
                <property name="keyType" value="java.lang.Long"/>
              
                <!-- Setting indexed type's value class -->
                <property name="valueType"
                          value="org.apache.ignite.examples.Person"/>

                <!-- Defining fields that will be either indexed or queryable.
                Indexed fields are added to 'indexes' list below.-->
                <property name="fields">
                    <map>
                        <entry key="id" value="java.lang.Long"/>
                        <entry key="name" value="java.lang.String"/>
                        <entry key="salary" value="java.lang.Long"/>
                    </map>
                </property>

                <!-- Defining indexed fields.-->
                <property name="indexes">
                    <list>
                        <!-- Single field (aka. column) index -->
                        <bean class="org.apache.ignite.cache.QueryIndex">
                            <constructor-arg value="id"/>
                        </bean>
                      
                        <!-- Group index. -->
                        <bean class="org.apache.ignite.cache.QueryIndex">
                            <constructor-arg>
                                <list>
                                    <value>id</value>
                                    <value>salary</value>
                                </list>
                            </constructor-arg>
                            <constructor-arg value="SORTED"/>
                        </bean>
                    </list>
                </property>
            </bean>
        </list>
    </property>
</bean>

Indexes Tradeoffs

There are multiple things you should consider when choosing indexes for your Ignite application.

  • Indexes are not free. They consume memory, and each index needs to be updated separately, so cache update performance can degrade as you add more indexes. On top of that, the optimizer might make more mistakes by choosing the wrong index to run a query.

It is a bad strategy to index everything!

  • Indexes are just sorted data structures. If you define an index for the fields (a,b,c) then the records will be sorted first by a, then by b and only then by c.

Example of Sorted Index

| A | B | C |
| 1 | 2 | 3 |
| 1 | 4 | 2 |
| 1 | 4 | 4 |
| 2 | 3 | 5 |
| 2 | 4 | 4 |
| 2 | 4 | 5 |

Any condition like a = 1 and b > 3 can be viewed as a bounded range. Both bounds can be looked up quickly in log(N) time, and the result is everything in between.

The following conditions will be able to use the index:

  • a = ?
  • a = ? and b = ?
  • a = ? and b = ? and c = ?

The condition a = ? and c = ? is no better than a = ? from the index point of view.
Half-bounded ranges like a > ? can use the index as well.
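The range lookup described above can be sketched in plain Java. This is a toy model under stated assumptions, not how Ignite stores indexes: rows are int arrays of the form {a, b, c}, the "index" is just a sorted list, and SortedIndexSketch with its methods are hypothetical names introduced for illustration. It shows that a = ? and b > ? maps to two binary searches, while a = ? and c = ? can only narrow the range by a and has to check c row by row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Toy model of a sorted group index on (a, b, c); not Ignite's implementation. */
class SortedIndexSketch {
    /** Orders rows by a, then by b, then by c. */
    static final Comparator<int[]> BY_A_B_C = (x, y) -> {
        for (int i = 0; i < 3; i++) {
            int cmp = Integer.compare(x[i], y[i]);
            if (cmp != 0)
                return cmp;
        }
        return 0;
    };

    /** Builds the "index": a copy of the rows sorted by (a, b, c). */
    static List<int[]> buildIndex(int[][] rows) {
        List<int[]> index = new ArrayList<>();
        for (int[] row : rows)
            index.add(row.clone());
        index.sort(BY_A_B_C);
        return index;
    }

    /** Binary search for the first position whose row is >= probe. */
    static int lowerBound(List<int[]> index, int[] probe) {
        int lo = 0, hi = index.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (BY_A_B_C.compare(index.get(mid), probe) < 0)
                lo = mid + 1;
            else
                hi = mid;
        }
        return lo;
    }

    /** "a = ? and b > ?": both bounds come from binary search. Assumes a and b are below Integer.MAX_VALUE. */
    static List<int[]> aEqualsAndBGreater(List<int[]> index, int a, int b) {
        int from = lowerBound(index, new int[] {a, b + 1, Integer.MIN_VALUE});
        int to = lowerBound(index, new int[] {a + 1, Integer.MIN_VALUE, Integer.MIN_VALUE});
        return new ArrayList<>(index.subList(from, to));
    }

    /** "a = ? and c = ?": only a narrows the range; c is checked row by row. Assumes a is below Integer.MAX_VALUE. */
    static List<int[]> aEqualsAndCEquals(List<int[]> index, int a, int c) {
        int from = lowerBound(index, new int[] {a, Integer.MIN_VALUE, Integer.MIN_VALUE});
        int to = lowerBound(index, new int[] {a + 1, Integer.MIN_VALUE, Integer.MIN_VALUE});
        List<int[]> res = new ArrayList<>();
        for (int[] row : index.subList(from, to))
            if (row[2] == c)
                res.add(row);
        return res;
    }

    public static void main(String[] args) {
        // Same rows as in the table above, given in arbitrary order.
        List<int[]> index = buildIndex(new int[][] {
            {2, 4, 5}, {1, 4, 4}, {1, 2, 3}, {2, 3, 5}, {1, 4, 2}, {2, 4, 4}});

        for (int[] row : aEqualsAndBGreater(index, 1, 3))
            System.out.println(Arrays.toString(row));
    }
}
```

With the six rows from the table, the a = 1 and b > 3 lookup selects the rows (1, 4, 2) and (1, 4, 4) without touching any row where a = 2.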

  • Indexes on single fields are no better than group indexes on multiple fields starting with the same field (index on (a) is no better than (a,b,c)). Thus it is preferable to use group indexes.
