
Capacity Planning

Techniques that can help plan and identify the minimum hardware requirements for a given deployment

Overview

Capacity planning is an integral part of preparing and designing a system. Understanding the size of the cache that will be required helps you decide how much physical memory, how many JVMs, and how many CPUs and servers are needed. In this section we discuss various techniques that can help you plan and identify the minimum hardware requirements for a given deployment.

Calculating Memory Usage

  • Calculate the primary data size: multiply the size of one entry in bytes by the total number of entries.
  • If you have backups, multiply by their number.
  • Indexes also require memory. Basic use cases will add a 30% increase.
  • Add around 20 MB per cache. This value can be reduced by explicitly setting IgniteSystemProperties.IGNITE_ATOMIC_CACHE_DELETE_HISTORY_SIZE to a value smaller than the default, as shown in the sketch after this list.
  • Add around 200-300 MB per node for internal memory, plus a reasonable amount of memory for the JVM and GC to operate efficiently.
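For instance, a minimal sketch of lowering that per-cache overhead for a programmatically started Java node (the value 50,000 is purely illustrative, not a recommendation):

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteSystemProperties;
import org.apache.ignite.Ignition;

public class SmallerDeleteHistoryNode {
    public static void main(String[] args) {
        // The property is read at node startup, so it must be set before Ignition.start().
        // 50,000 is an illustrative value; measure before lowering it in production.
        System.setProperty(IgniteSystemProperties.IGNITE_ATOMIC_CACHE_DELETE_HISTORY_SIZE, "50000");

        try (Ignite ignite = Ignition.start()) {
            // ... create caches and load data as usual ...
        }
    }
}
```

The same property can also be passed as a JVM flag (-DIGNITE_ATOMIC_CACHE_DELETE_HISTORY_SIZE=50000).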

📘

Apache Ignite typically adds around 200 bytes of overhead to each entry.

Memory Capacity Planning Example

Let's take the following scenario as an example:

👍

Example Specification

  • 2,000,000 objects
  • 1,024 bytes per object (1 KB)
  • 1 backup
  • 4 nodes
  • Total number of objects x object size x 2 (one primary and one backup copy for each object):
    2,000,000 x 1,024 x 2 = 4,096,000,000 bytes

  • Considering indexes:
    4,096,000,000 + (4,096,000,000 x 30%) = 5,078 MB

  • Approximate additional memory required by the platform:
    300 MB x 4 = 1,200 MB

  • Total size:
    5,078 + 1,200 = 6,278 MB

Hence the anticipated total memory consumption would be just over 6 GB.
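The same arithmetic expressed as a minimal Java sketch, using only the numbers from the specification above (class and variable names are illustrative):

```java
public class MemoryEstimate {
    public static void main(String[] args) {
        long entries = 2_000_000L;   // total number of objects
        long entrySize = 1_024L;     // bytes per object
        int copies = 2;              // one primary + one backup copy
        int nodes = 4;

        long dataBytes = entries * entrySize * copies;       // 4,096,000,000 bytes
        long withIndexesBytes = (long) (dataBytes * 1.30);   // + 30% for indexes
        long platformMb = 300L * nodes;                      // per-node platform overhead

        long totalMb = withIndexesBytes / (1024 * 1024) + platformMb;
        System.out.printf("Estimated total: %,d MB%n", totalMb); // ~6,278 MB
    }
}
```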

Calculating Compute Usage

Compute capacity is generally much harder to estimate without some code already in place. It is important to understand the cost of each operation that your application will perform and to multiply that cost by the number of operations expected at various times. A good starting point is the Ignite benchmarks, which detail the results of standard operations and give a rough estimate of the capacity required to deliver such performance.
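Published numbers are only a starting point; it is usually worth timing the operations your own application performs. Below is a minimal sketch that measures raw put throughput against a single local node (the cache name, payload size, and operation count are illustrative; a real benchmark should use a warmed-up, multi-node cluster and a tool such as Yardstick):

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class PutThroughputSketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            IgniteCache<Integer, byte[]> cache = ignite.getOrCreateCache("bench");
            byte[] payload = new byte[1024];   // 1 KB value, matching the sizing example above

            int ops = 100_000;
            long start = System.nanoTime();
            for (int i = 0; i < ops; i++)
                cache.put(i, payload);
            double secs = (System.nanoTime() - start) / 1e9;

            System.out.printf("%.0f puts/sec%n", ops / secs);
        }
    }
}
```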

With 32 cores over 4 large AWS instances, the following benchmarks were recorded:

  • PUT/GET: 26k/sec
  • PUT (TRANSACTIONAL): 68k/sec
  • PUT (TRANSACTIONAL - PESSIMISTIC): 20k/sec
  • PUT (TRANSACTIONAL - OPTIMISTIC): 44k/sec
  • SQL Query: 72k/sec

See more results here

Calculating Disk Space Usage

When Ignite is used with Native Persistence enabled, you need to provide enough disk space for each node to accommodate the data required for proper operation. The data includes your application data converted to Ignite's internal format plus auxiliary data such as indexes, WAL files, etc.

The total amount of the required space can be estimated as follows (for partitioned caches):

  • Your data size x 2.5-3 (this is the cluster-wide total; it will be distributed among the nodes depending on your cache configuration). If backups are enabled, the backup partitions will take as much space as the total amount of data, so multiply this value by the number of backups + 1. Divide the result by the number of nodes to get an approximate per-node value, as in the sketch after this list.
  • WAL size per node (10 segments * segment size; defaults to 640 MB).
  • WAL Archive size per node (either the configured value or 4 times the checkpointing buffer size). See the Write-Ahead Log page for details.
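A minimal Java sketch of that per-node arithmetic (all inputs are illustrative; the WAL archive term uses a configured value rather than the RAM-derived default described below):

```java
public class DiskSpaceEstimate {
    public static void main(String[] args) {
        double dataGb = 300;      // raw application data, in GB
        double overhead = 3.0;    // Ignite storage format overhead factor (2.5-3)
        int backups = 1;
        int nodes = 4;

        double clusterDataGb = dataGb * overhead * (backups + 1);
        double perNodeDataGb = clusterDataGb / nodes;

        double walGb = 10 * 64 / 1024.0;   // 10 segments x 64 MB default segment size = 640 MB
        double walArchiveGb = 8;           // illustrative configured value

        System.out.printf("Per node: %.1f GB data + %.3f GB WAL + %.1f GB WAL archive%n",
            perNodeDataGb, walGb, walArchiveGb);
    }
}
```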

The following table displays how the default maximum WAL archive size is calculated depending on the available RAM (provided that no settings are specified for the data region size, checkpointing buffer size and WAL archive size). These are the values per node.

RAM < 5GB              4 x MIN(RAM/5, 256MB)
5GB <= RAM < 40GB      RAM / 5
RAM > 40GB             8GB
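Expressed as a function, the defaults in the table above look roughly like this (a sketch; all sizes in bytes):

```java
public class DefaultWalArchiveSize {
    static final long MB = 1024L * 1024;
    static final long GB = 1024 * MB;

    /** Default maximum WAL archive size per node, per the table above. */
    static long maxWalArchive(long ramBytes) {
        if (ramBytes < 5 * GB)
            return 4 * Math.min(ramBytes / 5, 256 * MB);
        else if (ramBytes < 40 * GB)
            return ramBytes / 5;
        else
            return 8 * GB;
    }

    public static void main(String[] args) {
        System.out.println(maxWalArchive(16 * GB) / MB + " MB"); // 16 GB RAM -> ~3276 MB
    }
}
```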

Example

👍

Use the following spreadsheet to estimate how many servers with a predefined configuration you may need in your cluster.
https://docs.google.com/spreadsheets/d/e/2PACX-1vS5HpEpqqf93jtfsDKSi2dj3fpKqxF-W3-6e0wQ8407hrXeoa79jdlWkZiSrKCur_9uC4-ceFsoN_tb/pub?output=xlsx

Capacity Planning FAQ

- I have 300 GB of data in a database. Will this be the same size in Ignite?

No, the size of data on disk does not map 1-to-1 to its size in memory. As a very rough estimate, the in-memory size can be about 2.5 to 3 times the size on disk, excluding indexes and any other overhead. To get a more accurate estimate, import a sample record into Ignite to find the average object size, then multiply it by the expected number of objects, as in the sketch below.
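A sketch of that measurement approach, assuming an Ignite 2.x node with data region metrics enabled (the page size, sample size, and Person class are illustrative, and the allocated-pages accounting varies by version, so treat the result as a rough approximation):

```java
import org.apache.ignite.DataRegionMetrics;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class AverageEntrySizeSketch {
    public static void main(String[] args) {
        // Enable metrics on the default data region so allocation can be read back.
        DataStorageConfiguration storage = new DataStorageConfiguration();
        storage.getDefaultDataRegionConfiguration().setMetricsEnabled(true);

        IgniteConfiguration cfg = new IgniteConfiguration().setDataStorageConfiguration(storage);

        try (Ignite ignite = Ignition.start(cfg)) {
            IgniteCache<Integer, Person> cache = ignite.getOrCreateCache("people");

            int sample = 100_000;
            for (int i = 0; i < sample; i++)
                cache.put(i, new Person("name-" + i, i));

            for (DataRegionMetrics m : ignite.dataRegionMetrics()) {
                long allocatedBytes = m.getTotalAllocatedPages() * 4096; // default 4 KB page size
                System.out.printf("%s: ~%d bytes/entry%n", m.getName(), allocatedBytes / sample);
            }
        }
    }

    static class Person {
        String name;
        int age;
        Person(String name, int age) { this.name = name; this.age = age; }
    }
}
```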
