Apache Ignite performance and throughput depend heavily on the features and settings you use. In almost any use case, cache performance can be improved by simply tweaking the cache configuration.
Ignite has a rich event system that notifies users about various events, including cache modifications, evictions, compactions, topology changes, and many more. Since thousands of events can be generated per second, they place additional load on the system, which can lead to significant performance degradation. Therefore, it is highly recommended that you enable only the events your application logic requires. By default, event notifications are disabled.
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    ...
    <!-- Enable only some events and leave other ones disabled. -->
    <property name="includeEventTypes">
        <list>
            <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_STARTED"/>
            <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_FINISHED"/>
            <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_FAILED"/>
        </list>
    </property>
    ...
</bean>
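The same events can also be enabled programmatically through IgniteConfiguration. A minimal sketch (node startup details omitted):

// Enable only the task events that the application logic requires.
IgniteConfiguration cfg = new IgniteConfiguration();

cfg.setIncludeEventTypes(
    EventType.EVT_TASK_STARTED,
    EventType.EVT_TASK_FINISHED,
    EventType.EVT_TASK_FAILED);

Ignite ignite = Ignition.start(cfg);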
Ignite uses several internal thread pools whose size defaults to max(8, total number of cores). This default suits most use cases, since it results in fewer context switches and uses CPU caches more efficiently. However, if you expect your jobs to block on I/O or for any other reason, it may make sense to increase the size of specific pools. The example below shows how to change the size of the public and system pools:
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    ...
    <!-- Configure public thread pool. -->
    <property name="publicThreadPoolSize" value="64"/>

    <!-- Configure system thread pool. -->
    <property name="systemThreadPoolSize" value="32"/>
    ...
</bean>
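The same pools can be sized programmatically via IgniteConfiguration; a minimal sketch:

IgniteConfiguration cfg = new IgniteConfiguration();

// Public pool runs compute jobs; system pool handles internal cache messages.
cfg.setPublicThreadPoolSize(64);
cfg.setSystemThreadPoolSize(32);

Ignite ignite = Ignition.start(cfg);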
Ignite enables you to execute MapReduce computations in memory. However, most computations work on data that is cached on remote grid nodes, and loading that data from remote nodes is expensive in most cases. It is a lot cheaper to send the computation to the node where the data resides. The easiest way to do this is to use the IgniteCompute.affinityRun() method or the @CacheAffinityMapped annotation. There are other ways as well, including the Affinity.mapKeysToNodes() methods. The topic of collocated computations is covered in detail in the Affinity Collocation section, which contains proper code examples.
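As an illustration, here is a minimal sketch of sending a closure to the node that owns a given key; the cache name "myCache" and the sample key are placeholders:

Ignite ignite = Ignition.ignite();
IgniteCache<Integer, String> cache = ignite.cache("myCache");

int key = 1;

// The closure runs on the node where the key is stored,
// so localPeek() reads the value without any network access.
ignite.compute().affinityRun("myCache", key,
    () -> System.out.println("Co-located value: " + cache.localPeek(key)));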
If you need to load a lot of data into a cache, use IgniteDataStreamer to do it. The data streamer will properly batch updates before sending them to remote nodes and will control the number of parallel operations taking place on each node to avoid thrashing. It provides 10x better performance than doing a bunch of individual single-threaded updates. See the Data Loading section for more details and examples.
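A minimal sketch; the cache name "myCache" and the generated sample data are placeholders:

Ignite ignite = Ignition.ignite();

// Stream sample entries into the cache; the streamer batches updates per node.
try (IgniteDataStreamer<Integer, String> streamer = ignite.dataStreamer("myCache")) {
    // Leave overwrite disabled (the default) for maximum throughput.
    streamer.allowOverwrite(false);

    for (int i = 0; i < 1_000_000; i++)
        streamer.addData(i, "value-" + i);
} // close() flushes any remaining buffered data.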
If you can send 10 larger jobs instead of 100 smaller ones, always choose the larger jobs. This reduces the number of jobs going across the network and may significantly improve performance. Similarly, for cache entries, always prefer API methods that take collections of keys or values over calling single-entry methods one by one, as shown in the sketch below.
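For example, a single putAll() call replaces many network round trips caused by individual put() calls; the cache name and data below are placeholders:

IgniteCache<Integer, String> cache = ignite.cache("myCache");

// One batched update instead of 100 individual puts.
Map<Integer, String> batch = new HashMap<>();

for (int i = 0; i < 100; i++)
    batch.put(i, "value-" + i);

cache.putAll(batch);

// Similarly, read many keys in a single call.
Map<Integer, String> values = cache.getAll(batch.keySet());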
Refer to JVM and System Tuning for guidance on GC tuning.
When a large number of threads access the grid, as is the case in large-scale server-side applications, you may end up with a large number of open files on both client and server nodes. It is recommended that you increase the default file descriptor limits to the maximum values described below.
Misconfigured file descriptor settings will impact the stability and performance of your application. You have to set both the system-level file descriptor limit and the process-level file descriptor limit. To set the system-level limit, follow these steps as the root user:
- Modify the following line in the /etc/sysctl.conf file:
fs.file-max = 300000
- Apply the change by executing the following command:
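sysctl -p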
Verify your settings using:
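sysctl fs.file-max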
Alternatively, you may execute the following command:
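sysctl -w fs.file-max=300000

Note that a value set with sysctl -w takes effect immediately but does not persist across reboots.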
By default, Linux configures a relatively small limit on open file descriptors and max user processes (1024) per user. It is important that you run Ignite under a user account whose maximum open file descriptors (open files) and max user processes are set to appropriately high values.
A good maximum value for open file descriptors is 32768.
Use the following command to set the maximum open file descriptors and maximum user processes:
ulimit -n 32768 -u 32768
Alternatively, you may modify the following files accordingly:
/etc/security/limits.conf
- soft nofile 32768
- hard nofile 32768

/etc/security/limits.d/90-nproc.conf
- soft nproc 32768
See increase-open-files-limit for more details.