Apache Ignite Documentation

Garbage Collection Tuning

Advanced JVM tuning scenarios for high performance & stability when moving to a production deployment

Below are example JVM configurations for applications that generate large numbers of temporary objects and therefore risk long garbage collection pauses.

JVMs in a cluster should be monitored constantly and tuned once a profile has been gathered. GC tuning depends heavily on the application and on how it uses Ignite.

For JDK 1.8 we recommend the G1 garbage collector. Below is an example for a 10GB heap on a machine with 64 CPUs, with G1 turned on:

-server 
-Xms10g 
-Xmx10g
-XX:+AlwaysPreTouch
-XX:+UseG1GC
-XX:+ScavengeBeforeFullGC
-XX:+DisableExplicitGC

If you decide to use the G1 collector, use the latest version of Oracle JDK 8 or OpenJDK 8, since G1 is constantly being improved.

If G1 does not suit your case, or you are using JDK 7, the following CMS-based settings are a good starting point for JVM tuning (example for a 10GB heap on a machine with 64 CPUs):
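
The G1 options listed above still have to reach the JVM. The following is a minimal sketch, assuming the node is started with bin/ignite.sh and that the script picks up the JVM_OPTS environment variable; the function name g1_jvm_opts and its heap-size argument are illustrative and not part of Ignite.

```shell
#!/bin/sh
# Print the G1 options from the example above for a given heap size.
# The function name and its heap-size argument are illustrative, not part of Ignite.
g1_jvm_opts() {
    heap="${1:-10g}"
    echo "-server -Xms${heap} -Xmx${heap} -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC"
}

# Assumption: the node is started with bin/ignite.sh, which reads JVM_OPTS.
JVM_OPTS="$(g1_jvm_opts 10g)"
export JVM_OPTS
echo "JVM_OPTS=$JVM_OPTS"
# "$IGNITE_HOME/bin/ignite.sh"    # uncomment to start a node with these options
```

Keeping -Xms and -Xmx equal, as the example does, means the heap is sized once at startup, which fits well with -XX:+AlwaysPreTouch.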

-server
-Xms10g
-Xmx10g
-XX:+AlwaysPreTouch
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSClassUnloadingEnabled
-XX:+CMSPermGenSweepingEnabled 
-XX:+ScavengeBeforeFullGC
-XX:+CMSScavengeBeforeRemark
-XX:+DisableExplicitGC

Note that these settings might not always be ideal, so test rigorously before deploying to production.

GC Attacks By Linux

In a Linux environment, an application may face long GC pauses or lose performance because of I/O or memory starvation caused by kernel-specific settings. This section gives guidelines on modifying kernel settings to overcome long GC pauses.

All the shell commands given below were tested under RedHat 7. They may differ for your Linux distribution.

Before applying any kernel-level settings, check system statistics and logs to confirm that the problem really applies to your case.

Finally, it is advisable to consult your IT department before making changes at the Linux kernel level in production.

I/O issues

If the GC log shows “low user time, low system time, long GC pause”, one possible reason is GC threads stuck in the kernel waiting for I/O. This typically happens because of journal commits or because the file system flushes changes made by gzip during log rolling.

As a solution, you can make the kernel flush dirty pages to disk more often, lowering the interval from the default of 30 seconds to 5 seconds:

  sysctl -w vm.dirty_writeback_centisecs=500
  sysctl -w vm.dirty_expire_centisecs=500
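
Both settings are expressed in centiseconds, so 500 corresponds to 5 seconds. Before changing them, it can help to see what the system currently uses. The read-only sketch below prints the current values in both units; the helper name centisecs_to_secs is illustrative.

```shell
#!/bin/sh
# Convert a value in centiseconds to whole seconds (500 -> 5).
centisecs_to_secs() {
    echo $(( $1 / 100 ))
}

# Print the current write-back settings, if the kernel exposes them.
for name in dirty_writeback_centisecs dirty_expire_centisecs; do
    file="/proc/sys/vm/$name"
    if [ -r "$file" ]; then
        value="$(cat "$file")"
        echo "vm.$name = $value ($(centisecs_to_secs "$value") s)"
    else
        echo "vm.$name is not readable on this system"
    fi
done
```

Values set with sysctl -w last only until the next reboot. How to persist them depends on the distribution, typically through /etc/sysctl.conf or a file under /etc/sysctl.d.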

Memory issues

If the GC log shows “low user time, high system time, long GC pause”, then most likely memory pressure is triggering swapping or scanning for free memory.

  • Check and decrease the 'swappiness' setting to protect heap and anonymous memory
sysctl -w vm.swappiness=10
  • Add -XX:+AlwaysPreTouch to the JVM settings on startup
  • Turn off the NUMA zone-reclaim optimization
sysctl -w vm.zone_reclaim_mode=0
  • Turn off Transparent Huge Pages if a RedHat distribution is used (on RHEL 7 and most other distributions the directory is /sys/kernel/mm/transparent_hugepage instead)
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
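
Before applying the changes in the list above, the current state can be inspected without root privileges. The sketch below is read-only and assumes the usual /proc and /sys layout; the helper name thp_active_mode is illustrative. The kernel reports the Transparent Huge Pages mode as a line such as "always madvise [never]", where the bracketed word is the active one.

```shell
#!/bin/sh
# Extract the active mode from a THP status line such as "always madvise [never]".
thp_active_mode() {
    echo "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Report swappiness and NUMA zone reclaim, if readable.
for name in swappiness zone_reclaim_mode; do
    if [ -r "/proc/sys/vm/$name" ]; then
        echo "vm.$name = $(cat "/proc/sys/vm/$name")"
    fi
done

# The THP directory name differs between distributions, so try both.
for dir in /sys/kernel/mm/redhat_transparent_hugepage /sys/kernel/mm/transparent_hugepage; do
    if [ -r "$dir/enabled" ]; then
        echo "$dir/enabled: $(thp_active_mode "$(cat "$dir/enabled")")"
    fi
done
```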

Page Cache

When an application interacts heavily with the underlying file system, a large share of RAM can end up being used by the page cache. If the kswapd daemon cannot reclaim page cache pages in the background fast enough, the application can face high latencies caused by direct reclamation whenever it needs a new page. This affects not only the performance of the application but may also lead to long GC pauses.

To avoid long GC pauses caused by direct page reclaim, on kernels that provide the /proc/sys/vm/extra_free_kbytes setting you can use it to add extra bytes between wmark_min and wmark_low:

sysctl -w vm.extra_free_kbytes=1240000
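
The extra_free_kbytes setting may not be present on every kernel, so it is worth checking for it before trying to set it. The following is a minimal sketch under that assumption; the helper name vm_tunable_available and its optional directory argument are illustrative.

```shell
#!/bin/sh
# Print "yes" if the named vm tunable exists, "no" otherwise.
# The optional second argument (directory to look in) exists only to make the check testable.
vm_tunable_available() {
    if [ -e "${2:-/proc/sys/vm}/$1" ]; then echo yes; else echo no; fi
}

if [ "$(vm_tunable_available extra_free_kbytes)" = "yes" ]; then
    echo "vm.extra_free_kbytes = $(cat /proc/sys/vm/extra_free_kbytes)"
    # sysctl -w vm.extra_free_kbytes=1240000    # uncomment to apply (requires root)
else
    echo "vm.extra_free_kbytes is not available on this kernel"
fi
```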

For more insight into the topics discussed in this section, refer to the following slides.

Debugging Memory Usage Issues and GC Pauses

This section contains information that may be helpful when you need to debug and troubleshoot issues related to memory usage or long GC pauses.

Getting Heap Dump on Out of Memory Errors

If your JVM throws an 'OutOfMemoryError' and the JVM process needs to be restarted afterwards, you can add the following options to your JVM configuration so that a heap dump is written and the process is terminated:

-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/path/to/heapdump
-XX:OnOutOfMemoryError="kill -9 %p"
-XX:+ExitOnOutOfMemoryError

Detailed Garbage Collection stats

To capture detailed information about garbage collection and its performance, add the following parameters to the JVM configuration:

-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintGCDateStamps
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=100M
-Xloggc:/path/to/gc/logs/log.txt

For G1, it is recommended to also set the option below, which prints many ergonomic details that are purposefully kept out of the -XX:+PrintGCDetails output:

-XX:+PrintAdaptiveSizePolicy

Make sure you modify the path and file names accordingly, and use a different file name for each invocation so that multiple processes do not overwrite each other's log files.
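
One way to get a different file name for each invocation is to build the logging options in the launch script. The sketch below is an assumption about how such a script could look, not a prescribed setup; the function name gc_log_opts, the gc-<tag>.log naming scheme, and the tag format are all illustrative.

```shell
#!/bin/sh
# Print the GC logging options from above with a per-invocation log file name.
# Arguments (both illustrative): log directory and a unique tag for this process.
gc_log_opts() {
    echo "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M -Xloggc:$1/gc-$2.log"
}

# Tag the file with the host name, the start time and the launching shell's PID.
tag="$(uname -n)-$(date +%Y%m%d-%H%M%S)-$$"
gc_log_opts /path/to/gc/logs "$tag"
```

The printed string can be appended to the other JVM options when the process is started.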

FlightRecorder Settings

When you need to debug performance or memory issues, you can rely on the Java Flight Recorder tool, which continuously collects detailed, low-level runtime information and so enables after-the-fact incident analysis. To enable Flight Recorder, use the following settings:

-XX:+UnlockCommercialFeatures
-XX:+FlightRecorder
-XX:+UnlockDiagnosticVMOptions
-XX:+DebugNonSafepoints

To start recording for a particular Java process, use a command like this one:

jcmd <PID> JFR.start name=<recording_name> duration=60s filename=/var/recording/recording.jfr settings=profile
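
Besides JFR.start, jcmd also offers JFR.check to list the recordings of a process and JFR.stop to end one early. The sketch below only prints the command lines for one session so they can be reviewed before being run; the function name jfr_commands, the PID 12345 and the recording name my_recording are placeholders.

```shell
#!/bin/sh
# Print (not run) the jcmd commands for one recording session.
# Arguments: target PID, recording name, output file. All values are placeholders.
jfr_commands() {
    echo "jcmd $1 JFR.start name=$2 duration=60s filename=$3 settings=profile"
    echo "jcmd $1 JFR.check"
    echo "jcmd $1 JFR.stop name=$2"
}

jfr_commands 12345 my_recording /var/recording/recording.jfr
```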

For complete details on Java Flight Recorder, refer to the official Oracle documentation.
