Thursday, July 11, 2019

Variations in the behavior of the default Java maximum heap size

Since the last millennium, Java implementations have had system-specific behavior for things such as the maximum amount of memory your Java program can use, also known as the "maximum Java heap".

Many (most?) JVM implementations support the -Xmx flag, which allows you to specify the maximum heap size.

But if you don't set that flag, what maximum heap size do you get?

Implementations vary widely in this case.
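A quick way to see what your own JVM picked is to ask it. The sketch below is mine, not from any vendor documentation: the class name `DefaultMaxHeap` and the helper `toMiB` are made up for illustration, and the only real API it relies on is the standard `Runtime.maxMemory()`. Run it without any -Xmx flag to see the default.

```java
public class DefaultMaxHeap {
    // Converts a byte count to whole mebibytes, rounding down.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // maxMemory() reports the most memory the JVM will attempt to use;
        // it returns Long.MAX_VALUE if there is no inherent limit.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Maximum heap: " + toMiB(maxBytes) + " MiB");
    }
}
```

The number reported can be slightly below the configured ceiling, depending on the garbage collector in use, so treat it as approximate. On HotSpot-based JVMs, `java -XX:+PrintFlagsFinal -version` should also list a `MaxHeapSize` value you can compare against.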

The most commonly-used JVM implementation is the Oracle JDK, whose behavior is documented here:

In particular, see "Table 2-6 Default Maximum Heap Sizes", which shows that the actual value will

  • depend on your operating system platform,
  • on the choice of a 32-bit or 64-bit JVM,
  • and also on the amount of RAM on the machine,
  • BUT will never exceed 2GB.

However, OpenJDK behaves differently.

With some experimentation, you will find that the Default Maximum Heap Size on an OpenJDK JVM also depends on various variables, but is NOT hard-limited to 2GB, and instead appears to simply use 25% of the RAM on your machine!
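The experiment can be sketched roughly as follows: compare the JVM's reported maximum heap against the machine's physical RAM and see whether the ratio lands near 0.25. This is only a sketch under assumptions. The class name `HeapVersusRam` and the helper `heapFraction` are hypothetical, and the cast to `com.sun.management.OperatingSystemMXBean` is a JDK-specific extension that I am assuming is available (it is on HotSpot-based JVMs, but it is not part of the portable Java API).

```java
import java.lang.management.ManagementFactory;

public class HeapVersusRam {
    // Fraction of physical RAM that the maximum heap represents.
    static double heapFraction(long maxHeapBytes, long physicalRamBytes) {
        return (double) maxHeapBytes / (double) physicalRamBytes;
    }

    public static void main(String[] args) {
        // Assumption: the platform bean can be cast to the JDK-specific
        // com.sun.management interface, which exposes physical memory size.
        com.sun.management.OperatingSystemMXBean os =
            (com.sun.management.OperatingSystemMXBean)
                ManagementFactory.getOperatingSystemMXBean();

        // Deprecated in newer JDKs in favour of getTotalMemorySize(),
        // but still present, so it works across old and new releases.
        long ramBytes = os.getTotalPhysicalMemorySize();
        long maxHeapBytes = Runtime.getRuntime().maxMemory();

        long mib = 1024L * 1024L;
        System.out.printf("RAM: %d MiB, max heap: %d MiB, fraction: %.3f%n",
            ramBytes / mib, maxHeapBytes / mib,
            heapFraction(maxHeapBytes, ramBytes));
    }
}
```

Run it with no -Xmx flag. Inside a container or under other resource limits the "RAM" figure may reflect the limit rather than the host, so the ratio is a hint, not proof.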

I discovered this when I was trying to figure out why we were experiencing extreme memory pressure on a fleet of machines that I manage. It turns out that our administrators had moved us from OracleJDK to OpenJDK on those machines, and our workloads had a bunch of places where we were NOT specifying -Xmx.
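For what it's worth, here is a minimal sketch of what "specifying -Xmx" can look like when one Java program spawns another JVM. Everything named here is hypothetical: the class `ExplicitHeapLauncher`, the helper `command`, the `2g` value, and `worker.jar` are placeholders for illustration, not our actual workload.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class ExplicitHeapLauncher {
    // Builds a command line for a child JVM with an explicit heap ceiling,
    // so the child does not depend on the implementation's default.
    static List<String> command(String maxHeap, String jarPath) {
        // Reuse the java binary of the JVM that is currently running.
        String javaBin = System.getProperty("java.home")
            + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-Xmx" + maxHeap);
        cmd.add("-jar");
        cmd.add(jarPath);
        return cmd;
    }

    public static void main(String[] args) {
        // "worker.jar" is a made-up placeholder; substitute a real jar.
        List<String> cmd = command("2g", "worker.jar");
        System.out.println(String.join(" ", cmd));

        // Prepared but not started here; call pb.start() to launch the child.
        ProcessBuilder pb = new ProcessBuilder(cmd).inheritIO();
    }
}
```

With the flag set explicitly, the child's ceiling stays the same whichever JDK the machine happens to have installed.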

For those workloads, each JVM we spawned without specifying the maximum heap size had (quietly) changed from using up to 2GB of memory to using up to 25% of the system memory. Since we routinely spawned 2 such JVMs, half of our system memory was devoted to those 2 JVMs! These were quite large machines, typically with 48GB to 64GB of memory, so this was a very large change in memory usage: instead of 2 JVMs consuming up to 4GB of RAM total, the 2 JVMs were now consuming up to 24-32GB of RAM total!
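The arithmetic is simple, but spelling it out makes the jump obvious. The class and method names below (`HeapBudget`, `fixedCapTotal`, `quarterOfRamTotal`) are made up for this illustration, and the figures are the ones quoted above.

```java
public class HeapBudget {
    static final long GIB = 1024L * 1024L * 1024L;

    // Combined heap ceiling when every JVM is capped at a fixed size.
    static long fixedCapTotal(int jvmCount, long capBytes) {
        return jvmCount * capBytes;
    }

    // Combined heap ceiling when every JVM may use a quarter of RAM.
    static long quarterOfRamTotal(int jvmCount, long ramBytes) {
        return jvmCount * (ramBytes / 4);
    }

    public static void main(String[] args) {
        int jvms = 2;
        for (long ramGib : new long[] {48, 64}) {
            long before = fixedCapTotal(jvms, 2 * GIB) / GIB;
            long after = quarterOfRamTotal(jvms, ramGib * GIB) / GIB;
            System.out.println(ramGib + "GB machine: up to " + before
                + "GB before, up to " + after + "GB after");
        }
    }
}
```

These are ceilings, not guaranteed usage: a JVM only grows its heap toward the maximum if the workload allocates that much.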

Big machine or small machine, 25% is a lot! Since our machines routinely run thousands of processes, it took me far too long to notice these two processes quietly occupying half the system RAM.

The overall impact was subtle, and it took me a LONG time to track down (many months, sadly). All that really happened right away was that the workloads ran more slowly and the machines seemed to perform quite poorly; nothing pointed directly at 2 of the 1000+ processes on the machine suddenly taking half of its memory.

Unfortunately, this particular behavior does not seem to be documented anywhere (at least, I haven't found the "openjdk documentation for the -Xmx flag"), although since it is open source you can always "use the source, Luke":

I estimate that I lost, over all of 2019 so far, about 1.5 weeks of my life to this particular detail.

But, happily, I finally figured it out!
