Wednesday, July 1, 2009

The peculiar plateauing of memory sizes

I was struck by a recent interview with Gil Tene of Azul posted on Artima. In the interview they discuss the surprising observation that

the practical size of a single [application] instance hasn't changed since about 2000.

I think this is very interesting; I've seen very similar results, and wondered the same thing. Certainly their observations about the progress in server memory capacity seem true; witness this recent announcement of a machine from Cisco that supports 384 GB of physical memory on a single server.

The Artima interview suggests that one stumbling block for such large applications is that many modern applications are written for VM-based platforms, such as Java or .NET, and that many of these virtual machines don't handle very large memory sizes well:

We know that most VMs tend to work well with one or two gigabytes, and when you go above that, they don't immediately break, but you end up with a lot of tuning, and at some point you just run out of tuning options.

I don't have a lot of experience with trying to run JVMs at that size; our in-house testing systems rarely have more than 4 GB of physical memory, so I've mostly built test rigs by combining several smaller nodes, rather than exploring the scaling behaviors of a single large node.

I suspect that, in addition to the basic JVM problems, it is also true that applications aren't generally coded well to handle enormous amounts of physical memory. Many Java programmers haven't thought much about how to build in-memory data structures that can scale to stupendous sizes. When your HashMap has only a few tens of thousands of entries, you can survive the occasional inefficiency of, for example, opening an Iterator across the entire map to search for an object via something other than the hash key, but if your collections have millions of objects, your application will slow to a crawl.
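To make the HashMap point concrete, here is a minimal sketch of the two lookup styles. Everything in it is hypothetical and invented for illustration (the User type, the field names, the class name); it is not taken from any real application. The scan walks every value in the map, so its cost grows with the size of the collection, while the second map acts as a secondary index keyed on the field you actually want to search by.

```java
import java.util.HashMap;
import java.util.Map;

public class SecondaryIndexSketch {
    // Hypothetical value type, invented for this example.
    public static final class User {
        public final int id;
        public final String email;

        public User(int id, String email) {
            this.id = id;
            this.email = email;
        }
    }

    // Primary map, keyed by id.
    private final Map<Integer, User> byId = new HashMap<>();
    // Secondary index, keyed by email. Assumes emails are unique.
    private final Map<String, User> byEmail = new HashMap<>();

    public void add(User u) {
        byId.put(u.id, u);
        byEmail.put(u.email, u);
    }

    // Searches by something other than the hash key: visits entries
    // one by one, so the cost is proportional to the map size.
    public User findByEmailScan(String email) {
        for (User u : byId.values()) {
            if (u.email.equals(email)) {
                return u;
            }
        }
        return null;
    }

    // Uses the secondary index: a single hash lookup, expected
    // constant time regardless of how many users are stored.
    public User findByEmailIndexed(String email) {
        return byEmail.get(email);
    }

    public static void main(String[] args) {
        SecondaryIndexSketch store = new SecondaryIndexSketch();
        for (int i = 0; i < 1_000_000; i++) {
            store.add(new User(i, "user" + i + "@example.com"));
        }
        String target = "user999999@example.com";
        System.out.println(store.findByEmailScan(target).id);
        System.out.println(store.findByEmailIndexed(target).id);
    }
}
```

The trade-off is that the index costs extra memory and has to be kept in step with the primary map on every insert, update, and removal (this sketch only handles inserts). With a few thousand entries that bookkeeping is rarely worth it; with millions, the scan is the part that hurts.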

As a counter-example to the Artima article's claim, however, let me point to the Perforce server. Our internal Perforce server runs on a dedicated machine with 32 GB of real memory, and Perforce does an excellent job of using that memory efficiently. Over the last decade, we have upgraded our Perforce environment from an 8 GB machine, to a 16 GB machine, to the current 32 GB machine, and Perforce's server software has automatically adapted to the memory in a trouble-free fashion, using the resources effectively and efficiently.

At the time, I thought that was a pretty big machine, but of course Google has been running Perforce on a machine with 128 GB of physical memory since 2006, and they're probably on a larger machine now. VMware have been running Perforce on a machine with 256 GB of physical memory since 2008 (in VMware's case, they also give their machine solid-state disk, since they apparently couldn't give it enough physical memory).

I seem to recall reading that Google had to build their own physical hardware for their Perforce server, as at that time you couldn't buy a machine with that much memory from a standard vendor. I wonder if this is still true; I just wandered over to HP's web site to see what their systems look like, and it seems like their servers generally max out at 128 GB of physical memory. However, Sun seem to be advertising a system that can handle up to 2 terabytes of physical memory, so obviously such systems exist.

Of course, nowadays big companies don't even think about individual computers anymore :)

1 comment:

  1. We (HP) go up to 500GB on the DL700 kit, though the focus there is hosting many virtual machines.

    When you jump from 32-bit JVMs to 64-bit ones, you take a memory hit of about 50% from pointer size, though the JRockit JVM compresses pointers. Sadly, you can't D/L that JVM no more, but maybe it will come back with Java 7.