Wednesday, August 6, 2014

Memory Matters

I loved this recent essay from one of the top PostgreSQL developers: Memory Matters.


Under any circumstances, reading from disk is vastly slower than reading from memory, but reading data from disk sequentially is 10 to 100 times faster than random I/O. Unfortunately, it's often the case that the task which is evicting data from memory is writing data sequentially while the underlying database workload is typically accessing some working set of pages in a more-or-less random fashion. The result is that data is removed from the cache at a vastly higher rate than it can be read back in. Even after the bulk operation terminates and the cache-purging ceases, it can take a painfully long time - sometimes many hours - for random I/O to bring all of the hot data back into memory.

I think it's possible that he meant "reading data sequentially," not "writing data sequentially,", but that's a nit.

The whole essay is great. I probably give this advice about 3x a week at work, and never word it anywhere nearly so well, so it's nice to have a great reference to point to.

And I really love his two final points:

  1. If adding memory doesn't seem to help, it's possible that you just haven't added enough.
  2. Memory is different: because of the way operating system and PostgreSQL caching works, it's likely that substantially all of your memory will be in use all the time


No comments:

Post a Comment