Wednesday, March 6, 2013

A few interesting papers

In between compiles, I've been spending some time with:

  • Nobody ever got fired for buying a cluster
    We claim that a single "scale-up" server can process each of these jobs and do as well or better than a cluster in terms of performance, cost, power, and server density. Is it time to consider the "common case" for "big data" analytics to be the single-server rather than the cluster case? If so, this has implications for data center hardware as well as software architectures.
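    (A back-of-envelope sketch of this claim follows the list.)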
  • Optimizing Google’s Warehouse Scale Computers: The NUMA Experience
    It is overwhelmingly challenging to diagnose and attribute this performance swing to individual microarchitectural factors. Effects such as the contention for cache/bandwidth with various corunning applications on a server, non-uniform memory accesses (NUMA), and I/O interference among other factors all carry implications on the effectiveness of the policies used for cluster-level scheduling, machine-level resource management, and the execution configurations of the Gmail servers.
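    (A small timing sketch of the NUMA effect follows the list.)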
  • Themis: An I/O-Efficient MapReduce
    Given that many MapReduce jobs are I/O-bound, an efficient MapReduce system must aim to minimize the number of I/O operations it performs. Fundamentally, every MapReduce system must perform at least two I/O operations per record when the amount of data exceeds the amount of memory in the cluster. We refer to a system that meets this lower-bound as having the “2-IO” property. Any data processing system that does not have this property is doing more I/O than it needs to. Existing MapReduce systems incur additional I/O operations in exchange for simpler and more fine-grained fault tolerance.

    In this paper, we present Themis, an implementation of MapReduce designed to have the 2-IO property.
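    (A toy I/O-counting sketch of the 2-IO bound follows the list.)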

  • MinuteSort with Flat Datacenter Storage
    The sorts were accomplished using a heterogeneous cluster consisting of 256 computers and 1,033 disks, divided broadly into two classes: storage nodes and compute nodes. Notably, no compute node in our system uses local storage for data; we believe FDS is the first system with competitive sort performance that uses remote storage. Because files are all remote, our 1,470 GB runs actually transmitted 4.4 TB over the network in under a minute.
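
Some quick sketches to make these concrete. First, on the scale-up claim: a hedged back-of-envelope (the job size and RAM figures below are my illustrative assumptions, not numbers from the paper) of why a job whose input fits in one server's memory sidesteps the cluster's network shuffle entirely.

    # Back-of-envelope sketch (illustrative assumptions, not figures from
    # the paper): if the common analytics job's input fits in one server's
    # RAM, the whole job runs in memory and no shuffle crosses a network.

    job_input_gb = 50       # assumed "common case" job input size
    scaleup_ram_gb = 512    # one big scale-up server (assumed)
    cluster_nodes = 16      # assumed cluster for comparison

    fits_in_one_server = job_input_gb <= scaleup_ram_gb
    # A cluster shuffle moves roughly the input size over the network;
    # on one server the same exchange happens through shared memory.
    network_shuffle_gb = 0 if fits_in_one_server else job_input_gb

    print(f"fits in one server: {fits_in_one_server}")
    print(f"network shuffle:    {network_shuffle_gb} GB "
          f"(vs ~{job_input_gb} GB on the {cluster_nodes}-node cluster)")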
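
On the NUMA paper: a minimal sketch of the underlying effect, timing a 1 GiB fill of memory bound to the local node versus a remote node. The numa_* calls are the real libnuma API; the timing harness and the two-node assumption are mine. Requires Linux with libnuma installed and at least two NUMA nodes.

    # Minimal NUMA timing sketch (mine, not from the paper): fill 1 GiB of
    # memory placed on the local node, then on a remote node, and compare.
    import ctypes
    import time

    numa = ctypes.CDLL("libnuma.so.2")
    numa.numa_alloc_onnode.restype = ctypes.c_void_p
    numa.numa_alloc_onnode.argtypes = [ctypes.c_size_t, ctypes.c_int]
    numa.numa_free.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

    assert numa.numa_available() >= 0, "no NUMA support on this machine"
    assert numa.numa_max_node() >= 1, "need at least two NUMA nodes"

    SIZE = 1 << 30  # 1 GiB

    def fill_on_node(node):
        """Allocate SIZE bytes on `node`, write them all, return seconds."""
        buf = numa.numa_alloc_onnode(SIZE, node)
        assert buf, "allocation failed"
        t0 = time.perf_counter()
        ctypes.memset(buf, 0, SIZE)  # pages are placed and written on `node`
        dt = time.perf_counter() - t0
        numa.numa_free(buf, SIZE)
        return dt

    numa.numa_run_on_node(0)  # run on node 0: node 0 is local, node 1 remote
    for node in (0, 1):
        print(f"filling 1 GiB on node {node}: {fill_on_node(node):.2f}s")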
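
On Themis's 2-IO property: the bound falls out of simple counting. Once the data exceeds memory, each record must be read from disk at least once and written at least once, while a conventional pipeline that materializes intermediate map output pays at least twice that. A toy accounting sketch (mine, schematic):

    # Toy I/O accounting (schematic): per-record disk I/Os for a pipeline
    # that materializes intermediate data vs. one with the 2-IO property,
    # assuming the data exceeds the cluster's memory.

    def materializing_pipeline(records):
        ios = records   # map: read each input record from disk
        ios += records  # map: write each intermediate record to disk
        ios += records  # reduce: re-read each intermediate record
        ios += records  # reduce: write each output record
        return ios      # >= 4 I/Os per record, before any sort spills

    def two_io_pipeline(records):
        # Read each record once, write it once: the lower bound when the
        # data cannot all be held in memory.
        return records + records

    n = 10**9
    print(materializing_pipeline(n) / n)  # -> 4.0
    print(two_io_pipeline(n) / n)         # -> 2.0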
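
And on the FDS figure: 4.4 TB of traffic for a 1,470 GB sort is what you'd expect once all storage is remote, because every byte crosses the network roughly three times. The three-crossing breakdown is my reading, not a quote from the paper:

    # Back-of-envelope check (my arithmetic): with no local storage, each
    # byte of a sort crosses the network about three times --
    #   1. storage node -> compute node (read the input),
    #   2. compute node -> compute node (partition/exchange),
    #   3. compute node -> storage node (write the sorted output).
    data_gb = 1470
    crossings = 3
    print(f"{data_gb * crossings / 1000:.1f} TB")  # -> 4.4 TB, as reported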
