Well, here's a nice body of work from a team at Stanford, showing how the basic concepts of MapReduce (functional programming, problem decomposition, parallelism, resource management) can be applied to parallel programming on single-system configurations.
Google's MapReduce implementation facilitates processing terabytes of data on clusters with thousands of nodes. The Phoenix implementation is based on the same principles but targets shared-memory systems such as multi-core chips and symmetric multiprocessors.
Most parallel and concurrent programming APIs are too hard: too complex to understand, and too easy to use incorrectly. MapReduce has been successful over the last 15 years because it nicely balances the power of parallelism with a clear and simple programming abstraction.
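To make that abstraction concrete, here's a minimal word-count sketch in the spirit of Phoenix. To be clear, this is a toy stand-in, not the actual Phoenix API: `map_fn`, `reduce_fn`, and the hand-rolled thread-per-split runtime are names and simplifications I've chosen for illustration. The point it demonstrates is the division of labor: the programmer writes only the map and reduce functions, and the runtime handles splitting, parallel execution, and merging.

```cpp
// A sketch of the MapReduce programming model on a shared-memory machine.
// The user supplies map_fn and reduce_fn; the tiny "runtime" in main()
// splits the input, runs map tasks on worker threads, and merges results.
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

using Counts = std::map<std::string, long>;

// Map: emit (word, 1) for every word in one chunk of input text.
static Counts map_fn(const std::string& chunk) {
    Counts local;
    std::istringstream in(chunk);
    std::string word;
    while (in >> word) ++local[word];
    return local;
}

// Reduce: combine a partial result into the running total, key by key.
static void reduce_fn(Counts& into, const Counts& from) {
    for (const auto& [word, n] : from) into[word] += n;
}

int main() {
    // Pretend each string is one input split; a real runtime would chunk a file.
    std::vector<std::string> splits = {
        "the quick brown fox", "jumps over the lazy dog", "the dog barks"
    };

    // Map phase: one thread per split, each writing to its own partial
    // result, so no locking is needed until the reduce phase.
    std::vector<Counts> partials(splits.size());
    std::vector<std::thread> workers;
    for (size_t i = 0; i < splits.size(); ++i)
        workers.emplace_back([&, i] { partials[i] = map_fn(splits[i]); });
    for (auto& t : workers) t.join();

    // Reduce phase: sequentially merge the per-thread partial results.
    Counts total;
    for (const auto& p : partials) reduce_fn(total, p);

    for (const auto& [word, n] : total)
        std::cout << word << ": " << n << "\n";
}
```

Compile with `g++ -std=c++17 -pthread`. Even in this toy form, the key property of the model shows through: because each map task owns its output and the reduce step runs after a barrier (the joins), there are no data races for the application programmer to reason about.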