Saturday, September 18, 2010

Odersky's paper on compilation techniques for Scala

Scala is the programming language developed by Martin Odersky, If you're not familiar with Scala, here's a good place to start.

I recently made my way through Dubochet and Odersky's Compiling Structural Types on the JVM, which is a fascinating and quite approachable paper for those interested in compilers generally, and more particularly in compilation techniques for object-oriented languages.

The general topic of the paper involves the problem of dealing with structural types:

A type is a nominal subtype of another one if there exists, somewhere in the program, an explicit mention of this fact. In Java, such an explicit mention takes the form of an extends clause. Structural subtyping, also known as duck typing, declares a type to be [a] subtype of another if their structures -- their members -- allow it. At the simplest, a type is allowed to be a subtype of another if it exposes at least the same members.

If you aren't familiar with the phrase 'duck typing', it comes from the old folk saying: "if it walks like a duck, and quacks like a duck, it's a duck".

The core program addressed by this work involves the choice between generative compilation and reflective compilation:

Generative techniques create Java interfaces to stand in for structural types on the JVM. The complexity of such techniques lies in that all classes that are to be used as structural types anywhere in the program must implement the right interfaces.


Reflective techniques replace JVM method call instructions with Java reflective calls. Reflective calls do not require a priori knowledge of the type of the receiver: they can be used to bypass the restrictions of JVM method calls. The complexity of such techniques lies in that reflective calls are much slower than regular interface calls.

The point of the paper is to describe the work done by the Scala team which has succeeded in making reflexive implementation of structural subtyping successful.

The core of the technique is to carefully and precisely use caching techniques to ensure that the reflection overhead occurs as few times as possible.

There are other issues, including: boxing and type-casting, exception handling, and handling parameterized types. But the core of the work involves the caching implementation, and the Scala team have provided lots of clear and useful descriptions of how their caching algorithms work, together with a number of benchmarks to analyze the effectiveness of the compilation techniques.

The caching techniques that the Scala team have settled on are derived from work that Urs Hozle did on the Self language while in graduate school at Stanford; in particular, Hozle, Chambers and Ungar's paper on Optimizing Dynamic Object-Oriented Langauges with Polymorphic Inline Caches, which is nearly 20 years old now, but still well worth reading.

Oh, well, enough about all that programming language stuff. The rain has stopped (briefly); it's time to take the dog down to the grass for some exercise.

No comments:

Post a Comment