Saturday, November 28, 2009

Coders at Work: Guy Steele

Chapter 9 of Coders at Work contains Peter Seibel's interview of Guy Steele. Steele is another in the progression of language designers that Seibel chose to include in the book: Crockford, Eich, Bloch, Armstrong, Peyton Jones, and now Steele. Although programming language design is clearly one of the most important fields within Computer Science, I do wish Seibel had balanced his choices somewhat, to include more coders from other areas: operating systems, networking, databases, graphics, etc. Still, Steele is an interesting engineer and I quite enjoyed this interview.

Having heard from Zawinski about some of the later developments in the work on Emacs and Lisp, it is interesting to hear Steele talk about some of the very early work in this area:

One of the wonderful things about MIT was that there was a lot of code sitting around that was not kept under lock and key, written by pretty smart hackers. So I read the ITS operating system. I read the implementations of TECO and of Lisp. And the first pretty printer for Lisp, written by Bill Gosper. In fact I read them as a high-school student and then proceeded to replicate some of that in my 1130 implementation.

This way of learning to program, by reading the programs of others, was widespread. It is certainly how I learned to program. Although I think that computer science education has come a long way in 30 years, I think that reading code is still a wonderful way to learn how to program. If you don't like reading code, and don't develop a great deal of comfort with reading code, then you're not going to enjoy programming.

Steele talks about the need to have a wide variety of high quality code to read:

I would not have been able to implement Lisp for an 1130 without having had access to existing implementations of Lisp on another computer. I wouldn't have known what to do. That was an important part of my education. Part of the problem we face nowadays, now that software has become valuable and most software of any size is commercial, is that we don't have a lot of examples of good code to read. The open source movement has helped to rectify that to some extent. You can go in and read the source to Linux, if you want to.

I think that the open source movement is an excellent source of code to read. Beyond the code itself, many open source projects have communities of programmers who love to talk about the code in great detail, so if you have questions about why the code was written the way it was, those communities are usually very willing to discuss the reasoning behind it.

In addition to early work on Lisp, Steele was also present for the invention of the Emacs editor, one of the most famous and longest-lived programs in existence:

Then came the breakthrough. The suggestion was, we have this idea of taking a character and looking it up in a table and executing TECO commands. Why don't we apply that to real-time edit mode? So that every character you can type is used as a lookup character in this table. And the default table says, printing characters are self-inserting and control characters do these things. But let's just make it programmable and see what happens. And what immediately happened was four or five different bright people around MIT had their own ideas about what to do with that.

In retrospect, a WYSIWYG text-editing program seems so obvious, but somebody had to think of it for the first time, and hearing about it firsthand from somebody who was actually part of that process is great!
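The table-driven idea Steele describes is easy to sketch. What follows is only an illustration in Python, not a reconstruction of the TECO-based original: the Buffer class, the command functions, and the particular control-key bindings are all assumptions made for the example. Every key is looked up in a table, printing characters default to inserting themselves, and a few control characters are bound to other commands.

```python
import string
from typing import Callable, Dict, List


class Buffer:
    """Hypothetical text buffer: a list of characters plus a cursor position."""

    def __init__(self) -> None:
        self.chars: List[str] = []
        self.point = 0

    def text(self) -> str:
        return "".join(self.chars)


Command = Callable[[Buffer, str], None]


def self_insert(buf: Buffer, key: str) -> None:
    # Default behaviour for printing characters: insert the key at the cursor.
    buf.chars.insert(buf.point, key)
    buf.point += 1


def backward_char(buf: Buffer, key: str) -> None:
    buf.point = max(0, buf.point - 1)


def forward_char(buf: Buffer, key: str) -> None:
    buf.point = min(len(buf.chars), buf.point + 1)


def delete_char(buf: Buffer, key: str) -> None:
    if buf.point < len(buf.chars):
        del buf.chars[buf.point]


def default_keymap() -> Dict[str, Command]:
    # Printing characters are self-inserting...
    keymap: Dict[str, Command] = {
        ch: self_insert for ch in string.printable if ch.isprintable()
    }
    # ...and control characters do other things (bindings chosen for illustration).
    keymap["\x02"] = backward_char  # Ctrl-B
    keymap["\x06"] = forward_char   # Ctrl-F
    keymap["\x04"] = delete_char    # Ctrl-D
    return keymap


def type_keys(buf: Buffer, keymap: Dict[str, Command], keys: str) -> None:
    # The whole editor loop: look each key up in the table and run what it finds.
    for key in keys:
        command = keymap.get(key)
        if command is not None:
            command(buf, key)


def shout(buf: Buffer, key: str) -> None:
    # A user-defined command, to show that the table is programmable.
    for ch in "!!!":
        self_insert(buf, ch)


buf = Buffer()
keymap = default_keymap()
type_keys(buf, keymap, "helo\x02l")  # type "helo", press Ctrl-B, type "l"
keymap["!"] = shout                  # rebind a printing character
type_keys(buf, keymap, "\x06!")      # press Ctrl-F, then "!"
```

Because the table is ordinary data, rebinding a key is a single assignment, which is presumably why several people could each take the editor in their own direction so quickly.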

My favorite part of the Steele interview, however, was this description of programming language design, which, again, sounds simple in retrospect, but really cuts directly to the core of what programming language design is trying to achieve:

I think it's important that a language be able to capture what the programmer wants to tell the computer, to be recorded and taken into account. Now different programmers have different styles and different ideas about what they want recorded. As I've progressed through my understanding of what ought to be recorded I think that we want to say a lot more about data structures, we want to say a lot more about their invariants. The kinds of things we capture in Javadoc are the kinds of things that ought to be told to a compiler. If it's worth telling another programmer, it's worth telling the compiler, I think.

Exactly, and double-exactly! Firstly, Steele is absolutely right that most programming languages concentrate far too much on helping programmers describe control flow and not enough on helping programmers describe data structures. Most, perhaps nearly all, of the bugs and mistakes that I work on have to do with confusion about data structures, not with confusion about control flow. When reading most programs, the control flow is simple and evident, but teasing out the behavior of the data structures is often horrendous.

And, secondly, I just love the way Steele distills it:

If it's worth telling another programmer, it's worth telling the compiler, I think.
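As a rough illustration of that idea (not something from the interview), consider a search routine that only works on sorted input. The usual approach is a comment saying "the list must be sorted". The sketch below instead puts the invariant into a type, so the requirement appears in the function signature. The names SortedInts and contains are hypothetical, and since Python has no compiler in Steele's sense, the invariant is checked at run time when the value is built; a static checker such as mypy would additionally flag a call that passes a plain tuple.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SortedInts:
    """A tuple of ints that is guaranteed to be in ascending order."""

    items: Tuple[int, ...]

    def __post_init__(self) -> None:
        # The invariant is checked once, at the point where the value is created.
        if any(a > b for a, b in zip(self.items, self.items[1:])):
            raise ValueError("items must be in ascending order")


def contains(haystack: SortedInts, needle: int) -> bool:
    # The parameter type states the "must be sorted" requirement, so the
    # binary search below can rely on it instead of on a comment.
    i = bisect_left(haystack.items, needle)
    return i < len(haystack.items) and haystack.items[i] == needle


primes = SortedInts((2, 3, 5, 7, 11))
found = contains(primes, 7)
missing = contains(primes, 8)
```

Trying to build SortedInts((3, 1, 2)) raises ValueError, so a value of this type that exists at all can be trusted to be sorted.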

Steele is apparently working on a new programming language, Fortress. It will be interesting to see how it turns out.
