Tuesday, December 22, 2009

Language subsetting

On a recent Stack Overflow podcast, Joel and Jeff discussed when and why programmers intentionally restrict themselves to a subset of the functionality available in their programming language.

At first it seems like an odd behavior: if you have features available in your programming language, why would you not use them?

I think there are (at least) 4 reasons: 3 valid reasons and 1 bad reason. Here's my list:

  • Complexity. C++ is a perfect example of this. C++ is such an enormous language, with so many features and possibilities and variations on ways of getting things done, that if you use all of them you'll end up creating incomprehensible, illegible, unmaintainable programs. So pretty much every C++ organization I've ever worked with has decided to restrict itself to some simpler subset of the language's features.

  • Vendor portability. SQL is a perfect example of this. There are many different implementations of SQL: Oracle, DB2, SQL Server, Sybase, MySQL, Postgres, Ingres, Derby, etc., and each of them has implemented a different subset of SQL's features, often with slightly different semantics. If you are writing a database-related application, you often find yourself wanting to be careful about particular database behaviors that you are using, so that you can "port" your application from one database to another without too many problems.

  • Version compatibility. Java is a perfect example of this. Over the years, there have been multiple versions of Java, and later releases have introduced many new features. But if you write an application against a new release of Java, using the new release's features, your application probably won't run on an older release. So if you are hoping for your application to be widely used, you are reluctant to use the latest features until they have found widespread deployment in the Java community. My current sense is that JDK 1.4 is still widely used, although most Java environments are now moving to JDK 1.5. JDK 1.6 is commonly used, but it's still somewhat surprising to encounter a major Java deployment environment (application server, etc.) that has already moved to the JDK 1.6 level of support. So most large Java applications are only now moving from JDK 1.4 to JDK 1.5 as their base level of support. The current version of Ant, for example, still supports JDK 1.2!

  • Unfamiliarity. This is the bad reason for restricting yourself to a subset of your programming language's capabilities. Modern programming languages have an astounding number of features, and it can take a while to learn all these different features, and how to use them effectively. So many programmers, perhaps unconsciously, find themselves reluctant to use certain features: "yeah, I saw that in the manual, but I didn't understand what it was or how to use it, so I'm not using that feature". This is a shame: you owe it to yourself, each time you encounter such a situation, to seize the opportunity to learn something new, and stop and take the time to figure out what this feature is and how it works.
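
To make the version-compatibility case a bit more concrete, here is a small sketch of what "restricting yourself to a subset" can look like in Java. Both methods compute the same thing; the first sticks to constructs that a JDK 1.4 compiler accepts (a raw List, an explicit Iterator, a cast), while the second uses generics and the enhanced for loop, both of which arrived in JDK 1.5. The class and method names (SubsetDemo, totalLengthOldStyle, totalLengthNewStyle) are made up for illustration, not taken from any real project.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class: the same computation written against two
// different "subsets" of the Java language.
class SubsetDemo {

    // Restricted to the JDK 1.4 subset: raw List, explicit Iterator, cast.
    // (A modern compiler accepts this but warns about the raw types.)
    public static int totalLengthOldStyle(List words) {
        int total = 0;
        for (Iterator it = words.iterator(); it.hasNext();) {
            String word = (String) it.next();
            total += word.length();
        }
        return total;
    }

    // Uses JDK 1.5 features: generics and the enhanced for loop.
    // A JDK 1.4 compiler would reject this method.
    public static int totalLengthNewStyle(List<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<String>();
        words.add("ant");
        words.add("junit");
        System.out.println(totalLengthOldStyle(words));
        System.out.println(totalLengthNewStyle(words));
    }
}
```

The newer version is shorter and type-safe, which is exactly why it is tempting; the older version is what you end up writing if your users are still deployed on JDK 1.4.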

So, anyway, there you go, Jeff and Joel: ask a question (on your podcast) and people will give you their answers!

1 comment:

  1. Ant's trailing edge-ness holds back a lot of the codebase, but it saves having to maintain a backport, and the goal is "the build tool should not force you to upgrade". Hadoop is 1.6-only, as it's the only version with adequate performance, and because it isn't trying to be just a tool. So maybe the app's role in the coding ecosystem helps dictate its Java version.

    The real trouble spot here, then, is JUnit 4. Java 1.5+ only...