Journal of a Programmer: December 2012

Sunday, December 30, 2012

A New Year's reading list

Nerd alert! It's reading list time!

How to Write your own Minesweeper AI. A nifty worked-out example of how to write a game-solving algorithm, with lots of illustrations.
How much of a speedup do we get? In this case, the green region has 10 tiles, the pink has 7. Taken together, we need to search through 2^17 combinations. With segregation, we only have 2^10 + 2^7: about a 100x speedup.
Practically, the optimization brought the algorithm from stopping for several seconds (sometimes minutes) to think, to giving the solution instantly.
Occupy ACM: We are the 99%. Some interesting notes about the current state of affairs w.r.t. the ACM and Open Access publishing of academic results. For the record, I gave up on the ACM some 20 years ago, so I'm just an observer in this debate. But I don't believe I've suffered from that decision, other than being unable to have discussions with certain researchers who insist on making their research available only to a select few. In that respect, I agree with the author when he says:
Since these days almost everyone puts their papers on electronic archives, perhaps these backward stances of ACM don’t have too much negative effect on actual access to papers. But they may have very negative effects on the ACM itself.
What does randomness look like?. A nicely-illustrated article about human psychology and why we're innately wired to have trouble with the concept of randomness.
Poisson’s idea was both ingenious and remarkably modern. In today’s language, he argued that Quitelet was missing a model of his data. He didn’t account for how jurors actually came to their decisions. According to Poisson, jurors were fallible. The data that we observe is the rate of convictions, but what we want to know is the probability that a defendant is guilty. These two quantities aren’t the same, but they can be related. The upshot is that when you take this process into account, there is a certain amount of variation inherent in conviction rates, and this is what one sees in the French crime data.
Nulls Make Things Easier? (Part 1/11). Bruce Momjian has embarked upon what looks like will become a nice series of articles about the complexities of NULL handling in relational databases.
even if you never type "null", you can get nulls into your database. The use of not null when creating columns is recommended, especially for numeric columns that should contain only non-null values. Ideally you could have not null be the default for all columns and you would specify null for columns that can contain nulls, but that is not supported.
This looks like a nice online book: FXT: a library of algorithms. Many thanks to the author for making it available. I probably won't spend much time working my way through it, since I already own (and periodically consult) the first edition of Henry Warren's astonishingly-great Hacker's Delight.
My former colleague (and wonderful writer) Robert Hodges takes a step back and considers the state of the MySQL community: The MySQL Community: Beleaguered or Better than Ever?
If there is a problem, it is how to keep a strong multi-polar community going for as long as possible. Competition creates uncertainty for users, because change is a given. Pointy-haired bosses have to make decisions with incomplete information or even reverse them later. Competition is hard for vendors, because it is more difficult to make money in efficient markets. Competition even strikes against the vanity of community contributors, who have to try harder to get recognition.
An epic year-end ramble from the well-known David Heinemeier Hansson: The Parley Letter. As I'm constantly fighting with those (well-intentioned, but wrong) folk who keep insisting that we have to start by first writing a framework, I found myself cheering as Heinemeier Hansson writes:
I think the responsible thing is to make the best damn piece of software facing CURRENT DAY constraints and let tomorrow worry about tomorrow. When the actual changed requirements or new people with "new ideas" roll around, they'll have less baggage to move around. The predictions for what the app is going to look like a decade from now are pretty likely to be wrong anyway.
Four interesting articles about network performance, page load speed, and web caching:
Finally, two very interesting GitHub-related articles:
- GitHub Says ‘No Thanks’ to Bots — Even if They’re Nice
  On GitHub, these offers — called pull requests — are supposed to come from people. That’s part of the power of GitHub — coders can see each other’s software and share fixes much in the same way that the rest of us swap photos on Facebook. It’s a very social, and very human kind of interaction. The whole world can debate the merits of a software change before it is accepted or rejected.
  But here was a pull request from a GitBot. Bots don’t debate.
- Downtime last Saturday
  When the agent on one of the switches is terminated, the peer has a 5 second timeout period where it waits to hear from it again. If it does not hear from the peer, but still sees active links between them, it assumes that the other switch is still running but in an inconsistent state. In this situation it is not able to safely takeover the shared resources so it defaults back to behaving as a standalone switch for purposes of link aggregation, spanning-tree, and other layer two protocols.
  By the way, regarding the DRBD issues discussed in the GitHub post-mortem, here are some interesting background resources you might want to study:
  - DRBD: What is DRBD?
  - Magical Block Store: When Abstractions Fail Us

Have a happy and safe New Year's holiday!

Friday, December 28, 2012

Old Man's War: a very short review

One of my holiday books was John Scalzi's Old Man's War.

I actually got the book as part of the Humble eBook Bundle, a charity event I participated in last October.

Old Man's War sat around on my Kindle, gathering bits, for a few months, until I found some reading time over the holiday break.

I think that the publisher's blurb does a great job of describing Old Man's War:

John Perry did two things on his 75th birthday. First he visited his wife's grave. Then he joined the army.
The good news is that humanity finally made it into interstellar space. The bad news is that planets fit to live on are scarce--and alien races willing to fight us for them are common. So: we fight. To defend Earth, and to stake our own claim to planetary real estate. Far from Earth, the war has been going on for decades: brutal, bloody, unyielding.
Earth itself is a backwater. The bulk of humanity's resources are in the hands of the Colonial Defense Force. Everybody knows that when you reach retirement age, you can join the CDF. They don't want young people; they want people who carry the knowledge and skills of decades of living. You'll be taken off Earth and never allowed to return. You'll serve two years at the front. And if you survive, you'll be given a generous homestead stake of your own, on one of our hard-won colony planets.
John Perry is taking that deal. He has only the vaguest idea what to expect. Because the actual fight, light-years from home, is far, far harder than he can imagine--and what he will become is far stranger.

Old Man's War was such a success that Scalzi wrote a number of sequels, and the book is still hugely popular nearly 10 years after its original publication.

I don't read a lot of science fiction, but if more of it were written like Scalzi's work, I probably would. Old Man's War is a great read, with lots of action and adventure.

As with most science fiction, Old Man's War starts with a premise, a "what if?" question, and explores the implications of a world in which that question is answered.

And, as with most (good) science fiction, it turns out that Old Man's War is mostly about other topics, anyway, and the entire story and its themes are metaphor for much more traditional events back here on Earth. In this case: war, age, love; the same things that fascinated Shakespeare and Tolstoy back in their day. What's old is new again; in Old Man's War, literally!

If you like science fiction, I'm sure you're already quite familiar with Scalzi's work. But if you're like me, and don't pay that much attention to science fiction, but enjoy a good book, give Old Man's War a try; you'll probably enjoy it.

Thursday, December 27, 2012

Scalable Component Abstractions

The German computer scientist Martin Odersky is world-famous for his work in programming language theory.

Recently, he has become particularly famous for the invention of the programming language Scala.

At this point, Scala has matured significantly; there is a lot of documentation, and plenty of books, examples, etc.

But for the underlying theory behind Scala, it is valuable to go back to some of Odersky's original papers, including this one by Odersky and Zenger in 2005: Scalable Component Abstractions.

We identify three programming language abstractions for the construction of re-usable components: abstract type members, explicit selftypes and symmetric mixin composition. Together, these abstractions enable us to transform an arbitrary assembly of static program parts with hard references between them into a system of re-usable components.

Designing a programming language that successfully provides for reusable components is one of the holy grails of computing; theorists have been working on this for decades, slowly making progress. Reusable components, if implemented well, enable large teams of people to work together on the construction of large systems.

Scalable Component Abstractions describes the theoretical underpinnings of Scala's approach to componentization. Papers in programming language theory can often be dense and abstract, but Scalable Component Abstractions is clearly written with well-chosen examples.

More importantly, after developing the theory of abstract type members, explicit selftypes and symmetric mixin composition, they demonstrate how these tools were successfully used in a real system; for example, they modify the existing Scala compiler to add trace-logging functionality:

The crucial point is that we want to extend an existing compiler with logging functionality. To do this, we do not want to modify the compiler’s source code. Neither do we want to require of the compiler writer to have pre-planned the logging extension by providing hooks. Such hooks tend to impair the clarity of the code since they mix separate concerns in one class. Instead, we use subclassing to add logging functionality to existing classes.

As they observe, it is interesting to compare this to one of the major alternate techniques for solving this problem, Aspect-Oriented Programming:

our architecture can handle all before, after, and around advice on method reception pointcut designators. These represent only one instance of the pointcut designators provided by languages such as AspectJ. Therefore,general AOP is clearly more powerful than our scheme. On the other hand, our scheme has the advantage that it is statically typed, and that scope and order of advice can be precisely controlled using the semantics of mixin composition.

I've tried AOP a few times, and while it works, I've found it awkward to deploy, and rather fragile. So I'd be willing to give up some of its power in return for a simpler, more transparent, and more reliable solution.

I haven't yet done any serious programming in Scala, but I've been following the language's development for several years now, and it's becoming increasingly common to see major systems implemented in Scala (for example, the Twitter team use Scala extensively, as I mentioned earlier this week), so I suspect it won't be too long before I find myself working on projects where Scala is a significant part of the solution.

If you haven't been paying attention to Scala, you might find it interesting to explore; I think it's powerful and well-designed, and has a bright future ahead.

Wednesday, December 26, 2012

Some nice networking papers

It's quiet in the house today; others have to work, but I get to lounge around reading computer science papers :)

If you, too, find yourself with a few quiet hours, you might find these items interesting:

Analyzing Big Data with Twitter. The UC Berkeley School of Information held a special class this fall with Twitter as the subject. The school kindly taped the lectures, and put them up on YouTube for your learning pleasure!
How to store, process, analyze and make sense of Big Data is of increasing interest and importance to technology companies, a wide range of industries, and academic institutions. In this course, UC Berkeley professors and Twitter engineers will lecture on the most cutting-edge algorithms and software tools for data analytics as applied to Twitter microblog data. Topics will include applied natural language processing algorithms such as sentiment analysis, large scale anomaly detection, real-time search, information diffusion and outbreak detection, trend detection in social streams, recommendation algorithms, and advanced frameworks for distributed computing. Social science perspectives on analyzing social media will also be covered.
A team of students at Stanford set out to reproduce various network research findings, by setting up and re-performing the experiments discussed in the research literature. They report on their results in the paper: Reproducible Network Experiments Using Container-Based Emulation
We report lessons learned from a graduate networking class at Stanford, where 37 students used our platform to replicate 16 published results of their own choosing
Reproducing the research results of others is a wonderful teaching tool; I'm pleased to see it being used successfully.
For their ﬁnal project, students were given a simple, openended request: choose a published networking research paper and try to replicate its primary result using Mininet-HiFi on an Amazon EC2 instance.
The students not only reproduced but improved upon the research results:
After three weeks, 16 of the 18 teams successfully reproduced at least one result from their chosen paper; only two teams could not reproduce the original result. Four teams added new results, such as understanding the sensitivity of the result to a parameter not in the original paper.
Lastly, a ten-year-old paper that is new to me: Practical Self-Stabilization for Tolerating Unanticipated Faults in Networked Systems
As the complexity of networked systems increases, their likelihood of experiencing unanticipated faults sooner or later also grows. By unanticipated network faults, we mean not only (a) occurrence of new types of faults that were not explicitly planned for in network design, but also (b) occurrence of well known types of faults but at frequencies that are abnormal.
...
Stabilization provides one alternative that avoids case-by-case handling of unanticipated faults of sort (b) and can deal with faults of sort (a). We illustrate this point with an abstract explanation of stabilization: Stabilization involves deﬁning “correct” behavior of the network and by continually checking whether the network is presently conforming to this behavior. In other words, instead of deﬁning speciﬁc patterns of incorrect behavior, stabilization checks for occurrence of any anomalous behavior. Should incorrect behavior be detected, stabilization promises to restore the network behavior so that eventually it is correct.
. It's not clear to me if the team is still working on these ideas, but you can find some of their older work from their research page.

Monday, December 24, 2012

Looking back at what was

It's that end-of-the-year time, when people look back on what was.

But sometimes it's fun to look back even farther.

So, without further ado, a handful of fun, unrelated stories, each looking further and further back in time:

History in the humblest of places. Dana MacKenzie takes a trip back in time and considers the first chess tournament of Ray Robson, who has gone on to become one of America's greatest chess players.
When I go to tournaments with lots of kids running around, I confess that they get on my nerves sometimes. They make so much noise, there are so many of them, and I know that 90 percent of them will quit in a few months or years.
But Robson’s history is a good reminder that every child matters and every game matters. See that player on board 146 in the beginner’s section, the one who is barely tall enough to see over his pieces? Take a good look, because one day you might see him on board 1 in the national championship.
Moving back in time 150 years or so, don't miss this extraordinary tale of adventure right here in the San Francisco Bay, from William Brewer's diary: December 22, 1862: San Francisco Bay
Now came the most intense excitement, the boatload of passengers struggled in the water, several others jumped in, excited men were rushing frantically about, and, to increase the confusion, the boat took fire in the hold, and to extinguish it the steam from the boiler was blown there, which filled the whole vessel. Most luckily the lifeboat was righted and all were rescued from the water, the boat was bailed out, the wet passengers were put in, and she pushed off.
Having trouble imagining the scene? Here's a wonderful visual to accompany Brewer's story: Birdseye Map: Vue de San Francisco (1860). Unfortunately the artist's perspective is east, taking in Yerba Buena Island with Mount Diablo in the background; Alcatraz Island would be just to the left of the field of view.
Of course, San Francisco Bay wasn't always a bay. The world has changed many times over. California State Archaeologist Breck Parkman (what better name for someone who works for the California State Parks?) takes a much longer step back in time, and considers what the Bay Area was like in the late Pleistocene era: The California Serengetti: Two Hypotheses Regarding the Pleistocene Paleoecology of the San Francisco Bay Area
Perhaps the greatest diversity and concentration of wildlife in existence today is found on the Serengetti Plains of East Africa. Paleontologists note that the Serengetti of the late Pleistocene was many times richer in terms of its wildlife. The California Serengetti is thought to have been even richer yet. Indeed, a magnificent array of wild animals characterized the San Francisco Bay area during the late Pleistocene. While some of the Rancholabrean species still exist in the area (e.g., deer and mountain lion), many others went extinct between 13,000-10,000 yrbp (e.g., mammoth and saber-tooth cat). The animals of the late Pleistocene can be categorized by whether they were predator or prey species and whether they were loners or moved in packs, prides, and herds.

One Day: a very short review

I'm about two years late (typical), but I finally got around to David Nicholls's One Day.

The book's concept is to tell a story that spans two decades, by relating only the events of July 15th of each year. One year at a time, we learn about what happens on that one day (hence the title).

One Day is an easy book to consume: you can grasp the story from the very first page, and Nicholls's light touch and wonderful gift for dialogue make the book race along. And it's obviously the sort of book that would certainly become a Hollywood movie.

Unlike many a book, where I find myself in love with the book during its middle, only to struggle through to the end, One Day is a book that starts rough but gains strength as it goes, and has an extremely strong and compelling finish.

A few chapters in, I absolutely hated the book: it was dark, depressing, miserable. But I was locked into the two characters and the once-a-year technique made it possible to really follow a life-long story in a book of manageable size.

Part of my struggle with the book involves its language: American readers like myself won't know words like "bin-liner", "lorry", "rucksack", or "wet room", and will just completely fall apart when presented with something like:

'Look at these legs.' She held a tiny twist of hair between her finger and thumb. 'I've got the legs of some fifty-eight-year-old fell-walker. I look like the President of the Ramblers Association.

I'm sure this is true of most British authors, but the book struck me as a modern version of some Emily Bronte or Jane Austen novel. The lead character is of course named Emma; it's a story about lovers in different economic situations trying to overcome the expectations of family and society, and it's replete with little nuggets like this:

'Like a mountain goat, me. I used to go hiking a lot at home, when I was in my Cathy phase. Out on the wild and windy moors. Dead soulful I was. "I cannot live without my life! I cannot live without my soul!"'

One Day is funny, bitter, hopeful, and tragic.

But I soldiered on, and perhaps you will, too. I wouldn't say you should necessarily rush out and move One Day to the top of your reading list, but if it finds its way onto that list, give it a go: I think you'll finish it, and I think you'll enjoy the time you spend in its crazy world.

Saturday, December 22, 2012

Database map from the 451 Group

While rummaging around on the internet the other day, I came across this interesting depiction of the current state of database software. It's an "infographic", inspired by subway maps like the famous London Underground map, attempting to relate a long list of database implementations of various types.

My great friend Walt used to say that "any field of study can always be infinitely sub-divided". There are many, many ways to slice the pie, and many, many different ways to build database software.

I've been working in the database field for 30 years now, and it never gets stale. The old ideas continue to bear up under study and refinement, and as new ideas come along, they are incorporated into the mix. We "high priests of a low cult" continue to beaver away at the details, and the world of database software incrementally becomes more sophisticated, more complex, and more powerful.

I'm familiar with about a third of the database systems described on the map, which means I've still got a lot to learn! Of course, to keep abreast of all of this activity, you'd really have to make it a full time job, like the 451 guys do, or like somebody like Curt Monash does, and I don't do that (I'm a systems builder, not an industry analyst, after all).

But I'm looking forward to following up on some of these references and learning more about parts of my field that I wasn't even aware of.

Friday, December 21, 2012

Exactly

Instagram, explained.

Thursday, December 20, 2012

Real-world security topics

Lots of security research is very theoretical and abstract, but there is plenty of concrete and immediate information to keep abreast of.

To wit

A nice post on the SpiderLabs blog reviews the recent Microsoft BlueHat conference and the Seattle B-Sides conference, as well as some other related work.
The talks at BlueHat started with Ellen Cram Kowalczyk’s talk on Fraud and Abuse in which she talked about how prevalent fraud is and how easy password recovery question answers can be recovered from online data. Examples of randomly generated passwords were presented and one attendee was able to recite it back to her. She ended her talk with a Internet scavenger hunt for answers to what could be password recovery questions for an online account. A simple Internet search can deliver information about your mother’s maiden name, your first car, last attended school, and even the food you hate most.
Piecing together the bits of your identity from information scattered around the net is one of the ways that computers can figure out who you are, even when you're trying not to let them know. The somewhat awkward term "deanonymization" describes this process, and Professor Arvind Narayanan of Princeton brings us up to date on the current state of the art in this area: New Developments in Deanonymization.
Let’s move on to the more traditional scenario of deanonymization of a dataset by combining it with an auxiliary, public dataset which has users’ identities. Srivatsa and Hicks have a new paper with demonstrations of deanonymization of mobility traces, i.e., logs of users’ locations over time. They use public social networks as auxiliary information, based on the insight that pairs of people who are friends are more likely to meet with each other physically.
The Srivatsa/Hicks paper is here: Deanonymizing Mobility Traces: Using Social Networks as a Side-Channel.
We examine a two step solution to match the contact graph against the social network. In the ﬁrst step we bootstrap the matching problem by exploiting inherent heterogeneity in the graphs to identify landmark nodes. In the second step we extend a mapping between landmark nodes to all the nodes in the graph by identifying discriminating features in the original graph.
As Professor Narayanan observes, this is not just academic research:
I have been approached multiple times by organizations who wanted me to deanonymize a database they’d acquired, and I’ve had friends in different industries mention casually that what they do on a daily basis to combine different databases together is essentially deanonymization.
Since a substantial part of the problem involves malware that, in one way or another, tricks you into revealing something that you shouldn't have revealed, an interesting approach is to make your computer more alert, and have it do a better job of trying to stop you from accidentally being fooled. Google's Chrome browser has always been an exemplar of this approach, but two other recent unrelated ideas caught my eye:
- A nice paper by Chuan Yue suggests various ways in which browsers can help users by alerting them to situations where an inappropriate password request may be being made: Preventing the Revealing of Online Passwords to Inappropriate Websites with LoginInspector.
  The key idea of LoginInspector is to continuously monitor a user’s login actions and securely store hashed domain-specific successful login information to an in-browser database. Later on, whenever the user attempts to log into a website that does not have the corresponding successful login record, LoginInspector will warn and enable the user to make an informed decision on whether to really send this login information to the website.
- A MIT team have built a nice GMail extension that looks at your inbox and helps you figure out when an email contains untrustworthy information: Lazy Truth
  The LazyTruth inbox extension surfaces pre-existing verified information to debunk viral rumors when the information is needed most: in our inboxes. The gadget is triggered by the unique phrases used in the most common viral emails tracked by factchecking and urban rumor websites.
  When you receive a viral email full of fallacies, LazyTruth retrieves and displays a verified rebuttal, and provides you with the original sources. It all happens right in your inbox, without requiring you to search anywhere.
As Ross Anderson points out so clearly, a major component of computer security involves how the economic incentives are arranged. Economic incentives are a social contract, not a technology, and so you have to think about them differently.
A particularly vivid illustration of these economic incentives occurs with vulnerability disclosure, as Princeton Professor Ed Felton discusses on his blog: You found a security hole. Now what?.
As a researcher I have always felt that when a company is willing to engage constructively, the ethical course is to cooperate with them for the benefit of the public.
That approach becomes harder to sustain when the perceived risk of legal action, whether due to an overzealous lawyer or a research error, gets larger.
At the same time, an alternative outlet for vulnerability information is emerging–selling the information. In principle it could be sold to the maker of the flawed product, but they probably won’t be the high bidder. More likely, the information will be sold to a government, to a company, or to a broker who will presumably re-sell it.

Personally, I've never discovered an important security vulnerability in someone else's software on my own. However, I have been involved with the opposite side of the coin: I've had multiple situations in which security researchers have contacted me (more precisely: my organization) with notice of a newly-discovered vulnerability.

Interestingly, I've had about the same number of such incidents in my open source work as in my industry work.

Happily, in all the situations I was involved with, the researchers notified us first (as far as we know), and were willing to give us a reasonable amount of time to repair the problem and distribute a fix prior to announcing the vulnerability to the world.

I believe that this record of success is at least partly due to the fact that the organizations I've been involved with have also responded appropriately: gratitude to the external party who notified us of the problem, and immediate response to address the issue.

It doesn't seem like it ought to be impossible for the world to work this way, but as the iPad identity leak incident demonstrates, it's easy for this fragile approach to spin wildly out of control.

Wednesday, December 19, 2012

Stuff I'm reading on my "vacation"

My wife asks me: what are you going to do on your vacation? Well, there's a computer here, and it's quiet, and (for a week at least) I don't have any projects underway at work, so I guess maybe I'll just read...

At the Annual Computer Security Applications Conference (ACSAC) 2012, Ross Anderson gives a talk on the (approximate) 10 year anniversary of the birth of "Security Economics".
After sabbatical visits to Berkeley in 2001–2 to work with Hal Varian, we organised the ﬁrst Workshop on the Economics of Information Security in June 2002. Since then the ﬁeld has grown to encompass arguments over open versus proprietary systems, the econometrics of online crime, the behavioural economics of security and much else. It has started to have a signiﬁcant impact on policy, with security-economics studies of cybercrime and infrastructure vulnerability being adopted as policy in the EU, while security economics PhDs have got inﬂuential jobs in the White House and elsewhere.
Jean-Louis Gassee (remember him?) completely eviscerates Meg Whitman in this point-by-point deconstruction: Whitman: One Write-Off Too Far
Walk in with a frown; blame your predecessor; slash projects, budgets, people; lower expectations, loudly; and write off assets.
Julian Sanchez and Ed Felten have been having a fascinating debate about both the technical and social considerations of GMail privacy. Sanchez kicked off the discussion with this Ars Technica article: Op-ed—A plea to Google: Protect our e-mail privacy. Felten observed that there were significant technical hurdles: End-to-End Encrypted GMail? Not So Easy. And Sanchez continued the discussion with some additional thoughts: Encrypting Google: A Quick Reply to Ed Felten.
As both Felten and Sanchez agree, the key issue here is probably not technical, but economic, as the reason that Google provides free email is that they deliver ads based on the content of your email. Sanchez:
As for content ads, well, that’s the million dollar question—and as Vint Cerf has candidly acknowledged, a primary reason Google hasn’t already done this. [...] It’s up to Google’s accountants to figure out how that all nets out, but these considerations seem like a good prima facie reason to at least run the numbers if they haven’t done it recently.
The always interesting Moxie Marlinspike takes a detour from his typical ruminations on security to offer an essay about value: The Worst.
Partisans of the best will probably never end up accidentally riding a freight train 1000 miles in the wrong direction, or making a new life-long friend while panhandling after losing everything in Transnistria, or surreptitiously living under a desk in an office long after their internship has run out — simply because optimizing for the best probably does not leave enough room for those mistakes.
A brilliant article in the December issue of Vanity Fair about the re-modeling of the New York Public Library: Firestorm on Fifth Avenue
The idea that the New York Public Library should not welcome everyone, scholars and casual readers alike, enrages Tony Marx, given how much he has focused his career on making established institutions more open to minorities. It hardly pleases the trustees, either, who have believed consistently in a vision of the library as a progressive institution. In fact, it is something of a paradox that so far as the Central Library Plan is concerned, the blue-blooded trustees represent what might be considered a more progressive view than do the writers and scholars.
Going back to the grand old days of the Amiga 1000, Daniel Cook remembers the work on a game that never was: Game Post Mortem: Hard Vacuum --Mining a 12-year old game design for innovative game mechanics, and talks about the distinctions that can be drawn between the essence of a game, and the execution of a game, and how that should inform the work of game designers:
Pick up and play an ancient copy of a game that fathered a genre. The further back you can go, the better. Clear your mind of all expectations and knowledge of what the genre evolved into. That cool thing that Half Life did with conversations. Forget it.
Now reinvent the genre. What are the core primitive concepts and where can you take them that would result in addictive player experiences? You have the opportunity to reinvent an entire decade of evolutionary game design in your head. Chances are you will spawn a few original ideas.
Lawyers around the world, rejoice: Intellectual Ventures is about to send a whole lot more work your way.
Under the terms of the deal, Kodak will be paid $525 million by a consortium led by Intellectual Ventures and the RPX Corporation. The two companies plan to pay for part of the sale by licensing the intellectual property to 12 other companies.
Wendy Nathers offers some thoughts on the diversity issues that resulted in the cancellation of the British Ruby Conference in a nice essay: Sure, I'll be your unicorn
if we want to hack our brains, we have to be conscious about it, and put some effort into changing our subconscious expectations of the way things ought to be.
Andy Baio reflects on the recent XOXO conference: The Unified Theory of XOXO. Baio's article is a mother lode of innovative and controversial ways to rejuvenate technical conferences, full of great bits like:
Panels are the best way to make four interesting people boring. They're hard to prepare for, too much pressure to be interesting on the spot, and usually end up a meandering and unfocused conversation touching on a handful of topics. I hate them. Even with a strong moderator, I'd universally rather hear four short solo talks from each person than a four-person conversation. Why? Because preparing a solo talk forces the speaker to think carefully about what they want to say, conveying their message and meaning in a concise way.
Wired Magazine agrees, noting that:
XOXO also felt, to me at least, like a defining moment for people who express themselves creatively and independently online, as well as for those who aspire to help them, a moment when that community became aware of itself as a growing, sustainable cultural force, and a moment when it embraced the fact that, unlike in the early days of the web and of the internet, it is now pointedly distinct from the boomtown mentality that seems to characterize so many on the global computer network.
And, lest you think that there isn't enough to learn, Werner Vogels does us all a favor and collects up this year's Back-to-Basics reading list on his blog. Can't wait for next year, Werner!

Tuesday, December 18, 2012

Greedy algorithms and dynamic programming

As I said a few weeks ago, I'm really enjoying Professor Tim Roughgarden's Algorithms: Design and Analysis class.

Much of the material in the first part of the class was review for me, but in this second part it has been mostly new material, things that I should have studied over the years but just hadn't found the time to learn.

Part two of the Algorithms class has been covering Greedy Algorithms and Dynamic Programming. These are both basic, widely-used techniques and have many applications.

One of the things that I'm finding very pleasing about Professor Roughgarden's style is that he mixes theory and practice very well. He starts with very well-chosen, clear examples, works through the examples in detail, then helps you see how to generalize from these now-well-understood examples to the broader, more theoretic underpinnings.

Finally, in each area, he concludes the discussion with a few pointers to more advanced material to continue learning from.

It's a great teaching technique (at least, it makes the material very clear to me), and I'm feeling much more comfortable with the terminology and techniques of these algorithmic approaches.

So little time, so much to learn...

Monday, December 17, 2012

This just in...

It's one of those weeks where I'm in one of those moods, and certain things seem to jump to the foreground.

You know how that is.

Without further ado...

You may have been following the story from the University of Chicago (my alma mater), about the mysterious package received by the Admissions Office, addressed to Henry Walton Jones, Jr., who you of course know better as Indiana Jones.
Little did we know what we were looking at. When our student mail worker snapped out of his finals-tired haze and realized who Dr. Jones was, we were sort of in luck: this package wasn’t meant for a random professor in the Stat department.
Well, it turns out that it's a combination of art, and the above-and-beyond services of the United States Postal Service.
Which doesn't make the story any less wonderful.
Back in the classroom, I think there can be no dispute that one of the best ways to increase enrollment in Computer Science would be if there were more classes that led to launching rockets, or to scouting talent for the Forty Niners.
The specification is for a (fake) launch interceptor system that takes a collection of parameters and a collection of points from a radar, applies a bunch of rules, and outputs a launch / no-launch decision for a missile interceptor.
...
Since the students did an awesome job getting all of this working, we spent the last day of classes launching rockets in a field at the edge of campus; pictures follow.
And lastly, lest you think that the Internet is worthless, and you can never find anything of value on it, may I present Tuning '77
"Tuning '77" - a seamless audio supercut of an entire year of the Grateful Dead tuning their instruments, live on stage. Chronologically sequenced, this remix incorporates every publicly available recording from 1977, examining the divide between audience expectation and performance anxiety.

You're only on this earth for a short while, so don't forget to put on some music, read a good book or watch a good movie, play some games, and tell those you love that you love them.

The Physician: a very short review

One of this fall's read-for-pleasure books for me was Noah Gordon's The Physician. The book was heavily promoted through Amazon's Kindle site; I read it on the Kindle.

The Physician is set 1,000 years ago in Europe, and is the story of Robert Cole, an orphan who is taken up as an apprentice by a "barber-surgeon", who is what passes for medicine in the English countryside.

After a chance encounter Cole discovers that there is much more to be learned, and discovers in himself a desire to learn whatever he can about the real practice of medicine, so he adopts a disguise and travels to Persia to study at the Islamist hospital in Ispahan.

There, Cole finds a world full of knowledge: students and classrooms and books and teachers and lectures and examinations.

Cole is an appealing character, and Gordon is a skillful writer and the book, although 800 pages long, moves right along, and I never was bored or wanted to put the book down.

This book is the first in a trilogy, continuing with Shaman and The Last Jew. I haven't decided whether I'll continue with the others, as I have a long stack of books on my nightstand right now.

But I certainly didn't regret the time I spent with The Physician, so if you're looking for a book to occupy those long winter nights by the fire, consider this one.

USS Alcatraz: A very very short review

Philip Robinson's USS Alcatraz was offered free via Amazon Prime's Kindle Owners Lending Library.

It's sort of a Harlequin Romance for guys, set in Afghanistan, with lots of gruff soldier-talk.

It wasn't the book for me.

(I did say this was going to be a very very short review.)

Saturday, December 15, 2012

Enterprise admins: here's how to help your software be better

Here's a great short list of smart ideas from Bruce Momjian of the Postgres project:

Users must also provide a reliable platform to run database software. Postgres can't maintain high reliability if it is dependent on an unreliable platform. Here are some of the things users can do to provide a reliable platform for Postgres:

Choose an operating system whose focus is on reliability, rather than desktop user experience or supporting the latest hardware or software

Choose hardware designed for 24-hour operation in a demanding environment, not desktop hardware

Use hardware that reports errors, like ecc memory and smart storage

Make sure your storage system performs reliable writes

Have on-site and off-site backups for cases when disasters happen

Educate administrative staff so mistakes don't cause downtime

It's just amazing to me how many times I hear of sites that tell me about their great new server, but have no idea:

What sort of memory they are installing into their machine
How the machine's power consumption and heat output compares with the provisions they've made for power and air conditioning in their machine room
Whether they have a UPS system, and how it's configured, and whether they've tested how their operating system responds to a UPS-initiated power emergency
Whether they know how to read their RAID system's reports, and whether they know what a "disk failed" alarm would look like, and what they'd need to do when one occurred.
what their backup policy is, and when they last ran a full test of restoring from and failing over to their backup.

As Bruce said, don't depend on your server software to solve all these problems for you.

As is the case with systems software, systems administration has evolved considerably over the years. The Usenix organization runs a great annual conference called LISA; the 2012 meeting just wound up, and there's a great set of materials on the LISA web site. If you don't ordinarily pay much attention to system administration, it's worth checking out the LISA conference site periodically, to keep abreast of what the world's sysadmins are spending their time worrying about.

Friday, December 14, 2012

Stuff to read

Another random collection of links; maybe you'll find something interesting in them...

I like when somebody takes a stand on a topic, especially when it's a controversial stand. So even though I'm not 100% in agreement, I enjoyed Richard Rodger's Why I Have Given Up on Coding Standards
Other people are smarter than you. Not some of them. All of them. The coder writing the user interface? They are smarter than you … about the user interface. You’re not writing the code. Why don’t you trust them? No, that’s not the right question. They will still mess up. Why are you making a bigger mess by telling them what to do?
A deep dive in to recent work in the Xen memory manager implementation: Improving block protocol scalability with persistent grants
To avoid having to allocate a very large amounts of shared memory at start, Xen shared ring allocates at start only the minimum amount of memory to be able to queue the requests and responses (that is a single memory page), but not the data itself. The data is allocated using grants, which are memory areas shared on request, and references to those grants are passed along with the request, so the driver domain can map those areas of memory and perform the requested operations.
Nice essay on The Networking Nerd about IPv6 and NAT: The Five Stages of IPv6 and NAT
I’ve been accused of hating NAT. Yes, it’s true. I dislike any protocol that breaks basic connectivity and causes headaches for troubleshooting and end-to-end communications.
I love this blog full of detailed, clearly-explained examples of how to do graphics and animation for games: 2D Game Art for Programmers using inkscape, gimp, & co.
I fell in love with helicopters. They just make great game assets.
Back then it was all pixeled and took what seemed like forever to create. In the game there were a few helicopters with limited variations.
Creating a similar object in vectors allows for easier manipulation and variations. It also makes animations a lot easier and more flexible.
Andrew Hays has a nice photo-essay illustrating the benefits of customizing and tweaking the user interface of your programming tools: Love Your Terminal
You have to stare at your command prompt all the time. You’re constantly typing in commands, and the only useful information that your prompt is giving you is the directory that you’re in. Make your prompt work for you. And make it pretty.
An entire wiki full of contributed articles about how to write an operating system: OSDev.org
This website provides information about the creation of operating systems and serves as a community for those people interested in OS creation
A deep deep deep dive about the details of trying to write a program that tests memory, as part of an overall approach to try to figure out which customer-reported crashes are actually due to hardware problems on the user's computer: Redis Crashes
So what about testing memory just when a crash happens? This would allow Redis to test the memory of every single computer where a crash has happened. The bug report will be annotated with the result of the test, providing an interesting hint about the state of the memory without further help from the user.
And even better, the test will test exactly the memory as allocated at the moment of the crash! It will test exactly the physical memory pages that Redis is using, that is just perfect for environments like EC2.
The problem is, how to do it?
A nice essay about the traps and pitfalls of trying to make a program run faster: The Treacherous Optimization
Put another way, grep sells out its worst case (lots of partial matches) to make the best case (few partial matches) go faster. How treacherous! As this realization dawns on me, the room seemed to grow dim and slip sideways. I look up at the Ultimate Unix Geek, spinning slowly in his padded chair, and I hear his cackle "old age and treachery...", and in his flickering CRT there is a face reflected, but it's my ex girlfriend, and the last thing I see before I black out is a patch of yellow cheese powder inside her long tangled beard.
Still looking for more stuff to read? Don't miss Longreads.org's collection of "best long reads of 2012": guest posts tagged "best of 2012". There are some superb suggestinos in these lists

Thursday, December 13, 2012

Simplifying macroeconomics

I'm just a poor software engineer, so I was confused by all these complex stories about proper banking policy and intermediate treasury notes and pension funding projections.

Happily, a person who understands economics much better than I do has created an infographic which explains it all.

Tuesday, December 11, 2012

Some pre-holidays stuff I'm reading

Here's a random collection of this and that:

SANS Launches NetWars CyberCity to Train Cyber Warriors for Defense. I'm not sure why they feel like they need a physical city in which to train for virtual attacks, but they give their reasoning in the article, including:
NetWars CyberCity participants, which include cyber warriors from the Department of Defense and other defenders within the U.S. Government, will be tasked with protecting the city's critical infrastructure and systems as they come under attack. Cyber warriors will be presented with potential real-world attacks; their job is to defend against them. Missions will include fending off attacks on the city's power company, hospital, water system and transportation services.
The 8th International Conference on emerging Networking EXperiments and Technologies (CoNEXT). Looks like a solid conference, with many interesting papers to follow up on. As seems typical with networking conferences nowadays, the keynote is given by a Ncira employee:
While many of the original assumptions of SDN have turned out to be less relevant than expected, SDN continues to hold its promise for changing the networks as we know them. I argue the change will not materialize through any individual improvement, new feature, or mechanism but through new abstractions and structures further refining our understanding of network architectures. I will discuss two practically proven examples of such new abstractions and structures – network fabrics and network virtualization – and how they together with software forwarding can help to reorganize functionality within networks, making future networks flexible, evolvable and simple to manage.
Part II of Tim Roughgarden's Coursera class: Algorithms: Design and Analysis is superb. I think that partly this is because I'm more interested in the material, but partly it's because Professor Roughgarden is clearly much more comfortable with the technology in this second go-round. His lectures are clear and well-paced, his notes are matched, in all ways it just feels like he's comfortable using the tools to teach.
I love these notes on How to Hack Chipotle
Basically, you could get two whole avocados at the market for the price of one scoop. Consider ordering a burrito bowl, and adding your sliced avocado on top. Ultimately the guacamole decision is a value call and you have to look deep within your heart and wallet to make the decision that’s right for you. NOTE: If you forgo the meat, the guacamole is free.
Tell me, though, doesn't it make you think of that classic: The Ultimate In-N-Out Secret Menu (and Super Secret Menu!) Survival Guide?
Things just got a whole lot more fun. We proceeded to spend the next 15 minutes poring over our options, colluding like '80s kids in a clubhouse trading Garbage Pail Kids, expanding my original list with Thomas' insider information.
Sometimes it turns out that the island isn't there after all: No Land Ho: Sandy Island and the Age of Un-Discovery
A phantom island can be defined as ‘An island once believed to exist, and accordingly depicted on maps, but of which the existence was later disproved, and its cartographic representation removed’.
This looks intriguing: Scribe: The Deterministic Transparent Record/Replay Engine
Deterministic application record and replay is the ability to record application execution and deterministically replay it at a later time.
The ever-alert Raymond Chen reminds us that performance regression tests can provide two very different types of information, both of which are valuable: When studying performance, you need to watch out not only for performance degradation, but also unexpected performance improvement
The second is the one that you don't like: The unexplained improvement. The memory usage activity went down a lot, but you don't remember making any changes that affect your program's memory usage profile. Things got better but you don't know why.
The danger here is that the performance gain may be the result of a bug. Maybe the scenario completed with half the I/O activity because the storage system is ignoring flush requests. Or it completed 15% faster because the cache is returning false cache hits.

Saturday, December 8, 2012

PrairyErth: a very short review

I am reading a most remarkable book.

William Trogdon, who also goes by William Least Heat-Moon, is best known for his book Blue Highways.

I have not read Blue Highways.

I am not reading Blue Highways.

I am reading one of Least Heat-Moon's other books: PrairyErth (A Deep Map): An Epic History of the Tallgrass Prairie Country.

I don't really want to tell you what this book is about, partly because you'll come to the wrong conclusions, and partly because it isn't about what you think it's about.

Here's what Publisher's Weekly said:

Whereas Blue Highways dealt with Heat-Moon's auto trip across America, PrairyErth (an old term for heartland soils) records a journey mostly on foot across the tallgrass prairies and grasslands of Chase County, Kans. In a great cornucopia of a book, a majestic, healing hymn to America's potential, Heat-Moon attempts to penetrate the spirit of the land...

I guess that's actually not too far off.

But let me try two other descriptions:

PrairyErth is an entire book about a single county in rural Kansas.

PrairyErth is the story of a country and a people, and of how the influences of geography, history, culture, biology, and sheer dumb luck left us where we are.

Oh, heck, I don't really want to tell you what this book is "about", because you'll say to yourself: "sounds boring."

And it's not boring, at all.

In fact, I find myself savoring, inching, crawling through the book, a page at a time, enjoying every bit.

Least Heat-Moon calls himself "an inspector of the ordinary", which, I think, is just marvelously appropriate. He is the sort of author who can write an entire page about a hill, or a bush, or a gust of wind, or the doorstep of a house, and you'll find yourself simply enthralled. Here he is, just going for a walk, "in a chamber of absences where the near was the same as the far":

There are several ways not to walk in the prairie, and one of them is with your eye on a far goal, because you then begin to believe you're not closing the distance any more than you would with a mirage. My woodland sense of scale and time didn't fit this country, and I started wondering whether I could reach the summit before dark. On the prairie, distance and the miles of air turn movement to stasis and openness to a wall, a thing as difficult to penetrate as dense forest. I was hiking in a chamber of absences where the near was the same as the far, and it seemed every time I raised a step the earth rotated under me so that my foot fell just where it had lifted from. Limits and markers make travel possible for people: circumscribe our lines of sight and we can really get somewhere.

As it turns out, Least Heat-Moon doesn't need to "really get somewhere," as he is fascinated by where he is:

I'd come into the prairie out of some dim urge to encounter the alien -- it's easier to comprehend where someplace else is than where you are -- and I had begun to encounter it as I moved among the quoins, ledgers, pickled brains, winds, creek meanders, gravestones, stone-age circles. I was coming to see that facts carry a traveler only so far: at last he must penetrate the land by a different means, for to know a place in any real and lasting way is sooner or later to dream it. That's how we come to belong to it in the deepest sense.

(The "pickled brains," by the way, are an exhibit in the county museum.)

At times it may seem like Least Heat-Moon is just wandering, bouncing from topic to topic ("I'm scribbling things down", he says), but it's not like a child with ADD, unable to sustain any interest; rather, he is deeply, deeply interested, and finds himself interested in everything about Chase County, Kansas.

As Least Heat-Moon observes, this was not a likely place for people to settle, and it took effort, and a rather subtle form of deception, to convince them to travel so far, and to such a place; appealing to their sub-conscious urges was necessary:

Of the fifteen longitudinal streets, ten have the names of trees, and none of a prairie grass or native forb or legume. These names are fossil history of attitudes that town promoters employed to attract settlers from the eastern woodlands, people who had experienced little to prepare them for this big grassland, even the pioneers from the smaller and wetter prairies of Illinois and Indiana. The homesteaders brought with them a notion corroborated by their Christianity that this hugely open spread was a kind of failed forest that needed only the hand of civilized man to redeem it from its appalling waste, and they reversed here their usual practice of axing wilderness: they planted trees to remove it. Rather than learning what the prairie could provide and then changing their ways to harmonize with a land new to them, the settlers began trying to remake it into the East.

Hopefully I have whetted your appetite for this delightful book, and for Least Heat-Moon's insight, his clarity, and his superb phrasings and poetic voice.

You may think that you care nothing about Chase County, Kansas, and wonder why it would be worth the time to learn about it; I can only hope that this brief taste gives you the motivation to give PrairyErth a try.

The Right Stuff

Check out this feature in USA Today: Chuck Yeager, still soaring at 89.

"I'll be 90 in February, and while I'm not gonna run no marathon I still hunt and fish and fly," says Yeager, resting in the shade of a tail-dragger prop plane that he solos in regularly.

I love this:

"What good does it do to be afraid? It doesn't help anything," Yeager says. "You better try and figure out what's happening and correct it."

Still an inspiration, at age 89.

Friday, December 7, 2012

Jones v Anand

If you think grandmaster chess is boring, with lots of positional strategic maneuvering, you owe it to yourself to play through the game between World Champion Viswanathan Anand and England's Gawain Jones in round 5 of the London Chess Classic.

You can find the game here: Round 5 Games.

The Hindu explains a bit about what's going on: Anand tames Gawain Jones in London Chess

Surprisingly, Jones decided to give Anand a taste of his own medicine. The Samisch system, the Grunfeld defence has been deeply analysed by the world champion for the World Championship match against Gelfand earlier this year and Jones chose the same as white.
Anand went for a complicated variation and Jones was at sea right from the early stages of middle game. The Indian ace had no troubles eating the material that came his way and by move 20 everything was in control. Jones resigned after 29 moves.

I'll say it was a complicated variation! On move 12, Anand's knight moves quietly to f6, seemingly just to protect his king from the pawn advance of Jones. But three moves later the knight advances, apparently all alone, a lone soldier against the massive White forces.

Then two moves later out comes Anand's queen, and it becomes clear that all of Black's pieces are in perfect position, and in just a few more moves White is reduced to nothing.

Meanwhile, as The Hindu observes, Carlsen continues his phenomenal pace in this tournament:

The victory took Carlsen to an astonishing thirteen points from four games in the soccer-like scoring system in place here. Kramnik remains on the toes of the leader with eleven points in his kitty and the rest of the field is now far behind.

But don't count Nakamura out! He won again, too!

It's brilliant entertaining chess. Don't miss it!

Tuesday, December 4, 2012

Switching off the net

At times I wish I had my own switch to turn off the Internet...

But more seriously, last week's episode in Syria is interesting: was Syria off the Internet? How did it happen? Who caused it to happen? Why was that decision made, and why was it subsequently reversed?

Two interesting blogs have more details:

The Cloudflare blog focuses on the practical aspects of what happened and when:
- How Syria Turned Off the Internet
  The following network providers typically provide connectivity from Syria to the rest of the Internet: PCCW and Turk Telekom as the primary providers with Telecom Italia and TATA for additional capacity. When the outage happened, the BGP routes to Syrian IP space were all simultaneously withdrawn from all of Syria's upstream providers. The effect of this is that networks were unable to route traffic to Syrian IP space, effectively cutting the country off the Internet.
- Syrian Internet access reestablished starting 1432 UTC
  Traffic to the CloudFlare network from Syrian IP addresses appears to have returned to levels seen prior to the shutdown. Almost immediately after the first links were reestablished we saw traffic levels jump back up.
The Renesys blog takes a look at the broader implications for other regions of the world: Could It Happen In Your Country?
In some countries, international access to data and telecommunications services is heavily regulated. There may be only one or two companies who hold official licenses to carry voice and Internet traffic to and from the outside world, and they are required by law to mediate access for everyone else.
Renesys identify 61 countries where this situation is the most clearly-defined:
If you have only 1 or 2 companies at your international frontier, we classify your country as being at severe risk of Internet disconnection. Those 61 countries include places like Syria, Tunisia, Algeria, Turkmenistan, Libya, Ethiopia, Uzbekistan, Myanmar, and Yemen.
Of the 61 countries they identified, most are tiny island nations like Saint Pierre and Miquelon; in such cases both geography and country size mean that the presence of only 1 or 2 cross-border Internet gateways is little surprise, and (to me) does not necessarily correlate with any particular policy choices.

Meanwhile, since we're talking about duh Netz, I enjoyed this little rant from the Internet Governance Project's blog: Three Little ICANN Atrocities That Make the ITU Look Good By Comparison.

Fussing about the threat to the Internet posed by the International Telecommunication Union (ITU) is reaching that state of critical mass where media outlets write about it mainly because other media outlets are writing about it. The tacit assumption behind much of this fussing is that the status quo, exemplified by ICANN and other “multi-stakeholder institutions,” is doing a wonderful job and we should strive to preserve them.
But the status quo is not so wonderful.

The world changes, but the basic issues remain:

"Our liberty cannot be guarded but by the freedom of the press, nor that be limited without danger of losing it." --Thomas Jefferson to John Jay, 1786.

Saturday, December 1, 2012

Here comes Naka!

All eyes are focused on Magnus Carlsen as the London Chess Classic kicks off this weekend. Carlsen is currently just four rating points away from Garry Kasparov's all-time rating peak of 2851, and with Aronian, Kramnik, and Anand in the draw, Carlsen has some big opportunities to pick up those final few points that all have expected for so long.

But don't overlook Hikaru Nakamura. Although he's had a few ups and downs this fall, 2012 has been a wonderful year for Naka, with some tremendous victories.

The first round is now complete, and from all appearances this will be a fighting tournament: all 4 first round games were decisive, and in three of the four games the player with the black pieces was the victor.

Nakamura's victory, with black, over world #2 Levon Aronian, was just superb.

Is he poised for a breakout tournament, under the lights, with all the world's best in attendance?