Tuesday, August 26, 2014

Back to driving on the right side

We're back from a two-week trip to Ireland and England.

I'll try to put up some pictures and thoughts over the next few days.

But right now ...

... must ...

... sleep


Tuesday, August 12, 2014

Kasparov fails in FIDE election

The world of professional chess is again in turmoil.

The results of the FIDE presidency election are in, and once again Kirsan Ilyumzhinov has been reelected.

Now both Karpov and Kasparov have failed to unseat Ilyumzhinov.

What will happen next?

I have no idea.

Monday, August 11, 2014

ShadowRun Returns: DragonFall: a very short review

Last spring I talked briefly about ShadowRun Returns.

I had a few spare hours over the summer, so I picked up ShadowRun Returns: DragonFall.

Yes, just as the other reviewers have said, DragonFall is even better!

Better writing, better characters, more interesting tactics, great locations.

These are fun games!

I hope the folk over at HareBrained Schemes can keep up the great work.

I may have to try playing StrikeFleet Omega on my wife's iPad one of these days...

Sunday, August 10, 2014

Fun Home: a very short review

Somewhat accidentally, I found myself reading Alison Bechdel's Fun Home: A Family Tragicomic.

Let me see if I can set the stage for you a little bit.

It's a graphic novel.

About childhood.

Called Fun Home.

Nope, you're wrong: whatever you're thinking, you're wrong.

The Fun Home of the title was the funeral home where Bechdel's father worked, and her story is anything but a gentle reminiscence of her peaceful childhood days.

Bechdel more-or-less takes three swings at describing her childhood: once as she saw it as a child, growing up; once as she revisited it as an adult who had moved away; and once with the benefit of time and reflection. All three viewpoints are intertwined and interleaved: she dances around, describing the same events and observations from different angles and distances.

The book is beautifully written and drawn, and the story is fascinating. Bechdel is a lively and literate author, and I found her literary allusions, her cultural observations, and her autobiographic reflections to be compelling, even riveting.

But the tale she tells is raw and heartbreaking; it is indeed a tragic story from a tragic time.

I'm glad I read it, although I guess I should be a little bit more careful what books I "stumble into," because here, to strain the old proverb, you certainly can't tell a book from its cover.

Saturday, August 9, 2014

git rebase

I've been studying the heck out of git rebase recently.

I mean, I've really been making an effort. It's no exaggeration to say that for the last 3 months, the primary focus of my (professional) life has been to really, deeply, truly understand git rebase.

This is no trivial task. To start with, about all you can say about the documentation for git rebase is that it's a disaster. For a piece of software as powerful, flexible, and astonishingly amazing as git rebase, its official documentation is a mirror image of that. It's more than just badly written and misleading; it's almost as though it actively defies your attempt to understand.


With any piece of truly sophisticated software, I've found that there are always three levels of understanding:

  1. What does it do?
  2. How does it do that?
  3. WHY would you do that?

I think that software shares this with many other human creations, like cooking, or home repair, or jet aircraft design. At one level, there are recipes, and you can learn to follow the recipes, and make a nice Creme Brulee: now you know what the recipe does.

And then, you can study some more, and you can learn how the recipe works: that you cook the custard in a water bath to insulate the cream-egg mixture from the oven's heat and prevent the eggs from cooking too fast, and that you caramelize the sugar not just to change its taste, but also to provide a different dimension to the dessert by forming a hard shell over the smooth creamy custard below.

But you still don't know why you should use this recipe, when it is the right thing to do, and when it is not.

Well, anyway, I don't want to talk about cooking, because I'm an engineer at heart, not a chef.

Rather, I want to talk about git rebase.

And how hard it is to achieve that third level of understanding.

Because, even though the documentation is horrific (did I mention that already?), you can fairly quickly pick up the ideas about what you can do with git rebase:

  • Bring a branch up to date with its parent
  • Re-arrange the work in a branch, perhaps splitting it into several branches, or re-parenting it on a different parent branch
  • Revise the work in a branch, perhaps eliminating some of the work, excising some of the clutter left by mis-steps or dead ends or false summits, or re-ordering it, or collapsing and removing some of the intermediate steps
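
To make the first of these a bit more concrete, here is a minimal sketch in Python of what "bringing a branch up to date" amounts to. It is a toy model, not git's implementation: Commit, commits_since, and rebase are hypothetical names, and the ID here hashes only the parent's ID and a patch label, whereas real git hashes considerably more (tree, parents, author, committer, message). In real git the rough equivalent would be running `git rebase master` from the topic branch.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Commit:
    """Toy commit: its ID is derived from the parent's ID plus the change."""
    patch: str                       # stand-in label for the change itself
    parent: Optional["Commit"] = None

    @property
    def id(self) -> str:
        parent_id = self.parent.id if self.parent is not None else "root"
        return hashlib.sha1(f"{parent_id}:{self.patch}".encode()).hexdigest()[:7]


def commits_since(tip: Commit, base: Commit) -> List[Commit]:
    """Commits reachable from tip but not from base, oldest first."""
    base_ids = set()
    node: Optional[Commit] = base
    while node is not None:
        base_ids.add(node.id)
        node = node.parent

    unique: List[Commit] = []
    node = tip
    while node is not None and node.id not in base_ids:
        unique.append(node)
        node = node.parent
    return list(reversed(unique))


def rebase(tip: Commit, onto: Commit) -> Commit:
    """Replay the commits unique to tip on top of onto; return the new tip."""
    new_tip = onto
    for old in commits_since(tip, onto):
        # Same change, different parent, therefore a brand new commit.
        new_tip = Commit(old.patch, new_tip)
    return new_tip


# History: A is shared; master has moved on to B and C; topic added X and Y.
a = Commit("A")
master = Commit("C", Commit("B", a))
topic = Commit("Y", Commit("X", a))

rebased = rebase(topic, master)
print([c.patch for c in commits_since(rebased, master)])  # the replayed changes
print(topic.id, "->", rebased.id)                         # old tip ID vs new tip ID
```

The changes on the branch are the same after the replay, but every replayed commit has a new parent and therefore a new ID. That one detail turns out to matter a great deal.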

And, although the documentation doesn't help with this, you can, with a bit more study, understand how git rebase accomplishes these tasks.

To get a feel for how git rebase does what it does, let me recommend these resources:

  1. the Git Rebasing chapter in Scott Chacon's book
    In this section you’ll learn what rebasing is, how to do it, why it’s a pretty amazing tool, and in what cases you won’t want to use it.
  2. Git for Computer Scientists
    Quick introduction to git internals for people who are not scared by words like Directed Acyclic Graph.
  3. Git from the bottom up
    This state of affairs most directly represents what we’d like done: for our local, development branch Z to be based on the latest work in the main branch D. That’s why the command is called “rebase”, because it changes the base commit of the branch it’s run from.

Of course, there are many more resources for git, but these are among my favorites.

So after some amount of time reading, and experimenting, and thinking, you will find that you can understand what git rebase does, and how it does it.
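
For the "revise the work in a branch" case, the mental model that helped is a todo list that gets replayed from top to bottom. The following is only a sketch under stated assumptions: apply_todo and commit_id are hypothetical helpers, the actions are a small subset loosely modeled on the verbs in the `git rebase -i` todo list, and the ID scheme is a toy, not git's.

```python
import hashlib
from typing import List, Tuple


def commit_id(parent_id: str, patch: str) -> str:
    """Toy ID: depends on the parent's ID and on the change (not git's real scheme)."""
    return hashlib.sha1(f"{parent_id}:{patch}".encode()).hexdigest()[:7]


def apply_todo(base_id: str, todo: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Replay (action, patch) pairs on top of base_id.

    Returns the rewritten branch as (id, patch) pairs, oldest first.
    """
    patches: List[str] = []
    for action, patch in todo:
        if action == "pick":        # keep the commit as it is
            patches.append(patch)
        elif action == "squash":    # fold the change into the previous commit
            if not patches:
                raise ValueError("nothing to squash into")
            patches[-1] = patches[-1] + " + " + patch
        elif action == "drop":      # leave the commit out entirely
            continue
        else:
            raise ValueError(f"unknown action: {action}")

    rewritten: List[Tuple[str, str]] = []
    parent = base_id
    for patch in patches:
        parent = commit_id(parent, patch)
        rewritten.append((parent, patch))
    return rewritten


todo = [
    ("pick", "add parser"),
    ("squash", "fix typo in parser"),
    ("drop", "debug printf"),
    ("pick", "add tests"),
]
rewritten = apply_todo("base", todo)
for cid, patch in rewritten:
    print(cid, patch)

# Re-ordering is just editing the list; the replay produces different IDs.
reordered = apply_todo("base", [("pick", "add tests"), ("pick", "add parser")])
```

Four messy commits become two tidy ones, and re-ordering is nothing more than moving lines around before the replay.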

But why? Why would you choose to forward-port commits? Why would you choose to re-order, re-word, squash, split, or fixup commits? What is the purpose of this incredibly powerful tool, and how are you ever going to understand how to use it properly, and not instead shy away from it in complete terror?

Unfortunately, I can no longer recall where I stumbled across this resource, but somehow I found Norman Yarvin's web site, where he has gathered assorted collections of stuff.

And, one of those collections is an amazing series of email messages from Linus Torvalds (and a few others): git rebase.

Now, this is not easy stuff to read. Don't plunge into it until you've gone through the other resources first.

But when you're ready, and you've done your homework, go and read what Linus has to say about when and why you should use rebase, and think deeply about passages like:

But if you do a true merge, the bug is clearly in the merge (automatedly clean or not), and the blame is there too. IOW, you can blame me for screwing up. Now, I will say "oh, me bad, I didn't realize how subtle the interaction was", so it's not like I'll be all that contrite, but at least it's obvious where the blame lies.

In contrast, when you rebase, the same problem happens, but now a totally innocent commit is blamed just because it happened to no longer work in the location it was not tested in. The person who wrote that commit, the people who tested it and said it works, all that work is now basically worthless: the testing was done with another version, the original patch is bad, and the history and _reason_ for it being bad has been lost.

And there's literally nothing left to indicate the fact that the patch and the testing _used_ to be perfectly valid.

That may not sound like such a big deal, but what does that make of code review and tested-by, and the like? It just makes a mockery of trying to do a good job testing any sub-trees, when you know that eventually it will all quite possibly be pointless, and the fact that maybe the networking tree was tested exhaustively is all totally moot, because in the end the stuff that hit the main tree is something else altogether?

Don't get me wrong at all. Rebasing is fine for stuff you have committed yourself (which I assume was the case here).

Rebasing is also a fine conflict resolution strategy when you try to basically turn a "big and complex one-time merge conflict" into "multiple much smaller ones by doing them one commit at a time".

But what rebasing is _not_ is a fine "default strategy", especially if other people are depending on you.

What I do try to encourage is for people to think publicising their git trees as "version announcements". They're obviously _development_ versions, but they're still real versions, and before you publicize them you should try to make sure that they make sense and are something you can stand behind.

And once you've publicized them, you don't know who has that tree, so just from a sanity and debugging standpoint, you should try to avoid mucking with already-public versions. If you made a mistake, add a patch on top to fix it (and announce the new state), but generally try to not "hide" the fact that the state has changed.

But it's not a hard rule. Sometimes simple cleanliness means that you can decide to go "oops, that was *really* wrong, let's just throw that away and do a whole new set of patches". But it should be something rare - not normal coding practice.

Because if it becomes normal coding practice, now people cannot work with you sanely any more (ie some random person pulls your tree for testing, and then I pull it at some other time, and the tester reports a problem, but now the commits he is talking about don't actually even exist in my tree any more, and it's all really messy!).

Rebasing branches is absolutely not a bad thing for individual developers.

But it *is* a bad thing for a subsystem maintainer.

So I would heartily recommend that if you're a "random developer" and you're never going to have anybody really pull from you and you *definitely* don't want to pull from other peoples (except the ones that you consider to be "strictly upstream" from you!), then you should often plan on keeping your own set of patches as a nice linear regression.

And the best way to do that is very much by rebasing them.

That is, for example, what I do myself with all my git patches, since in git I'm not the maintainer, but instead send out my changes as emails to the git mailing list and to Junio.

So for that end-point-developer situation "git rebase" is absolutely the right thing to do. You can keep your patches nicely up-to-date and always at the top of your history, and basically use git as an efficient patch-queue manager that remembers *your* patches, while at the same time making it possible to efficiently synchronize with a distributed up-stream maintainer.

So doing "git fetch + git rebase" is *wonderful* if all you keep track of is your own patches, and nobody else ever cares until they get merged into somebody elses tree (and quite often, sending the patches by email is a common situation for this kind of workflow, rather than actually doing git merges at all!)

So I think 'git rebase' has been a great tool, and is absolutely worth knowing and using.

*BUT*. And this is a pretty big 'but'.

BUT if you're a subsystem maintainer, and other people are supposed to be able to pull from you, and you're supposed to merge other peoples work, then rebasing is a *horrible* workflow.


It's horrible for multiple reasons. The primary one being because nobody else can depend on your work any more. It can change at any point in time, so nobody but a temporary tree (like your "linux-next release of the day" or "-mm of the week" thing) can really pull from you sanely. Because each time you do a rebase, you'll pull the rug from under them, and they have to re-do everything they did last time they tried to track your work.

But there's a secondary reason, which is more indirect, but despite that perhaps even more important, at least in the long run.

If you are a top-level maintainer or an active subsystem, like Ingo or Thomas are, you are a pretty central person. That means that you'd better be working on the *assumption* that you personally aren't actually going to do most of the actual coding (at least not in the long run), but that your work is to try to vet and merge other peoples patches rather than primarily to write them yourself.

And that in turn means that you're basically where I am, and where I was before BK, and that should tell you something. I think a lot of people are a lot happier with how I can take their work these days than they were six+ years ago.

So you can either try to drink from the firehose and inevitably be bitched about because you're holding something up or not giving something the attention it deserves, or you can try to make sure that you can let others help you. And you'd better select the "let other people help you", because otherwise you _will_ burn out. It's not a matter of "if", but of "when".

Now, this isn't a big issue for some subsystems. If you're working in a pretty isolated area, and you get perhaps one or two patches on average per day, you can happily basically work like a patch-queue, and then other peoples patches aren't actually all that different from your own patches, and you can basically just rebase and work everything by emailing patches around. Big deal.

But for something like the whole x86 architecture, that's not what the situation is. The x86 merge isn't "one or two patches per day". It easily gets a thousand commits or more per release. That's a LOT. It's not quite as much as the networking layer (counting drivers and general networking combined), but it's in that kind of ballpark.

And when you're in that kind of ballpark, you should at least think of yourself as being where I was six+ years ago before BK. You should really seriously try to make sure that you are *not* the single point of failure, and you should plan on doing git merges.

And that absolutely *requires* that you not rebase. If you rebase, the people down-stream from you cannot effectively work with your git tree directly, and you cannot merge their work and then rebase without SCREWING UP their work.

The PCI tree merged the suspend branch from the ACPI tree. You can see it by looking at the PCI merge in gitk:

 gitk dc7c65db^..dc7c65db

and roughly in the middle there you'll find Jesse's commit 53eb2fbe, in which he merges branch 'suspend' from Len's ACPI tree.

So Jesse got these three commits:

 0e6859d... ACPI PM: Remove obsolete Toshiba workaround
 8d2bdf4... PCI ACPI: Drop the second argument of platform_pci_choose_state
 0616678... ACPI PM: acpi_pm_device_sleep_state() cleanup

from Len's tree. Then look at these three commits that I got when I actually merged from you:

 741438b... ACPI PM: Remove obsolete Toshiba workaround
 a80a6da... PCI ACPI: Drop the second argument of platform_pci_choose_state
 2fe2de5... ACPI PM: acpi_pm_device_sleep_state() cleanup

Look familiar? It's the same patches - just different commit ID's. You rebased and moved them around, so they're not really the "same" at all, and they don't show the shared history any more, and the fact that they were pulled earlier into the PCI tree (and then into mine).

This is what rebasing causes.

So rebasing and cleanups may indeed result in a "simpler" history, but it only looks that way if you then ignore all the _other_ "simpler" histories. So anybody who rebases basically creates not just one simple history, but _many_ "simple" histories, and in doing so actually creates a potentially much bigger mess than he started out with!

As long as you never _ever_ expose your rewriting of history to anybody else, people won't notice or care, because you basically guarantee that nobody can ever see all those _other_ "simpler" histories, and they only see the one final result. That's why 'rebase' is useful for private histories.

But even then, any testing you did in your private tree is now suspect, because that testing was done with the old history that you threw away. So even if you delete all the old histories and never show them, they kind of do exist conceptually - they existed in the sense that you tested them, and you've just hidden the fact that what you release is different from what you tested.


That was a lot of quoting, and I'm sorry to do that.

But so many of the web pages out there only point to Linus's Final Word On The Subject.

You know, the one which reads:

I want clean history, but that really means (a) clean and (b) history.

Now, that last essay is indeed brilliant, and you should print it out, and post it on your wall, and read it every morning, and think about what it is he's trying to say.

But if you just can't figure it out, well, go digging in the source material.

And then, I believe, it will finally all make sense.

At least, it finally did, to me.
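
As a postscript, the point in the quoted messages about "the same patches - just different commit ID's" can be sketched in a few lines of Python. This is a toy model under loud assumptions: chain is a hypothetical helper, the patch and base labels are made up, and the IDs hash only a parent ID and a label rather than what real git hashes.

```python
import hashlib
from typing import List


def chain(base_id: str, patches: List[str]) -> List[str]:
    """Toy commit IDs for patches applied in order on top of base_id."""
    ids: List[str] = []
    parent = base_id
    for patch in patches:
        parent = hashlib.sha1(f"{parent}:{patch}".encode()).hexdigest()[:7]
        ids.append(parent)
    return ids


patches = ["patch one", "patch two", "patch three"]  # hypothetical labels

# A downstream tree merged the three commits as they were first published.
published = chain("old-base", patches)
downstream = list(published)          # a merge keeps the very same commits

# Upstream then rebased: the same three patches, replayed on a new base.
rebased = chain("new-base", patches)

shared_after_merge = set(published) & set(downstream)
shared_after_rebase = set(rebased) & set(downstream)

print(len(shared_after_merge))   # commits both sides can recognize
print(len(shared_after_rebase))  # commits both sides can recognize after the rebase
```

With a merge, both trees hold identical commits and the shared history is visible. After the rebase, the contents of the patches are unchanged, yet nothing in the two trees matches by ID, so the link between them is gone.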

Friday, August 8, 2014

Stuff I'm reading, lazy Friday afternoon edition

So here we are in the dog days of summer. Woof!

As usual, doncha know, I'm all over the place:

  • Too bad everything seems to be a "freemium" game nowadays. Godus: Another Baffling, Bizarre Peter Molyneux Game
    Speaking of light touch, the most interesting thing about Godus is that you’re actually touching your world and peoples instead of wielding a mouse pointer. There’s something about delicately tapping the landscape, even through a glass veneer, that personalizes the experience. If keyboard and mouse exemplify media theorist Marshall McLuhan’s notion that inanimate objects become extensions of ourselves, Godus is partly about removing that extension: You’re god, after all, and god doesn’t play dice with a gamepad.
  • Planet Generation - Part I
    For this chapter I will start by making the simple geometry for the basis of the planet. At later posts I will add more detail including height data, lighting, atmospheric scattering and level of detail which will allow the amount of data the planet contains to increase dramatically.
  • Planet Generation - Part II
    Libnoise is a portable, open-source, coherent noise-generating library for C++. This sounds good for my needs as I am running the code on my Nexus 5 Android phone and the library supports a SetSeed function so it fills the requirement of being able to replicate the same results given a seed.
  • This is a really super essay about the role of plot and narrative in video game implementations: Designing game narrative
    Here’s an example from the first Portal game. In this game, you play as a test subject with a portal gun, trying to advance through different test chambers. Near the end, you are riding a slowly moving platform to what you are told is a reward for your good test performance. Suddenly, it’s revealed that the platform is actually taking you to a fiery death. When I was playing this scene, I genuinely panicked: I was deeply immersed in the game at this point, feeling good about myself for beating the puzzles, ready to be rewarded for it, and now I was being betrayed. Without thinking, my eyes lead me to an ideal surface for firing my portal gun, and I created an exit for myself, escaping certain death. For just a moment, I genuinely thought I broke the system. I had outsmarted the enemy with my wits!
  • Announcing UberPool, Carpooling with Uber
    This is also a bold social experiment. There’s the interaction between riders in an UberPool—should they talk to each other? When is that cool and when is it, well, annoying? We’re going to find out how this brave new world of UberPooling works—we’ll iterate on this beta product and get it right, because the larger social implications of reducing the number of cars on the road, congestion in cities, pollution, parking challenges… are truly inspiring.
  • The BobbyTables Culture
    Countless posts on Stack Overflow are vulnerable to SQL injection attacks. Along with several other users, I always raise this when it shows up – this is something that really just shouldn’t happen these days. It’s a well-understood issue, and parameterized SQL is a great solution in almost all cases. (No, it doesn’t work if you want to specify a column or table name dynamically. Yes, whitelisting is the solution there.)
  • Your computer is already a distributed system. Why isn’t your OS?
    Our goal is to make it easier to design and construct robust OSes that effectively exploit heterogeneous, multi-core hardware at scale. We approach this through a new OS architecture resembling a distributed system.
  • The Network is Reliable
    much of what we believe about the failure modes of real-world distributed systems is founded on guesswork and rumor. Sysadmins and developers will swap stories over beer, but detailed, public postmortems and comprehensive surveys of network availability are few and far between. In this article, we'd like to informally bring a few of these stories (which, in most cases, are unabashedly anecdotal) together. Our focus is on descriptions of actual network behavior when possible and (more often), when not, on the implications of network failures and asynchrony for real-world systems deployments. We believe this is a first step toward a more open and honest discussion of real-world partition behavior, and, ultimately, toward more robust distributed systems design.
  • The US Intelligence Community has a Third Leaker
    Everyone's miscounting.
  • Building Carousel, Part II: Speeding Up the Data Model
    In Carousel, we wanted to fix the experience for users with more than a single page of photos by using a different model for loading photo metadata off of disk. When it comes time to render the view to the user, it doesn’t matter how we store the user’s data on disk as long as we can quickly bind data to the view. We prepare an in-memory “view model”, a data structure that provides exactly this: an index on our metadata that is fast enough to query on the UI thread at render time.
  • Open Source Dickishness
    In a blog post, StrongLoop announced the move as a great next step in the evolution of the project. The blog post masks a commercial transaction as an act of good will by calling it a “transfer of sponsorship”. If all they wanted was to “pitch in and help”, why did they need to take over and move the project? Why is their first public act a blog post and not a pull request?
  • StrongLoop & Express
    Did I consult every contributor that there has ever been on the project? No, maybe I should have? Ultimately I don’t see this as the huge problem that everyone else does, the two primary contributors benefit, the community benefits by having full-time employees improve a project, the company benefits from being closely associated with the project.
  • Non-Transparent Memory Safety
    after using Deputy for a while, its genius became apparent. First, whenever I needed to tell Deputy something, the information was always available either in my head or in a convenient program variable. This is not a coincidence: if the information that Deputy requires is not available, then the code is probably not memory safe. Second, the annotations become incredibly useful documentation: they take memory safety information that is normally implicit and put it out in the open in a nice readable format. In contrast, a transparent memory safety solution is highly valuable at runtime but does not contribute to the understandability and maintainability of our code.
  • Princeton likely to rescind grade deflation policy
    After the policy went into effect in 2005, grades were flat for a few years and then started rising again. So what changed grading practices was not the 35% guideline but the simple fact that faculty were discussing and thinking more deeply about grading policy during the period before the current policy was even a concrete proposal. The policy that worked was “grade mindfully”, not “give 35% A’s”.
  • Silver Village
    Historic structures in the mountains of California are being wrapped, Christo-style, in reflective silver sheets to help protect them against the heat of wildfires.
  • Today is a good day. I just had a call from a telemarketer.
    Like a good IT administrator I put my skills to use for their benefit.
  • Over a Billion Passwords Stolen?
    I've been doing way too many media interviews over this weird New York Times story that a Russian criminal gang has stolen over 1.2 billion passwords.
  • Bruce Schneier is skeptical, but Brian Krebs says it's legit: Q&A on the Reported Theft of 1.2B Email Accounts
    Alex isn’t keen on disclosing his methods, but I have seen his research and data firsthand and can say it’s definitely for real. Without spilling his secrets or methods, it is clear that he has a first-hand view on the day-to-day activities of some very active organized cybercrime networks and actors.
  • The hidden perils of cookie syncing
    The most common use of cookie syncing is to enable real-time bidding between several entities in an ad auction. It allows the bidder and the ad network to refer to the user by the same ID so that the bidder can place bids on a particular user in current and future auctions. Cookie syncing raises subtle yet serious privacy concerns, but due to the technical complexity of explaining it, didn’t receive much press coverage. In this post I’ll explain cookie syncing and why it’s worrisome — even more so than canvas fingerprinting.
  • A Closer Look At Personas: What They Are And How They Work (Part 1)
    A persona is a way to model, summarize and communicate research about people who have been observed or researched in some way. A persona is depicted as a specific person but is not a real individual; rather, it is synthesized from observations of many people. Each persona represents a significant portion of people in the real world and enables the designer to focus on a manageable and memorable cast of characters, instead of focusing on thousands of individuals.
  • PostgreSQL page size for SSD
    What struck me is that there is a significant impact of smaller page size on OLTP performance (6% better for 4 kB, 10% better for 2 kB, with 8 kB as the reference), suggesting that the 8 kB default is not necessarily the best choice, especially when using SSD.
  • Explaining Ark Part 4: Fixing Majority Write Concern
    When a replica set has two primaries, the two primaries should never produce oplog entries whose positions interleave, and primary that produces smaller oplog positions should step down.
  • Everything We Know About Facebook's Secret Mood Manipulation Experiment
    For one week in January 2012, data scientists skewed what almost 700,000 Facebook users saw when they logged into its service. Some people were shown content with a preponderance of happy and positive words; some were shown content analyzed as sadder than average. And when the week was over, these manipulated users were more likely to post either especially positive or negative words themselves.
  • Did OkCupid send a bunch of incompatible people on dates on purpose?
    The question is whether the potential damage of the interventions justifies the lack of informed consent outside of OkCupid's normal terms and conditions. The first experiment is the least troubling on these grounds, since everyone was informed the change was coming, rather than it being rolled out surreptitiously
  • We Experiment On Human Beings!
    We noticed recently that people didn’t like it when Facebook “experimented” with their news feed. Even the FTC is getting involved. But guess what, everybody: if you use the Internet, you’re the subject of hundreds of experiments at any given time, on every site. That’s how websites work.
  • Why don't OKCupid's experiments bother us like Facebook's did? 
    The tone of the OKC post is just so darned charming. Rudder is casual, self-deprecating. It's a blog post! Meanwhile, Facebook's "emotional contagion" scholarly paper was chillingly matter-of-fact. In short, the scientism of the thing just creeped us the fuck out.
  • Premier League preview links
    It’s a week until the season starts, so here are some club by club preview links
  • If Google was a Guy

The Hannibal Procedure

I'm following up on a question I wrote about a week ago: What are Halachic Considerations?.

A fascinating and intense and disturbing story in the New York Times sheds some light: Israeli Procedure Reignites Old Debate.

It was one of the rare invocations of the Israeli military’s “Hannibal procedure,” one of its most dreaded and contentious directives, which allows commanders to call in extra troops and air support to use maximum force to recapture a lost soldier. Its most ominous clause states that the mission is to prevent the captors from getting away with their captives, even at the risk of harming or endangering the lives of the captured Israeli soldiers.

It has been official procedure of the Israeli military for decades:

The Hannibal edict was drawn up by three senior officers in Israel’s northern command in the 1980s after two Israeli soldiers were captured by Hezbollah in Lebanon.

But it doesn't fully explain the recent event:

There was no contact or engagement between the soldiers who entered the tunnel and the captors, Colonel Lerner said. But he said some evidence found in the tunnel later helped the military determine that Lieutenant Goldin could not have survived the initial attack. He was declared killed in action by late Saturday night.

I know more than I did a week ago, perhaps less than I could, but perhaps as much as I should, or need.

And I appreciate the journalist who helped me understand things a bit better.