Wednesday, March 4, 2015

Sorry, but now we know: you're a dog.

Twenty years ago, The New Yorker published Peter Steiner's wonderful send-up of the Internet, perhaps the greatest Internet cartoon ever drawn: "On the Internet, nobody knows you're a dog."

Well, twenty years have passed.

In his talk at this year's GDC, Raph Koster picks up the story, and Gamasutra's Leigh Alexander summarizes it for us.

We now live in an age where the internet filters results for you based on assumptions about what you're like drawn from geographic location or other patterns. This creates a phenomenon called a "filter bubble," says Koster, where increasingly one's perception of the world is led by online targeting. Your average online user will increasingly see only those news sources and even political candidates that agree with their own views -- and anyone who's ever Facebook-blocked a relative with offensive political views has become complicit in this sort of filtering.


Without clear lines and well-bounded communities, people can become confused in a way that leads to conflict. For example, with Kickstarter and Early Access games, users become hostile and controversies arise because the distinction between a "customer" and a "funder", a creator and a user, is indistinct; confusion about different types of roles and relationships within the game industry gave rise to last year's "controversy that shall not be named."

When you look at modern communities, from Twitter and Reddit to Facebook and chan boards, all the best practices -- keeping identity persistent, having a meaningful barrier to entry, specific roles, groups well-bordered and full anonymity at least somewhat difficult -- have been thrown out the window, giving rise to toxicity online, the veterans say.

As Koster notes on his blog, the discussion continues across the gaming community.

But it's not just gaming. For that, we turn to the always-fascinating, if complex, Don Marti, with his latest note: Digital dimes in St. Louis.

Don points to Jason Kint's recent article: Unbridled Tracking and Ad Blocking: Connect the Dots.

As part of my presentation, I shared a couple of charts that caused a bit of a stir among the audience of media execs charged with leading their organizations' digital media ad sales businesses. The fact that these particular slides triggered such a reaction struck me as particularly timely because later that day the White House released its proposal for a Consumer Privacy Bill of Rights, which would require companies to clearly communicate their data practices to consumers and give them more control over what information is collected and how it is used.

Don wonders what all this tracking technology is destroying:

So how do we keep the local papers, the people who are doing the hard nation-protecting work of Journalism, going?

Where is all this rooted? Koster suggests that it's that gruesome and awful creation of technology, the filter bubble that Eli Pariser identified some five years ago:

For example, on Google, most people assume that if you search for BP, you’ll get one set of results that are the consensus set of results in Google. Actually, that isn’t true anymore. Since Dec. 4, 2009, Google has been personalized for everyone. So when I had two friends this spring Google “BP,” one of them got a set of links that was about investment opportunities in BP. The other one got information about the oil spill. Presumably that was based on the kinds of searches that they had done in the past. If you have Google doing that, and you have Yahoo doing that, and you have Facebook doing that, and you have all of the top sites on the Web customizing themselves to you, then your information environment starts to look very different from anyone else’s. And that’s what I’m calling the “filter bubble”: that personal ecosystem of information that’s been catered by these algorithms to who they think you are.

It's all terribly complex; it's not precisely clear where the slippery slope began, or how we crossed the line.

But the first step to getting out of the hole is to stop digging.

And to do that, you have to have the discussion; you have to know you're in the hole.

Thank you, Messrs. Koster and Marti and Kint and Pariser. It's not a happy story to shout to the world, but you must keep telling it.

Saturday, February 28, 2015

Some long-form articles worth your time

Nothing connects these articles, other than that they're all interesting (to me, that is).

And they're all long.

  • Invasion of the Hedge Fund Almonds
    Our increasing fondness for nuts—along with a $28-million-a-year marketing campaign by the Almond Board of California—are part of what has prompted the almond boom. But the main driver comes from abroad. Nearly 70 percent of California's almond crop is exported, with China the leading customer: Between 2007 and 2013, US almond exports to China and Hong Kong more than quadrupled, feeding a growing middle class' appetite for high-protein, healthy food. Almonds now rank as the No. 1 US specialty crop export, beating wine by a count of $3.4 billion to $1.3 billion in 2012. (Walnuts and pistachios hold the third and fourth spots, each bringing in more than $1 billion in foreign sales.) As a result, wholesale almond prices jumped 78 percent between 2008 and 2012, even as production expanded 16 percent.

    According to UC-Davis' Howitt, the shift to almonds and other tree nuts is part of a long-term trend in California, the nation's top agricultural state. Farmers in the Central Valley once grew mostly wheat and cattle. But over time, they have gravitated toward more-lucrative crops that take advantage of the region's rare climate. "It's a normal, natural process driven by market demand," Howitt says. "We grow the stuff that people buy more of when they have more money." Like nuts, which can replace low-margin products such as cotton, corn, or beef.

  • How crazy am I to think I actually know where that Malaysia Airlines plane is?
    Meanwhile, a core of engineers and scientists had split off via group email and included me. We called ourselves the Independent Group, or IG. If you found yourself wondering how a satellite with geosynchronous orbit responds to a shortage of hydrazine, all you had to do was ask. The IG’s first big break came in late May, when the Malaysians finally released the raw Inmarsat data. By combining the data with other reliable information, we were able to put together a time line of the plane’s final hours: Forty minutes after the plane took off from Kuala Lumpur, MH370 went electronically dark. For about an hour after that, the plane was tracked on radar following a zigzag course and traveling fast. Then it disappeared from military radar. Three minutes later, the communications system logged back onto the satellite. This was a major revelation. It hadn’t stayed connected, as we’d always assumed. This event corresponded with the first satellite ping. Over the course of the next six hours, the plane generated six more handshakes as it moved away from the satellite.
  • Proving that Android’s, Java’s and Python’s sorting algorithm is broken (and showing how to fix it)
    After we had successfully verified Counting and Radix sort implementations in Java (J. Autom. Reasoning 53(2), 129-139) with a formal verification tool called KeY, we were looking for a new challenge. TimSort seemed to fit the bill, as it is rather complex and widely used. Unfortunately, we weren’t able to prove its correctness. A closer analysis showed that this was, quite simply, because TimSort was broken and our theoretical considerations finally led us to a path towards finding the bug (interestingly, that bug appears already in the Python implementation). This blog post shows how we did it.
  • Mastering Git submodules
    Submodules are hair-pulling for sure, what with their host of pitfalls and traps lurking around most use cases. Still, they are not without merits, if you know how to handle them.

    In this post, we’ll dive deep into Git submodules, starting by making sure they’re the right tool for the job, then going through every standard use case, step by step, so as to illustrate best practices.

  • Mastering Git subtrees
    A month ago we were exploring Git submodules; I told you then our next in-depth article would be about subtrees, which are the main alternative.

    As before, we’ll dive deep and perform every common use-case step by step to illustrate best practices.
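
The TimSort item above is about industrial-strength formal verification, but the underlying idea, checking an implementation against a specification, scales down nicely. Here's a minimal property-style check in Python (a sketch of my own, not anything from the paper): compare a candidate sort against the built-in sorted() as an oracle over random inputs.

```python
import random

def is_sorted(xs):
    """True if xs is in non-decreasing order."""
    return all(a <= b for a, b in zip(xs, xs[1:]))

def check_sort(sort_fn, trials=200, max_len=50, seed=0):
    """Property check: sort_fn must agree with the built-in sorted()
    (the oracle) on randomly generated integer lists."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, max_len))]
        out = sort_fn(list(xs))
        if not is_sorted(out) or out != sorted(xs):
            return False, xs  # counterexample found
    return True, None

# The real sort passes; a deliberately broken "sort" is caught.
ok, _ = check_sort(sorted)
bad_ok, counterexample = check_sort(lambda xs: xs)  # identity is not a sort
```

A check like this would not have found the TimSort bug, which required reasoning about the merge invariant on pathological inputs, but it is a cheap first line of defense before reaching for a tool like KeY.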

Friday, February 27, 2015

Rough news for the Derby project

Nearly 20 years ago, when Java was just emerging as an exciting new programming language, a small software company named "Cloudscape" was started up to build database software in Java.

Although I never worked at Cloudscape, their offices were only a couple blocks from my office, and I knew a number of the principal engineers very well.

Cloudscape assembled a superb engineering team and built a powerful product, but struggled to find commercial success as the "Dot Com Bubble" burst around the end of the 1990s. In 1999, Cloudscape was acquired by Informix, and in 2001 Informix was acquired by IBM.

Ten years ago, in the summer of 2004, IBM contributed the code to the Apache Software Foundation as Derby.

For many years, Derby was one of the most active and most successful projects at Apache, with dozens of committers and contributors building new features and fixing bugs, and the project produced release after release after release of new software. Both IBM and Sun Microsystems made substantial commitments to the project, providing material resources such as testing labs and equipment, but more importantly employing some of the most talented engineers I've ever had the pleasure of working with, and enabling those engineers to work on Derby.

It was an open source nirvana.

But in recent years, the community has struggled.

Sun Microsystems, of course, collapsed during the Great Recession of 2008, and in 2009 was sold to Oracle Corporation. IBM remains an independent corporation but is suffering greatly as well.

The end result is that, over the last year, both Oracle and IBM have essentially halted their support of the Derby project. Certainly in both cases this was done for valid and undoubtedly necessary business reasons, but the impact on the Derby project is severe.

It's hard for a non-programmer to understand the attachment that a programmer feels to their code. It's just an inanimate thing, code, but when you spend 20 years devoting almost every waking minute to thinking about it, and concentrating on it, and giving it your best, you grow powerfully attached to that code.

I feel bad for all the friends that I've made over the years, and wish them well. Such a collection of brilliant Java programmers has rarely been assembled, and I am sure that they are all going to move on to much better and brighter prospects.

And I feel bad for the Derby project, which was, at one time, a poster child for what an open source project could be, and for what the open source development process could produce, but is now a codebase whose future, frankly, must be considered to be in doubt.

Personally, I continue to enjoy working with the Derby codebase, and it is a professional interest of mine, so I hope to remain involved with the project as long as Apache will allow it to continue.

I'm not sure why I felt the need to post this, but I didn't want Derby to just quietly fade away without somebody taking a minute to salute it, and praise it, and record what was, what is, and (perhaps) what will be.

To close, let me share what is (I think) the last picture of the remaining Derby development team, taken last fall, just around the time that the people in question were learning the fate that their corporate masters had in mind for the work they devoted the greater portion of their professional lives to.

Tuesday, February 24, 2015

What I'm reading, late February edition

Working hard, reading a lot.

  • 13th USENIX Conference on File and Storage Technologies
    The full Proceedings published by USENIX for the conference are available for download below. Individual papers can also be downloaded from the presentation page. Copyright to the individual works is retained by the author[s].
  • http2 explained
    http2 explained describes the protocol HTTP/2 at a technical and protocol level. Background, the protocol, the implementations and the future.
  • You Had One Job, Lenovo
    When Lenovo preinstalled Superfish adware on its laptops, it betrayed its customers and sold out their security. It did it for no good reason, and it may not even have known what it was doing. I’m not sure which is scarier.
  • Lenovo PCs ship with man-in-the-middle adware that breaks HTTPS connections
    It installs a self-signed root HTTPS certificate that can intercept encrypted traffic for every website a user visits. When a user visits an HTTPS site, the site certificate is signed and controlled by Superfish and falsely represents itself as the official website certificate.
  • Superfish, Komodia, PrivDog vulnerability test
    Check the box below. If you see a "YES", you have a problem.
  • Extracting the SuperFish certificate
    I extracted the certificate from the SuperFish adware and cracked the password ("komodia") that encrypted it. I discuss how down below.
  • Exploiting the Superfish certificate
    As discussed in my previous blog post, it took about 3 hours to reverse engineer the Lenovo/Superfish certificate and crack the password. In this blog post, I describe how I used that certificate in order to pwn victims using a rogue WiFi hotspot.
  • How to target XP with VC2012 or VC2013 and continue to use the Windows 8.x SDK
    One of the limitations of the Microsoft provided solution for targeting XP while using Visual Studio 2012 (Update 1 and above), or Visual Studio 2013, is that you must use a special “Platform toolset” in project properties that forces usage of the Windows SDK 7.1 (instead of Windows 8.x SDK which is the default). The other function the platform toolset provides is that it sets the Linker’s “Minimum Required Version” setting to 5.01 (instead of 6 which is the default). But that function can just as easily be done manually by setting it in project properties.
  • There are too many shiny objects and it is killing me
    The rest of the day is then used renting a VPS server, installing Linux (for the cool-factor) and going through the mandatory list of essential stuff I need, like version managers, package managers, vim bundles, custom prompts, terminal colors, and so forth. Somewhere along the way I get sidetracked and I dump the Linux installation and install Windows Server.
  • Shipping Culture Is Hurting Us
    Quickly getting something in front of the people that will actually use it is a great idea. It means you waste less time building something they don’t actually want. But I look around the industry today and I get worried. Don’t get me wrong – I see brilliant people shipping brilliant, innovative software. But I also see a lot of us using half-baked technologies to shove half-assed software out the door.
  • Programming Achievements: How to Level Up as a Developer
    We've all had specific experiences that clearly advanced our skills as developers. We've learned a new language that exposed us to a new way of thinking. Or we crafted the perfect design, only to watch it unveil its gross imperfections in the harsh realities of a production environment. And we became better programmers because of it. Some experiences equip you with new techniques. Others expose you to anti-patterns...and allow you to understand why they are anti-patterns. It's these experiences that teach you, that influence your thought process, that influence your approach to problems, that improve your designs.
    Musicians get better by practice and tackling harder and harder pieces, not by switching instruments or genres, nor by learning more and varied easy pieces. Ditto almost every other specialty inhabited by experts or masters.
  • An Ideal Conversation
    This is an article about basic conversation mechanics. It’s not about what motivates the person sitting across from you, it’s about some of the quirks you’ll encounter as the conversation occurs.
  • Procedural City Generation
    Three layers of simplex noise were combined to define the population density map. The resulting map has two purposes. One is to guide the forward extension of existing road segments; if a random deviation will reach a higher population than extending the original segment straight ahead, the extension will match that deviation. The second purpose of the population map is to determine when normal road segments should branch off from a highway - when the population along the highway meets a defined threshold.
  • MAS.S66: Indistinguishable From…Magic as Interface, Technology, and Tradition
    With a focus on the creation of functional prototypes and practicing real magical crafts, this class combines theatrical illusion, game design, sleight of hand, machine learning, camouflage, and neuroscience to explore how ideas from ancient magic and modern stage illusion can inform cutting edge technology.

    Guest lecturers and representatives of Member companies will contribute to select project critiques. Requires regular reading, discussion, practicing magic tricks, design exercises, a midterm project and final project.

  • Iseb: The Worm Maiden
    If you are a true tunnel fan, maybe a true fanatic, then when you hear about boring machines (the mothers of all tunnels) it makes you want to see them. Good. So that’s what happened to us. Like grimy servants we followed every new trace that could lead us to her; the aim of our two-year quest was always to see the toughest of all the machines. A dormant juggernaut that lies underground.
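
The Procedural City Generation excerpt describes layering octaves of simplex noise into a population-density map that then drives road branching. As a rough illustration of that layering idea (my own sketch, with a cheap hash-based value noise standing in for real simplex noise, which would normally come from a library):

```python
import math

def value_noise(x, y, seed=0):
    """Cheap hash-based stand-in for simplex noise: a deterministic
    pseudo-random value in [0, 1) smoothly interpolated over a grid."""
    def hash01(ix, iy):
        h = math.sin(ix * 127.1 + iy * 311.7 + seed * 74.7) * 43758.5453
        return h - math.floor(h)
    ix, iy = math.floor(x), math.floor(y)
    fx, fy = x - ix, y - iy
    # smoothstep weights for interpolating the four surrounding corners
    sx, sy = fx * fx * (3 - 2 * fx), fy * fy * (3 - 2 * fy)
    top = hash01(ix, iy) + sx * (hash01(ix + 1, iy) - hash01(ix, iy))
    bot = hash01(ix, iy + 1) + sx * (hash01(ix + 1, iy + 1) - hash01(ix, iy + 1))
    return top + sy * (bot - top)

def population(x, y):
    """Three octaves of noise summed with falling weights, like the
    article's population-density map; normalized back to [0, 1]."""
    octaves = [(1.0, 1.0), (0.5, 2.0), (0.25, 4.0)]  # (weight, frequency)
    total = sum(w * value_noise(x * f, y * f, seed=i)
                for i, (w, f) in enumerate(octaves))
    return total / sum(w for w, _ in octaves)

def should_branch(x, y, threshold=0.7):
    """Branch a road segment off a highway when the local population
    density meets a defined threshold, as the article describes."""
    return population(x, y) >= threshold
```

The threshold value here is invented for illustration; the article tunes its own threshold for when normal road segments branch off a highway.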

Monday, February 23, 2015

Five Thoughts on Software Testing

I felt like sharing some thoughts on testing, not necessarily related to any particular incident, just some things that were on my mind.

  • The easiest time to add the tests is now.

    It always seems to be the case that the time to write the tests is sometime later.

    • "I'm pretty much done; I just have to write some tests."
    • "If I write a lot of tests now, I'll just end up changing them all later."
    • "We're still in the design phase; how can we write any tests now?"

    I'm here to tell you that you can, and should, write those tests now. The main reason is that the sooner you write tests, the sooner you can start running those tests, and thus the sooner you will start benefiting from your tests.

    And I often write my tests when I'm still in the design phase; let me explain what I mean by that. While I'm noodling along, thinking about the problem at hand, toying with different ways to solve it, starting to sketch out the framework of the code, I keep a pad of paper and a pen at hand.

    Each time I think of an interesting situation that the code will have to handle, I have trained myself to immediately make a note of that on my "tests" pad.

    And as I'm coding, as I'm writing flow-of-control code, like if tests, while loops, etc., I note down additional details, so that I'm keeping track of things that I'll need to test (both sides of an if, variables that will affect my loop counters, error conditions I'll want to provoke in my testing).

    And, lastly, because I'm thinking about the tests while I design and write the code, I'm also remembering to build in testability, making sure that I have adequate scaffolding around the code to enable me to stimulate the test conditions of interest.

  • Tests have to be designed, implemented, and maintained.

    Tests aren't just written once. (Well, good ones aren't.) Rather, tests are written, run, revised, augmented, adjusted, over and over and over, almost as frequently as the code itself is modified.

    All those habits that you've built around your coding, like:

    • document what you're doing, choose good variable names, favor understandable code whenever possible
    • modularize your work, don't repeat yourself, design for change
    All of those same considerations apply to your tests.

    Don't be afraid to ask for a design review for your tests.

    And when you see a test with a problem, take the time to refactor it and improve it.

  • Tests make your other tools more valuable.

    Having a rich and complete set of tests brings all sorts of payback indirectly.

    You can, of course, run your tests on dozens of different platforms, under dozens of different configurations.

    But there are much more interesting ways to run your tests.

    • Run your tests with a code-coverage tool, to see what parts of your codebase are not being exercised.
    • Run your tests with analyzers like Valgrind or the Windows Application Verifier.
    Dynamic analysis tools like Valgrind are incredibly powerful, but they become even more powerful when you have an extensive set of tests. You can start to think of each test that you write as actually enabling multiple tests: your test itself, your test on different platforms and configurations, your test under Valgrind's leak detector, your test under Valgrind's buffer overrun detector, etc.
  • Keep historical records about running your tests

    As you're setting up your CI system to execute your test bed, ensure that you arrange to keep historical records about running your tests.

    At a minimum, try to record which tests failed, on what platforms and configurations, on what dates.

    A better historical record will preserve the output from the failed tests.

    A still better historical record will also record at least some information about the successful tests, too. Most test execution software (e.g., JUnit) can produce a simple output report which lists the tests that were run, how long each test took, and whether or not it succeeded. These textual reports, whether in HTML or raw text format, are generally not large (and even if they are, they compress really well), so you can easily keep the records of many thousands of test runs.

    Over time, you'll discover all sorts of uses for the historical records of your test runs:

    • Looking for patterns in tests that only fail intermittently
    • Detecting precisely when a regression was introduced, so you can tie that back to specific changes in the SCM database and quickly find and repair the cause
    • Watching for performance regressions by tracking the performance of certain tests designed to reveal performance behaviors
    • Monitoring the overall growth of your test base, and relating that to the overall growth of your code base
  • You still need professional testers.

    All too often, I see people try to treat automated regression testing as an "either-or" choice versus having professional test engineers as part of their staff.

    You need both.

    The real tradeoff is this: by investing in automated regression testing, by having your developers cultivate the habit and discipline of always writing and running thorough basic functional and regression tests, you free up the resources of your professional test engineers to do the really hard stuff:

    • Identifying, isolating, and reproducing those extremely tough bug reports from the field.
    • Building custom test harnesses to enable cross-platform interoperability testing, upgrade and downgrade testing, multi-machine distributed system testing, fault injection testing, etc.
    • Exploratory testing
    • Usability testing
    All of those topics that never seem to find enough time are within your reach, if you just build a firm and solid foundation that enables you to reach out for the wonderful.
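
To make the record-keeping point concrete, here is a minimal sketch (all names invented) of the sort of historical store I have in mind: record each test's outcome per run, then mine the history for intermittent failures and for the run that first exposed a regression.

```python
from collections import defaultdict

class TestHistory:
    """Minimal in-memory store of test outcomes across runs.
    A real system would persist this (e.g., by archiving the JUnit-style
    reports mentioned above) rather than hold it in memory."""

    def __init__(self):
        # test name -> list of (run_id, passed) tuples, in run order
        self._results = defaultdict(list)

    def record(self, run_id, name, passed):
        self._results[name].append((run_id, passed))

    def intermittent(self):
        """Tests that have both passed and failed: prime suspects when
        looking for patterns in tests that only fail intermittently."""
        flaky = []
        for name, results in self._results.items():
            outcomes = {passed for _, passed in results}
            if outcomes == {True, False}:
                flaky.append(name)
        return sorted(flaky)

    def first_failure(self, name):
        """The run in which a test first failed: a starting point for
        tying a regression back to specific changes in the SCM database."""
        for run_id, passed in self._results[name]:
            if not passed:
                return run_id
        return None

# Example: three nightly runs of two tests.
h = TestHistory()
for run in ("2015-02-21", "2015-02-22", "2015-02-23"):
    h.record(run, "test_insert", True)
h.record("2015-02-21", "test_join", True)
h.record("2015-02-22", "test_join", False)
h.record("2015-02-23", "test_join", True)
```

Even a store this simple answers two of the questions above: which tests are flaky, and on which date a regression first appeared.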

Oh, and by the way: Leadership is not the problem!

Saturday, February 21, 2015

It has that name for a reason

Over the last six weeks or so, we've been bothered by a smell, an odor, in our pantry.

Several times, we've hunted through it, digging around, trying to figure out what was causing it.

The leading candidates were the two packages of salmon-liver and bison dog training treats.

Indeed, they are detectable even to our poor human noses, and of course are many hundreds of times more interesting to our dear old black Lab.

But we sealed them up in ziplocs and moved them elsewhere, and they were not the issue.

It seemed rather sulfurous, so much so that we even called the Gas Company and asked them if we somehow had a leak, though it was nowhere close to the supply lines.

Dutifully, they sent a nice man. He agreed that there was a definite smell, and that it was not that far from the mercaptan odorant that they deliberately add to natural gas for exactly this reason.

But it was not a gas leak.

Infuriated, I finally unloaded the entire pantry into our living room, spreading things out everywhere (it's rather a large pantry).

And, just as I thought might happen, suddenly Donna exclaimed:

Oh, Ugh! Yes, this is it!

And once we both saw the culprit, I instantly understood why.

Let's pick up the story from the delightful article in a 2009 issue of Saudi Aramco World: "Devil's Dung": The World's Smelliest Spice.

I bought a fist-sized lump of brown-gray resin. Slightly sticky to the touch, it was as dense as a block of wood. Mostly, though, it was remarkable for its terrible, aggressive smell—a sulfurous blend of manure and overcooked cabbage, all with the nose-wrinkling pungency of a summer dumpster. The stench leached into everything nearby, too, which meant I had to double-wrap it and seal it in a plastic tub if I wanted to keep it in the kitchen.

About six months ago, we were trying to cook some recipe, and it called for Asafoetida.

Which we didn't have.

So, we did without (and the recipe was fine).

But then we happened to be in an Indian grocery sometime around the holidays, and I pointed out a jar on the shelf: "Look, dear, they have asafoetida! Shall we buy some, so in case we ever cook a recipe which has it, we'll have it at hand?"

Now, I'm not so sure that was a wise idea.

But at least the Great Mystery of the Pantry Odor is solved.

Americanah: A very short review

I happened to be taking several cross-country plane trips recently, and I brought along Chimamanda Ngozi Adichie's Americanah.

It was a perfect book for reading in wild intense bursts while confined to a too-small seat for too many hours.

Adichie writes superbly: reading her book is both effortless and enthralling. Events feel real and immediate; the characters seem as though they are speaking directly to you. Americanah gives you that wonderful sense that somehow you are sitting on the shoulder of the protagonist, like Jiminy Cricket to Pinocchio, seeing, hearing, touching, even thinking everything right along with Our Hero.

Now, I should say: this is a book about what it's like to be a completely different person than I am. So while I really appreciated Adichie's sharing those emotions and experiences with me, I am (I hope) humble enough to understand that the pain and the sorrow and the trauma that she discusses will never be even the remotest part of my life. At times it makes me uncomfortable; perhaps she is aware of that; I don't think she means to cause that discomfort, at least not directly, but I suspect she would be satisfied to know that it does in fact result.

I wasn't even the slightest bit disappointed in Americanah. I hope it finds many readers; I hope she finds many readers; I hope she writes many more wonderful books.