Journal of a Programmer: February 2017

Tuesday, February 28, 2017

The Lonely Planet Travel Anthology: a very short review

The Lonely Planet Travel Anthology is, probably, just what you think it is: under the Lonely Planet brand name, the fairly-well-known travel editor Don George gathered short essays from a significant collection of other quite-well-known authors, and published them as a book.

All of the essays, unsurprisingly, take "travel" as a theme, and do so fairly faithfully.

None of the essays was a poor effort, or was a waste of time to read. Some of the essays are a little forced, but most are quite good, and a few are superb.

In particular, Rebecca Dinerstein's Small Lights in Large Darkness, about her mid-winter visit to far-northern Norway, is nearly perfect, every word a carefully-selected gem. Here, she visits the local graveyard:

Like the air, these lives had frozen their impurities away, had been preserved, had lasted. The trees rising behind the graveyard were pink because it was midday and the sun was both rising and setting. The snow was blue. As Mr Lockwood does at the conclusion of Wuthering Heights, I 'wondered how anyone could ever imagine unquiet slumbers for the sleepers in that quiet earth.' My friend wished her mormor (mother-mother, a perfectly simple construction for 'grandmother') a God Jul, a Merry Christmas.

Other stories take their delight from the plot, if not so much from the elegance of the prose. There are adventure stories, romance stories, metaphysical awakenings, and ruminations upon life and death. Our narrators encounter wild animals, struggle with language barriers, eat all sorts of different foods, confront social and cultural differences, and bring it all wonderfully to life.

Most of the stories are uplifting or at least reflective; a few are flat-out heart-breakingly tragic.

On the reflective side, I was particularly taken by Jane Hamilton's observations about the "validity" of a travel experience:

It took years, too, for the sorcery of the warden to wear off, to come to the the conclusion that no one owns the raw material of experience. It is now dangerously politically incorrect to say so but here it is: I don't believe you have to suffer in the particular way of someone else's interest group or tribe to understand suffering. The warden of Raasay, that variety of wicked queen, c'est moi. Belle Stewart, weary old singer with visions, check.
As for the story of the Dane -- that's certainly my material, and therefore it not only has a keen specificity, it now has scope: I can see how the event radiated into my future. In truth, however, it could be anyone's material. My story is merely a shape, a structure which can hold infinite variety.

Although not infinite, The Lonely Planet Travel Anthology has variety indeed, and nearly all of it worth your time.

If you should happen upon this small volume at some point, give it a try. I believe you'll enjoy it.

Monday, February 27, 2017

Five nines

In the world of cloud computing (which has recently become a big part of my professional life), the holy grail is to achieve "five nines".

Five nines has a variety of definitions, for example "no more than 1 failure per 10,000 requests", or "99.999% availability" (do you see the "five nines" in that second formulation?).

Achieving this level of availability in your service is UNBELIEVABLY hard; only a tiny handful of organizations in existence today can even aspire to that level of service.

But how do you know if you've achieved it?

Well, here's one way to know: Spanner, TrueTime and the CAP Theorem

For locking and consistent read/write operations, modern geographically distributed Chubby cells provide an average availability of 99.99958% (for 30s+ outages) due to various network, architectural and operational improvements. Starting in 2009, due to “excess” availability, Chubby’s Site Reliability Engineers (SREs) started forcing periodic outages to ensure we continue to understand dependencies and the impact of Chubby failures.

The above paragraph may have seemed like gobbledy-gook, so let me try to rephrase it:

Some of Google's internal services are now SO reliable that Google actually intentionally crashes them once in a while, just to let their engineers practice handling those failures.

"Excess availability." That's a problem that you don't ever hear about. Amazing.

Sunday, February 26, 2017

Bad apples

There's a lot of stuff to think about, recently. A SHA-1 collision was displayed; a bad security bug was uncovered; and of course there's the weather.

But what's really been on my mind recently is working, as is perhaps to be expected since I just changed jobs.

It's not easy for me to select a place of employment. I spend an enormous amount of time at my office, and the people I spend my time with are among the best friends I've made. I've been lucky to have had a career, and to have had a certain amount of lucky opportunities in that career, which have allowed me to have, by and large, an extremely enjoyable professional life.

Some people, however, are not so lucky. For example, Susan Fowler: Reflecting On One Very, Very Strange Year At Uber

Women were transferring out of the organization, and those who couldn't transfer were quitting or preparing to quit. There were two major reasons for this: there was the organizational chaos, and there was also the sexism within the organization.

Mike Isaac in the New York Times presents some evidence that this was not an isolated incident: Inside Uber’s Aggressive, Unrestrained Workplace Culture

Interviews with more than 30 current and former Uber employees, as well as reviews of internal emails, chat logs and tape-recorded meetings, paint a picture of an often unrestrained workplace culture. Among the most egregious accusations from employees, who either witnessed or were subject to incidents and who asked to remain anonymous because of confidentiality agreements and fear of retaliation: One Uber manager groped female co-workers’ breasts at a company retreat in Las Vegas. A director shouted a homophobic slur at a subordinate during a heated confrontation in a meeting. Another manager threatened to beat an underperforming employee’s head in with a baseball bat.

The two articles paint a truly dreadful picture.

Other writers present a more nuanced view, but still, there is clearly a severe "Brogrammer" problem going on here.

And it remains a rampant problem throughout the industry.

So, for now, let's stop talking (just) about Uber.

And let's have that broader discussion.

For my contribution, I'm going to suggest that a large part of the problem is: how we hire.

Americans are awful at figuring out how to decide who gets what job. In particular, we unconsciously (or perhaps even consciously) discriminate against old people and against people with unrelated and irrelevant other issues.

That's just a general problem, and is very hard to fix.

A much more specific, and much more fixable problem, plagues the software industry. For lack of a better term, I'll call it: "Google does it, so it must be good."

By which I mean that company after company models their hiring process on Google's.

Google are well-known for their extremely rigorous, and yet fundamentally bizarre, approach to hiring. Now, this approach seems to mostly work for Google. In service of their desire to be objective, and scientific, and precise, Google have worked very hard to make hiring as cut-and-dried and impersonal as possible.

But many other high tech companies, in their desire to be the next Google, parrot Google's every behavior, but they don't understand WHY Google do things the way they do, so the result is disasters like this: My Interview at Uber

My next interview was in yet another room. My recruiter once again came back to grab me and walked me to the new location. On the way I got to see a bunch of young Uber employees hanging out/working in various nooks and spaces that were designed to look like a mix between a Club and Google office, but were clearly neither.
Next interview consisted of two engineers (a guy and a girl, both young and smart) asking me some basic front end questions. By that point, I had a pretty good feeling that Uber was not for me, and I probably was not for Uber, but we went through the motion. The guy interviewer was so tired from “staying up last night working” that he drank two energy drinks during our interview and forgot his laptop when he left.
My next interview was in the same room, with two more engineers – an Uber old-timer and a recently hired engineer. The old-timer guy seemed completely checked out and the newer girl seemed not fully engaged. We talked for about 40 minutes about some random stuff and they passed me onto my recruiter.

This, in my opinion, is exactly where the problem resides: YOU DON'T WANT HIRING TO BE IMPERSONAL!

Software engineering is a very social, very human, very interactive occupation. A bunch of extremely smart people sit around together, and talk, and look at problems from different perspectives, and think about ways those problems could be solved, and discuss what might be good or bad about those various approaches, and after it's all said and done some software emerges.

Sometimes the software doesn't even emerge; software engineering is just that hard.

But here's the thing: software engineering is much less about coding, and much more about communication, then people think.

I'm not saying that coding is easy; it's an extremely hard thing, and you want to find people who are great coders.

But it's even MORE important that they are great PEOPLE.

If you try to take the "human" out of these "human resources", you get the Google Interview Process, with its detached, anonymous, and dispassionate panels of interviewers-in-lab-coats, and what results is this: Why I Don't Talk to Google Recruiters

I wanted to be interviewed by the person who really needed me: my future boss. That person will understand my profile and won't ask pointless questions about algorithms, simply because he or she will know what my duties will be and what kind of problems I will be capable of solving, if they hired me.

This issue goes both ways: not only is it critical for your company that you involve the people who will be your co-workers in the selection process, it's critical for the interviewee, as well. Why is that? Well, the one question that absolutely MUST be answered during the interview process, and which the Google process goes to extreme lengths to AVOID answering, is this:

Who are these people who I would be working with? Are they good people, or are they jerks? Can I communicate with them? Do they drive me crazy? Can I envision spending hours and days and weeks and months and years of my life sitting in meeting rooms with them, talking about anything and everything?

You can include this in your interview process, and yet still be very rigorous and keep very high standards. For example, at my new job, the interview process was long and involved and highly-structured; it included a multi-hour programming test, as well as several days of lengthy in-person and remote interviews with various different team members. I talked with my manager, my manager's manager, various team leads of related teams, as well as nearly every member of the team I actually joined. We spent hours sitting in conference rooms together, discussing arcane details of software engineering, alternating back-and-forth between them grilling me and me grilling them.

But there were two very important differences between this approach, and the Google approach, and these two differences were actually quite closely related:

Firstly, everybody I talked with (save a few corporate administrative staff) was directly involved with the team I was being considered for, and with the project I was a candidate to join.
That meant that they cared deeply about making sure that I was the "right" person for that role, and, conversely, I cared deeply about making sure that they were the "right" people for me to work with.
Secondly, it turns out that my new company has an fundamental rule that actually outranks the criteria of selecting for technical excellence and software aptitude.
Their number one hiring criterion is: "no bad apples" (well, internally they use a rather coarser term, but the idea is the same).

That is, you want the best employees, but "best" is a complicated thing.

And, clearly, the high tech industry as a whole is not getting it right.

Friday, February 24, 2017

Little oases in this frenzied modern life

Quite often, it seems, I have enough excitement, activity, and worry during my daylight hours, so when I find a few minutes in the evening to sit down and watch TV, I increasingly find myself drawn to search for something calm and relaxing.

If this describes you as well, let me make three recommendations:

The Detectorists
The Detectorists is the story of Andy and Lans, two friends who live in a small rural English town, and who have a somewhat unusual hobby. They are members of a Detectorists's Club, a society of like-minded individuals who enjoy taking long walks in the meadows and pastures with their metal detectors, searching (ever searching) for a long-lost Roman tomb, a Viking settlement, or perhaps a World War site.
But really, it is just about Andy and Lans, who prove to be incredibly likely bumblers.
And, not to be undersold, the music is by Johnny Flynn, one of the best new musicians I know of.
Mozart in the Jungle
Adapted from a book which I haven't read, Mozart in the Jungle is all about a rather on-the-rocks symphony orchestra in New York City.
Narrated by the young, aspiring backup oboe player, and featuring an absolutely mesmerizing performance by Gael Garcia Bernal as the orchestra's conductor, Mozart in the Jungle is really a "character actor" show, with literally dozens of delightful kooks who swim in and out of the life of the orchestra. Simultaneously heart-warming, philosophical, and hilarious, Mozart in the Jungle never lets you down.
And, of course, it, too, has great music.
Chef's Table
Chef's Table is a documentary series, with a brilliant concept. Each episode is separate and stands alone, and each episode tells the story of one world-famous chef or another.
Beautifully filmed, beautifully edited, beautifully paced, these shows take you deep inside the mind of what it means to be a top-level, world-class chef.
I think that any creative activity is fascinating, each in its own way. Writing software is a deeply creative act, but so, too, is being a chef. Sitting and listening to each of these chefs talk about how they approach their art, what stumbles and failures they've had, how they dealt with them, who mentored them along the way, and what turned out to really matter to them in the end, is just unbelievably compelling. I could sit for hours (actually, I HAVE sat for hours) listening to these stories.

Saturday, February 18, 2017

The Lost City of Z: a very short review

Several years ago, I picked up David Grann's The Lost City of Z: A Tale of Deadly Obsession in the Amazon.

I think I was looking for some summer vacation reading, or a book to take backpacking. As I recall, the book was "too heavy" to take on my backpacking trip: it's 400 pages, and printed on nice quality paper, so it actually weighs a full pound (14.4 ounces according to the publisher). And then it somehow slipped down on the stack and I forgot about it.

But it floated back to the top of the stack the other day, and I finally read it. Just in time, for I see they've now made a movie based on the book.

No surprise there; this is a book which could make quite a good movie, I think.

At times, I had to remind myself that everything in this story is true: a movie star really was abducted by Indians; there were cannibals, ruins, secret maps, and spies; explorers died from starvation, disease, attacks by wild animals, and poisonous arrows; and at stake amid the adventure and death was the very understanding of the Americas before Christopher Columbus came ashore in the New World.

The Lost City of Z is several books at once:

It's the story of Percy Fawcett:
He was the last of the great Victorian explorers who ventured into uncharted realms with little more than a machete, a compass, and an almost divine sense of purpose. For nearly two decades, stories of his adventures had captivated the public's imagination: how he had survived in the South American wilderness without contact with the outside world; how he was ambushed by hostile tribesmen, many of whom had never before seen a white man; how he battled piranhas, electric eels, jaguars, crocodiles, vampire bats, and anacondas, including one that almost crushed him; and how he emerged with maps of regions from which no previous expedition had returned.
It's the story of Grann's own obsession with learning about Fawcett and how it seemed to bring out a different person in the author:
Let me be clear: I am not an explorer or an adventurer. I don't climb mountains or hunt. I don't even like to camp.
...
But when I'm working on a story, things are different. Ever since I was young, I've been drawn to mystery and adventure tales, ones that had what Rider Haggard called "the grip."
...
While most of my articles seem unrelated, they typically have one common thread: obsession. They are about ordinary people driven to do extraordinary things -- things that most of us would never dare -- who get some germ of an idea in their heads that metastasizes until it consumes them.
And it is the story of the upper Amazon region itself, and of the people who live there:
The region has generally been regarded as a primeval wilderness, a place in which there are, as Thomas Hobbes described the state of nature, "no Arts; no Letters; no Society; and which is worst of all, continuall feare, and danger of violent death." The Amazon's merciless conditions have fueled one of the most enduring theories of human development: environmental determinism. According this theory, even if some early humans eked out an existence in the harshest conditions on the planet, they rarely advanced beyond a few primitive tribes. Society, in other words, is a captive of geography.

It's that third story that I found the most interesting.

Fawcett himself, as it turns out, is rather an unpleasant person, and frankly I wasn't terribly interested in learning more about him.

Grann's modern journey of self-exploration and his own trip into the Amazon jungle is interesting enough, and I enjoyed reading about his own adventures.

But what was really enjoyable about the book was what Grann found there.

In the cave and at a nearby riverbank settlement, Roosevelt made another astonishing discovery: seventy-five-hundred-year-old pottery, which predates by more than two thousand years the earliest pottery found in the Andes or Mesoamerica. This means that the Amazon may have been the earliest ceramic-producing region in all the Americas, and that, as Fawcett radically argued, the region was possibly even a wellspring of civilization throughout South America -- that an advanced culture had spread outward, rather than vice versa.
Using aerial photography and satellite imaging, scientists have also begun to find enormous man-made earth mounds often connected by causeways across the Amazon -- in particular in the Bolivian floodplains where Fawcett first found his shards of pottery.
...
Heckenberger told me that scientists were just beginning the process of understanding this ancient world -- and, like the theory of who first populated the Americas, all the traditional paradigms had to be reevaluated.
...
"These people had a cultural aesthetic of monumentality," he said. "They liked to have beautiful roads and plazas and bridges. Their monuments were not pyramids, which is why they were so hard to find; they were horizontal features. But they're no less extraordinary."
...
"Anthropologists," Heckenberger said, "made the mistake of coming into the Amazon in the twentieth century and seeing only small tribes and saying, 'Well, that's all there is.' The Problem is that, by then, many Indian populations had already been wiped out by what was essentially a holocaust from European contact. That's why the first Europeans in the Amazon described such massive settlements that, later, no one could ever find."

The Lost City of Z is a great read; I'm glad it finally floated back up to the top of my pile.

I will probably never make it to the upper Amazon myself, but I'm quite glad that David Grann took me there, and that he took the time to open his own eyes to what really happened, and is happening, there.

Friday, February 17, 2017

Perhaps the forecast is wrong ...

Right now, they are forecasting FOUR INCHES of rain in the 36-hour period starting Sunday morning and continuing through Monday.

My Mediterranean climate has transmogrified into a rain-forest.

When the rain falls, and the snow melts, ...

... all the water west of the Sierras has only one place to go:

From: BayAlerts
Subject: Storm Debris Affects Ferry Service

A message from HARBOR BAY/SAN FRANCISCO

Recent storms have washed an unprecedented amount of debris into the San Francisco Bay. On 4 occasions, debris has damaged vessel propellers, resulting in delays and service cancelations. And, on numerous occasions, debris has been sucked into waterjet propulsion systems, requiring crews to clear the system before proceeding. We are doing our best to steer around major obstacles while adhering to published schedules. We thank you for your patience and understanding.

Wednesday, February 15, 2017

Up, up and away! But, carefully, please!

Not two hours earlier, I was walking along the sidewalk in front of this construction site...

Concrete wall threatening to fall from SF high-rise stabilized

At least 16 office buildings in the South of Market neighborhood of San Francisco were evacuated Wednesday afternoon as a malfunctioning crane in a 35-story skyscraper under construction threatened to topple a 2,000-pound wall of concrete, officials said.
People were evacuated from seven buildings on Howard Street, two on Tehama Street and one on Second Street shortly before 3 p.m. as a crane began leaning precariously into a concrete wall, putting the wall at danger of falling, said Lt. Jonathan Baxter, a spokesman for the San Francisco Fire Department.
The crane was located inside the 30th floor of 33 Tehama Street, between First and Second streets.
“If the crane falls, it could take the concrete wall down,” said Baxter. “The worst case scenario is we’re going to have some structural damage to one or more buildings below.”
He said another concern is that parts of the slab could ricochet and hit surrounding buildings.

Beware of walls of concrete which ricochet into surrounding buildings; yikes!

Sunday, February 12, 2017

Oroville Dam situation

Oh dear, this could be REALLY bad: BREAKING: Marysville, Yuba County evacuated as Oroville spillway collapse feared

Releases through the main spillway at Oroville Dam have been boosted to 100,000 cubic feet per second from 55,000 cfs in hopes of easing pressure on the emergency spillway before a failure occurs, officials said Sunday night.
Kevin Dossey, a Department of Water Resources engineer and spokesman said “it might help” to alleviate the pressure.
So far, Dossey said, the emergency spillway’s concrete lip at the top has not crumbled, although the hillside had “eroded to within several feet” of the big concrete structure.
Marysville and Yuba County ordered evacuated; officials unsure how Sacramento could be hit.
When asked how much water could be released should the spillway collapse, DWR spokesman Chris Orrock said, “It’s uncontrolled. It’s uncontrolled.”

Let's keep our fingers crossed for a little bit of luck, to help all those hard-working souls...

Saturday, February 11, 2017

Read, read, read, read, ...

... learn, learn, learn, learn

The Security Impact of HTTPS Interception
As a class, interception products drastically reduce connection security. Most concerningly, 62% of traffic that traverses a network middlebox has reduced security and 58% of middlebox connections have severe vulnerabilities. We investigated popular antivirus and corporate proxies, finding that nearly all reduce connection security and that many introduce vulnerabilities (e.g., fail to validate certificates). While the security community has long known that security products intercept connections, we have largely ignored the issue, believing that only a small fraction of connections are affected. However, we find that interception has become startlingly widespread and with worrying consequences.
Facebook is terrifying
Now imagine you take a selfie in a crowded place. Like an airport or a train station. There are some people walking on the background. Hundreds of them. Some of them facing the camera. Guess what: the Facebook’s AI has just spotted them.
Even if you’re extremely cautious, even if you never post anything on Facebook, even if you have “location services” disabled on your phone at all times etc. etc. Facebook still knows where you are. You can’t stop other people from taking selfies in an airport.
On Deniability and Duress
Deniable schemes let you lie about whether you’ve provided full access to some or all of the encrypted text. This is important because, currently, you can’t give the guard in the above example a fake password. He’ll try it, get locked out, and then proceed with the flogging.
I’m convinced that there’s a sociotechnical blind spot in how current technology handles access to personal devices. We, in the infosec community, need to start focusing more on allowing users the flexibility to handle situations of duress rather than just access control. Deniability and duress codes can go a long way in helping us get there.
Backblaze Hard Drive Stats for 2016
Backblaze has recorded and saved daily hard drive statistics from the drives in our data centers since April 2013. At the end of 2016 we had 73,653 spinning hard drives. Of that number, there were 1,553 boot drives and 72,100 data drives. This post looks at the hard drive statistics of the data drives we monitor. We’ll first look at the stats for Q4 2016, then present the data for all of 2016, and finish with the lifetime statistics for all of the drives Backblaze has used in our cloud storage data centers since we started keeping track. Along the way we’ll share observations and insights on the data presented.
Building, And Losing, A Career On Facebook
Here's how the money part works: Just like Google and Facebook get paid to post advertisements (in your search and your news feed), Lawler gets paid to posts ads too — in his Facebook page — by a third party, an entity known as an "affiliate link" company. In the complex world of online advertising, these companies are middlemen between big brands like Home Depot and publishers. It's a standard practice for businesses on Facebook to post these advertising links. He'll share a link — it could be for a juice company or a news site — and every time a fan clicks on that link, he gets less than a penny.
But the money adds up. Lawler made anywhere from a couple of hundred dollars a day, to $1,000.
Asking the wrong questions
However, to me the interesting thing is how often the order is wrong. What we now know to be the hard problems were going to be solved decades before what we now know were the easy ones. So it might take until 2020 to 'fax' a newspaper to your home, and automatic wiretapping might be impossible, but automatic doctors, radar implants for the blind, household robots and machine translation would be all done by 1990 and a machine would be passing human IQ tests at genius level by 2000. Meanwhile, there are a few quite important things missing - there is no general-purpose computing, no internet and no mobile phones. There's no prediction for when everyone on earth would have a pocket computer connected to all the world's knowledge (2020-2025). These aren't random gaps - it's not just that they thought X would work and didn't know we'd invent Y. Rather, what's lacking is an understanding of the structural impetus of computing and software as universal platforms that would shape how all of these things would be created. We didn't make a home newspaper facsimile machine - we made computers.
Lessons from Real-Time Programming class
This class has been around since at least the 80’s. Currently Bill Cowan teaches this class, and has been for over 20 years.
The equipment have evolved since then, but the underlying challenges have not.
For example, teamwork, setting priorities, and dealing with real (imperfect) systems.
Cardinality estimation done right: index-based join sampling
The index-based sampling operator can cheaply compute a sample for a join result, but it is not a full solution by itself. We also need a join enumeration strategy which can systematically explore the intermediate results of a query using the sampling operator, while also ensuring that the overall sampling time is limited. If we sampled every possible combination, it would take too long for queries with many joins. In the Join Order Benchmark (JOB), queries with 7 joins have 84-107 intermediate results, and queries with 13 joins have 1,517-2,032. A time limit is set on the sampling phase, after which the algorithm falls back to traditional estimation.
Vim's 25th anniversary and the release of Vim 8
2016 was a big year for project anniversaries. The Linux kernel, of course, turned 25. And Vim, that other iconic text editor, also celebrated its 25th anniversary.
Monitoring and Tuning the Linux Networking Stack: Sending Data
This blog post explains how computers running the Linux kernel send packets, as well as how to monitor and tune each component of the networking stack as packets flow from user programs to network hardware.
This post forms a pair with our previous post Monitoring and Tuning the Linux Networking Stack: Receiving Data.
MIT Lecture: Gödel Escher Bach; an Eternal Golden Braid
MIT Open CourseWare videos investigating Doug Hofstadter's classic book.
Online migrations at scale
Moving millions of objects from one database table to another is difficult, but it’s something that many companies need to do.
There’s a common 4 step dual writing pattern that people often use to do large online migrations like this. Here’s how it works
Coronal Mass Ejections (again)
A study published last month by the Cambridge Centre for Risk Studies estimates that a solar storm would have the potential to wipe between $140 billion to $613 billion off the global economy in a five-year time span, depending on the severity of the impact.
Most of the web really sucks if you have a slow connection
When I was at Google, someone told me a story about a time that “they” completed a big optimization push only to find that measured page load times increased. When they dug into the data, they found that the reason load times had increased was that they got a lot more traffic from Africa after doing the optimizations. The team’s product went from being unusable for people with slow connections to usable, which caused so many users with slow connections to start using the product that load times actually increased.
Disaggregate: Networking recap
At Facebook, we build our data centers with fully open and disaggregated hardware. This allows us to replace the hardware or the software as soon as better technology becomes available. Because of this, we see compute, storage, and networking gains that scale with our business. We spoke about our latest networking hardware and software — including Wedge 100, Backpack, Voyager, FBOSS and OpenBMC — at the event. We also heard from Apstra, Barefoot, Big Switch Networks, Canonical, Cumulus, and SnapRoute, who talked about their solutions and how they fit in with the rapidly growing ecosystem for open networking.
Back-to-Basic Weekend Reading: Monte-Carlo Methods
The probabilistic approach may not result in the perfect result, but it may get you very close, and much faster than deterministic techniques (which may even be computationally impossible).
htop Explained Visually
htop is an interactive process monitor.
Using tmux Properly
What is a terminal multiplexer? A terminal multiplexer is a souped-up terminal. If you used a plain terminal for a few years and then someone said: "What features do you think we should add?", you'd end up with a multiplexer.
Against Storytelling
Here at the slightly pretentious hotel (call it “P”) we went down to breakfast early and got eggs. The salt and pepper shakers were tallish, stainless steel, with little plastic windows to see the spice. Atop each is a plunger—meaning each one is a little grinder. Interesting! Except they didn’t work.
More on GVFS
Looking at the server from the client, it’s just Git. All TFS and Team Services hosted repos are *just* Git repos. Same protocols. Every Git client that I know of in the world works against them. You can choose to use the GVFS client or not. It’s your choice. It’s just Git. If you are happy with your repo performance, don’t use GVFS. If your repo is big and feeling slow, GVFS can save you.
Considerations On Cost Disease
might the increased regulatory complexity happen not through literal regulations, but through fear of lawsuits? That is, might institutions add extra layers of administration and expense not because they’re forced to, but because they fear being sued if they don’t and then something goes wrong?
I see this all the time in medicine. A patient goes to the hospital with a heart attack. While he’s recovering, he tells his doctor that he’s really upset about all of this. Any normal person would say “You had a heart attack, of course you’re upset, get over it.” But if his doctor says this, and then a year later he commits suicide for some unrelated reason, his family can sue the doctor for “not picking up the warning signs” and win several million dollars. So now the doctor consults a psychiatrist, who does an hour-long evaluation, charges the insurance company $500, and determines using her immense clinical expertise that the patient is upset because he just had a heart attack.
Those outside the field have no idea how much of medicine is built on this principle. People often say that the importance of lawsuits to medical cost increases is overrated because malpractice insurance doesn’t cost that much, but the situation above would never look lawsuit-related; the whole thing only works because everyone involved documents it as well-justified psychiatric consult to investigate depression. Apparently some studies suggest this isn’t happening, but all they do is survey doctors, and with all due respect all the doctors I know say the opposite.
A Very Comprehensive Guide to Getting Drunk at Disney World
Animal Kingdom:
Let’s give credit where credit is due: the staff at Animal Kingdom diligently sourced hard-to-find African beers. Unfortunately, most of them are mundane, flavorless lagers, but since you’ll likely not find these brews at home, they serve their purpose.

Your choices at the Dawa Bar, located in the heart of Harambe Village near Kilimajaro Safari, are quite varied. There’s not much on tap, but the bottled beer selections are quite cheap and include standouts like the Hakim Stout and Bedele Pilsner, both procured from Ethiopia. Two solid American craft beers round out the menu, SweetWater IPA and Victory Golden Monkey Tripel Ale, and while I’m uncertain why they are present at an ostensibly African bar, they are tasty nonetheless. If you’re ambitious enough to visit in the morning, order an African Bloody Mary made with a spiced Ethiopian-style berbere sauce.

OK, now it has officially rained more than we can handle

There's trouble all around the state, but this is massive:

Inside the frantic fight to protect Oroville dam, nation's tallest, as spillway rapidly erodes
Officials have stressed that the dam itself suffered no damage and that the spillway problems don’t pose a imminent threat to the public. Still, they have been frantically working to reduce the amount of water in the Lake Oroville reservoir, which is near capacity. It’s now at about 96% of capacity, and more water has been flowing in than is draining out.
Officials say Oroville Dam not compromised but are preparing for the worst
As the concrete from the spillway falls into the Diversion Pool and Lake Oroville continues to fill, the California Department of Water Resources is sending about 35,000 cfs of water continually down the broken spillway.
Some of the water in the Diversion Pool is routed through the Thermalito Forebay and Afterbay where it warms before being used for crops. The rest, about 40,000 cfs as of Thursday, goes directly into the Feather River.
The river flows parallel to Montgomery Street, one of downtown Oroville’s main streets. Right now, the only thing separating the 40,000 cfs of water in the Feather River from flooding the downtown part of the city is about a hundred yards and a levee that holds back the water.
Eric See, a public information officer with the Department of Water Resources, said 150,000 cfs went through the Feather River in Oroville during a storm in 1997.
Use of untested emergency spillway yet again a possibility at crippled Oroville Dam
The Department of Water Resources announced it was dialing back water releases over the battered main spillway by about 15 percent to keep erosion along the side of the spillway from “compromising” the power line towers that fuel the dam’s power plant. That reduced releases to 55,000 cubic feet per second.
With that, the possibility that the reservoir would crest the lip of the emergency spillway was raised anew. The slower releases “may keep the lake level below 901 feet,” the point at which water would start topping the emergency structure, the department announced. However, “there are many variables involved, and the public should not be surprised if some water flows into the emergency spillway.”
Damage to Oroville Dam spillway worsens — could cost $100 million
“We’re going to lose a lot of the spillway,” said Chris Orrock, a spokesman for the California Department of Water Resources, which manages the nation’s tallest dam, about 75 miles north of Sacramento. “The director has said we are willing to lose the bottom of that spillway to make sure we maintain flood control for the downstream communities.”
...
The good news, Orrock said, is that the larger spillway, made of reinforced concrete, was peeling downward and not threatening the integrity of the 770-foot-high dam itself. “If the erosion was moving up toward the dam, they would stop the flow,” he said.
...
The dam’s spillway — and valves in the Edward Hyatt Power Plant at the bottom of the reservoir — were releasing 79,000 cubic feet per second of water Friday, but the flow was reduced overnight. About 130,000 cubic feet per second was flowing into the dam from the surrounding mountains.
If it were not damaged, the spillway could usher out up to 200,000 cubic feet per second of water, though that flow would be too much for the Feather River, which can handle 150,000 cubic feet per second without flooding.
At Oroville Dam, a break in the storms gives engineers hope
The break in storms and a drop in the volume of water pouring into the huge reservoir gave dam operators hope that they could keep lake levels from hitting an elevation of 901 feet — the point at which uncontrolled flows would start washing down an unpaved emergency spillway that has never been used in Oroville’s 48-year history.
“The sun is coming out. The rain has stopped. The inflow has peaked,” said Eric See, a spokesman for the state Department of Water Resources. “We still don’t expect to use the auxiliary spillway.”

Indeed, here in the Bay Area, 250 miles away, the sun is shining and it's a beautiful day, and there's no rain in the forecast.

Well, no rain in the forecast for the next 4 days.

Then the next storm is looming.

It's going to be a very, very, very busy spring for the entire state.

After so many years of below-average rainfall, it's a vivid reminder that California's enormous flood control projects were built for a reason: the peculiar geography of the state positions the entire Central Valley as a giant bathtub, 500 miles long and 60 miles wide, draining a region of over 60,000 square miles.

There's nowhere we can put all that water at this point; all the reservoirs are full.

And the enormous snowpack in the mountains hasn't even begun to melt yet; the real flooding season typically starts in mid-March and runs well through April. We're a full month ahead of schedule; the worst is DEFINITELY yet to come.

So, for the next 3 months at least, as all that snow melts, and runs down from the mountains, it's going to be one flood-control crisis after another.

Hold on for a wild ride.

And best of luck to the hard-working folk at the DWR; pack some extra thermoses of coffee: you're going to be booking a LOT of overtime between now and Memorial Day.

Thursday, February 9, 2017

Hillbilly Elegy: a very short review

During last spring and fall's election campaign, J.D. Vance's Hillbilly Elegy: A Memoir of a Family and Culture in Crisis received a lot of attention.

So, in typical Bryan fashion, I finally got around to reading Vance's book.

Most of Hillbilly Elegy, perhaps three-quarters of it, is indeed as advertised: it is Vance's memoir of his life growing up in a broken household in the hill country of Eastern Kentucky and, later, in Southwestern Ohio.

Is 35 years of age too young to write your own memoir? Possibly.

But Vance is a skilled writer, and he tells an engaging story, and the memoir portion of the book is certainly entertaining, if in that oh-my-I'm-not-sure-I-wanted-to-know-that sort of way.

But, honestly, I rather flew through the memoir portion of the book, because frankly I wasn't all that interested in Vance's crummy relatives or in the awful things they did to each other and to those around them.

If you haven't heard of the horrors of OxyContin abuse, perhaps reading Vance's memoir will inspire you to learn more of this awful situation. But if you're already pretty-well-informed on this issue, you may find this just more of the same.

But the other 25% or so of Vance's book is rather different.

Sprinkled throughout Hillbilly Elegy are sections where Vance pauses, reflects, and genuinely tries to understand why it is that he succeeded in a situation where so many others failed.

The central question, for Vance, is this:

People sometimes ask whether I think there's anything we can do to "solve" the problems of my community. I know what they're looking for: a magical public policy solution or an innovative government program.

This question fascinates Vance (as it does many of use), so he explores many different theories about how to address this problem, and gives them legitimate consideration, and reacts to them honestly and forthrightly.

For instance, he considers the theory "about education in America, which the majority of people rightfully believe is the key to opportunity". As Vance immediately concedes, this theory is "rightfully" to be believed. Yet, in his particular case, it was not the relevant element:

it was striking that in an entire discusssion about why poor kids struggled in school, the emphasis rested entirely on public institutions. As a teacher at my old high school told me recently, "They want us to be shepherd to these kids. But no one wants to talk about the fact that many of them are raised by wolves."
...
What I do know is that I was a sophomore in high school, and I was miserable. The constant moving and fighting, the seemingly endless carousel of new people I had to meet, learn to love, and then forget -- this, and not my subpar public school, was the real barrier to opportunity.

Don't (simply) blame the schools, says Vance.

Of course, Vance ends up attending Yale Law School, so clearly education, in the end, was crucial to his success. (And in his afterword, the first person he thanks is his Yale Law professor, for helping him out time and again.) But he persuasively argues that this was well after-the-fact: by the time he got to Yale Law School, he had dealt with the crippling problems of his youth already.

Vance is equally quick to dismiss the notion that the disparity in results has something to do with race:

Recall that not a single one of my high school classmates attended an Ivy League school. Barack Obama attended two of them and excelled at both.
...
Of course, Obama overcame adversity in his own right -- adversity familiar to many of use -- but that was long before any of us knew him.

So certainly don't try to explain the outcome as a racial issue, says Vance.

But what about talent, then? Perhaps some are just better than others?

Surrounding me was another message: that I and the people like me weren't good enough; that the reason Middletown produced zero Ivy League graduates was some genetic or character defect. I couldn't possibly see how destructive that mentality was until I escaped it. The Marine Corps replaced it with something else, something that loathes excuses.
...
In the Marines, giving it your all was a way of life.
I'm not saying ability doesn't matter. It certainly helps. But there's something powerful about realizing that you've undersold yourself -- that somehow your mind confused lack of effort for inability.

From Vance's perspective, what mattered about his military experiences was not the "armed forces" aspect of it, but rather the "improvement via hard work" aspect of it:

Every time the drill instructor screamed at me and I stood proudly; every time I thought I'd fall behind during a run and kept up; every time I learned to do something I thought impossible, like climb the rope, I came a little closer to believing in myself. Psychologists call it "learned helplessness" when a person believes, as I did during my youth, that the choices I made had no effect on the outcomes in my life. From Middletown's world of small expectations to the constantchaos of our home, life had taught me that I had no control.

So, what is it, then?

After looking here, and looking there, and talking to experts, and reading books, and, most crucially, reflecting long and hard and in great detail about his own life, Vance comes back, again and again, to one simple, yet confoundingly complex, explanation.

Family.

Or, more specifically in his case, his grandmother, "Mamaw", who took him in at a critical period and gave him the structure he needed.

Family, says Vance, is all-too-often the fundamental source of the problem:

Our homes are a chaotic mess. We scream and yell at each other like we're spectators at a football game. At least one member of the family uses drugs -- sometimes the father, sometimes the mother, sometimes both. At especially stressful times, we'll hit and punch each other, all in front of the rest of the family, including young children.

And what does this mean?

We don't study as children, and we don't make our kids study when we're parents. Our kids perform poorly in school. We might get angry with them, but we never give them the tools -- like peace and quiet at home -- to succeed.

Why is this family presence so crucial, in Vance's opinion?

Ask what made a difference in her life, and she'll tell you about the stable family that empowered her and gave her a sense of control over her future.
...
many succumb: to crime or an early death at worst, domestic strife and welfare dependency at best. But other make it.
...
Each benefited from the same types of experiences in one way or another. They had a family member they could count on. And they saw -- from a family friend, an uncle, or a work mentor -- what was available and what was possible.

Role models.

And, most importantly, role models that a child can believe in, and can trust, because they are family members who have overcome obstacles of their own.

One particularly interesting part of Hillbilly Elegy, I thought, was Vance's observation that modern American society struggles to comprehend the crucial importance of the extended family. Talking about a crucial episode in which Vance was nearly prevented from benefiting from the stabilizing and structuring experience of living with his grandmother, Vance observes:

Part of the problem is how state laws define the family. For families like mine -- and for many black and Hispanic families -- grandparents, cousins, aunts, and uncles play an outsize role. Child services often cut them out of the picture, as they did in my case. Some states require occupational licensing for foster parents -- just like nurses and doctors -- even when the would-be foster parent is a grandmother or other close family member. In other words, our country's social services weren't made for hillbilly families, and they often make a bad problem worse.

So, what are we to take from Hillbilly Elegy?

My father put it rather simply, and rather precisely, I think:

I thought the conclusions the author drew are applicable to a broader class, and I thought the extreme behavior of his Kentucky family did not generalize.

I completely agree: Vance's book, though troubling, disturbing, and often infuriating, is tremendously valuable. He has a lot to say, and it's not just about Kentucky Hillbillies. These are not easy problems, and no magic wand can wave them away, but they are widespread, they are important, and they are solvable.

Sometimes, you get to the end of a book, and you aren't quite sure what to make of it, but you hope other people read it, and you hope it matters, and you hope it makes a difference.

I guess that's where I ended up with Hillbilly Elegy.

Tuesday, February 7, 2017

Soggy morning

It's a good thing that the automated notification sent out a message about ferry service issues in this morning's storm:

Date: Tue, 07 Feb 2017 06:23:07 -0800
From: BayAlerts 
Subject: Harbor Bay Ferry Cancelation/westbound

A message from HARBOR BAY/SAN FRANCISCO

Please be advised that the following Harbor Bay ferry =
departure to San Francisco has been canceled: 06:30am due to inclement weather.

But they sure didn't leave much time to spare with that notification.

I guess, with the waves and all, you could certainly term that: "breaking news"

Saturday, February 4, 2017

Big news in the world of source control

Source control software is no longer the center of my professional life.

But you don't just relinquish something that occupied a decade of your mental attention overnight.

So it was that I spent much of the last week obsessed by two significant developments in the SCM industry.

Firstly, there was the widely-reported system outage at major Git hosting provider GitLab.

GitLab has been one of the biggest success stories of the past few years. The company is not even four years old, but already has several hundred full-time employees, thousands of customers, and millions of users. Their growth rate has been astonishing, and they have received quite a bit of attention for their open (and unusual) business organization.

(I know a number of GitLab employees; they're wonderful people.)

Anyway, as befits rather an unusual company, they had a rather unusual outage.

Well, the outage itself was not that interesting.

For reasons that I think are not yet well-understood, the GitLab site came under attack by miscreants:

At 2017/01/31 6pm UTC, we detected that spammers were hammering the database by creating snippets, making it unstable. We then started troubleshooting to understand what the problem was and how to fight it.

Then, while trying to defend against and recover from the attack, a harried and over-tired systems administrator made a very simple fumble-fingered mistake, typing a command interactively at the keyboard which deleted their primary production database (instead of deleting what he thought was the damaged spare standby database):

YP thinks that perhaps pg_basebackup is being super pedantic about there being an empty data directory, decides to remove the directory. After a second or two he notices he ran it on db1.cluster.gitlab.com, instead of db2.cluster.gitlab.com

Ah, yes: an interactive "rm -rf pgdata". Been there, done that.

What was interesting about the outage was the way that this (quite unusual) company responded to the problem in a (quite unusual) way.

Almost as soon as things started to occur, they made the decision to create a public Google Docs document, and they live-streamed the event on their Twitter account, and shared their attempts to control and recover from the event, in real-time, inviting the community and the public at large to contribute and assist and understand what was going wrong and what they were doing to recover. You can read the document, it's fascinating.

Moreover, the incident resulted in a number of thoughtful and considered essays from various people. Because the GitLab incident specifically involved their Postgres database, some of the best analyses came from the Postgres community, such as this one: PG Phriday: Getting Back Up

Unfortunately, scenarios beyond this point is where process breaks down. What happens if we have a replica that falls behind and needs to be rebuilt? For all of its benefits, pg_basebackup still cannot (currently) skip unchanged files, or make small patches where necessary. Relying on it in this case would require erasing the replica and starting from scratch. This is where GitLab really ran into trouble.
Yet we started with synchronized files, didn’t we? Could we use rsync to “catch up”? Yes, but it’s a somewhat convoluted procedure. We would first need to connect to the upstream server and issue a SELECT pg_start_backup('my_backup') command so Postgres knows to archive transaction logs produced during the sync. Then after the sync is completed, we would need to stop the backup with SELECT pg_stop_backup(). Then we would have to make our own recovery.conf file, obtain all of the WAL files the upstream server archived, and so on.
None of that is something a system administrator will know, and it’s fiddly even to an experienced Postgres DBA. A mistake during any of that procedure will result in a non-functional or otherwise unsafe replica. All of that is the exact reason software like Barman exists. Supplied utilities only get us so far. For larger or more critical installations, either our custom scripts must flawlessly account for every failure scenario and automate everything, or we defer to someone who already did all of that work.

In my new day job, I'm learning about the extraordinary complexities of the operational aspects of cloud computing. For a relatively-detailed exploration of the topic, there is nothing better than Google's Site Reliability Engineering book, which I wrote about a few months ago. But the topic is deep, and you can spend your entire career working in this area.

Meanwhile, in another corner of the git universe, the annual Git Merge conference is underway, and there is a lot of news there as well, including major improvements to GitLFS, and a detailed report from Facebook (who continue to use their custom version of Mercurial in preference to git).

But the big announcement, truly the centerpiece of the entire conference, came from, of all places, Microsoft:

Over the last year, we have continued to invest in Git and have lots of exciting information to share about the work we’ve done to embrace Git across the company, for teams of any size. During this talk, we plan to discuss, in depth, how we are using git internally with a specific focus on large repositories. We’ll discuss the architecture of VSTS’s git server which is built on Azure and the customizations we’ve had to make to both it and git.exe in order to enable git to scale further and further. These customizations will cover changes that we’ve contributed back to the git core open source project as well as changes that we haven’t talked about externally yet. We’ll also lay out a roadmap for the next steps that we plan to take to deal with repositories that are significantly larger than git can scale to today.

Well, that was rather a tease. So what was that "exciting information" that Microsoft promised?

Here it is: Scaling Git (and some back story).

As Brian Harry observes, to tell this story properly, you have to back up in time a bit:

We had an internal source control system called Source Depot that virtually everyone used in the early 2000’s.

(I think it's no secret; heck, it's even posted on Wikipedia, so: Source Depot is a heavily-customized version of Perforce.)

And, as Harry notes, the Microsoft Source Depot instances are the single biggest source code repositories on the planet, significantly bigger than well-known repositories such as Google's:

There aren’t many companies with code bases the size of some of ours. Windows and Office, in particular (but there are others), are massive. Thousands of engineers, millions of files, thousands of build machines constantly building it, quite honestly, it’s mind boggling. To be clear, when I refer to Window in this post, I’m actually painting a very broad brush – it’s Windows for PC, Mobile, Server, HoloLens, Xbox, IOT, and more.

I happen to have been up-code and personal with this code base, and yes: it's absolutely gigantic, and it's certainly the most critical possession that Microsoft owns.

So making the decision to change the tool that they used for this was no easy task:

TFVC and Source Depot had both been carefully optimized for huge code bases and teams. Git had *never* been applied to a problem like this (or probably even within an order of magnitude of this) and many asserted it would *never* work.
The first big debate was – how many repos do you have – one for the whole company at one extreme or one for each small component? A big spectrum. Git is proven to work extremely well for a very large number of modest repos so we spent a bunch of time exploring what it would take to factor our large codebases into lots of tenable repos. Hmm. Ever worked in a huge code base for 20 years? Ever tried to go back afterwards and decompose it into small repos? You can guess what we discovered. The code is very hard to decompose. The cost would be very high. The risk from that level of churn would be enormous. And, we really do have scenarios where a single engineer needs to make sweeping changes across a very large swath of code. Trying to coordinate that across hundreds of repos would be very problematic.

In SCM circles, this is known as "the monorepo problem."

The monorepo problem is the biggest reason why most truly large software engineering organizations have struggled to move to git. Some organizations, such as Google's Android team, have built massive software toolchains around git, but even so the results are still unsatisfactory (and astonishingly expensive in both human and computer resources).

Microsoft, of course, were fully aware of this situation, so what did they do? Well, let's switch over to Saeed Noursalehi: Announcing GVFS (Git Virtual File System)

Today, we’re introducing GVFS (Git Virtual File System), which virtualizes the file system beneath your repo and makes it appear as though all the files in your repo are present, but in reality only downloads a file the first time it is opened. GVFS also actively manages how much of the repo Git has to consider in operations like checkout and status, since any file that has not been hydrated can be safely ignored. And because we do this all at the file system level, your IDEs and build tools don’t need to change at all!
In a repo that is this large, no developer builds the entire source tree. Instead, they typically download the build outputs from the most recent official build, and only build a small portion of the sources related to the area they are modifying. Therefore, even though there are over 3 million files in the repo, a typical developer will only need to download and use about 50-100K of those files.
With GVFS, this means that they now have a Git experience that is much more manageable: clone now takes a few minutes instead of 12+ hours, checkout takes 30 seconds instead of 2-3 hours, and status takes 4-5 seconds instead of 10 minutes.

As Harry points out, this is very very complex engineering, and required solving some very tricky problems:

The file system driver basically virtualizes 2 things:

The .git folder – This is where all your pack files, history, etc. are stored. It’s the “whole thing” by default. We virtualized this to pull down only the files we needed when we needed them.

The “working directory” – the place you go to actually edit your source, build it, etc. GVFS monitors the working directory and automatically “checks out” any file that you touch making it feel like all the files are there but not paying the cost unless you actually access them.

As we progressed, as you’d imagine, we learned a lot. Among them, we learned the Git server has to be smart. It has to pack the Git files in an optimal fashion so that it doesn’t have to send more to the client than absolutely necessary – think of it as optimizing locality of reference. So we made lots of enhancements to the Team Services/TFS Git server. We also discovered that Git has lots of scenarios where it touches stuff it really doesn’t need to. This never really mattered before because it was all local and used for modestly sized repos so it was fast – but when touching it means downloading it from the server or scanning 6,000,000 files, uh oh. So we’ve been investing heavily in is performance optimizations to Git. Many of them also benefit “normal” repos to some degree but they are critical for mega repos.

But even more remarkably, Microsoft is GIVING THIS AWAY TO THE WORLD:

While GVFS is still in progress, we’re excited to announce that we are open sourcing the client code at https://github.com/Microsoft/gvfs. Feel free to give it a try, but please be aware that it still relies on a pre-release file system driver. The driver binaries are also available for preview as a NuGet package, and your best bet is to play with GVFS in a VM and not in any production environment.
In addition to the GVFS sources, we’ve also made some changes to Git to allow it to work well on a GVFS-backed repo, and those sources are available at https://github.com/Microsoft/git. And lastly, GVFS relies on a protocol extension that any service can implement; the protocol is available at https://github.com/Microsoft/gvfs/blob/master/Protocol.md.

Remember, this is Microsoft.

And, this is git.

But, together, it just all changed:

So, fast forward to today. It works! We have all the code from 40+ Windows Source Depot servers in a single Git repo hosted on VS Team Services – and it’s very usable. You can enlist in a few minutes and do all your normal Git operations in seconds. And, for all intents and purposes, it’s transparent. It’s just Git. Your devs keep working the way they work, using the tools they use. Your builds just work. Etc. It’s pretty frick’n amazing. Magic!
As a side effect, this approach also has some very nice characteristics for large binary files. It doesn’t extend Git with a new mechanism like LFS does, no turds, etc. It allows you to treat large binary files like any other file but it only downloads the blobs you actually ever touch.

To say that this is a sea change, a complete reversal of everything you might possibly have expected, is certainly understating the case.

Let's review:

Microsoft showed up at one of the largest open-source-free-software-community professional conferences
To talk about their work using the open source community's dearest-to-the-heart tool (git)
And not only did Microsoft not disparage the tool, they actively celebrated it
and added a massive, massive new feature to it
and revealed that, as of now and with that feature, they're actually using git themselves, for ALL of their own tens of thousands of users, on the LARGEST source base on the planet, in a SINGLE monolithic git repo
And gave that tool away, back to the open source community, on GitHub!

So, anyway, this is just a little corner of the industry (even if it did consume an entire third of my professional career, and spawn the largest IPO of the last 2 years).

But, for those of you who care, take notice: the entire SCM industry just changed today.

Stuff I'm reading, early February edition

It's raining.

That's good.

And, also good, the rain doesn't keep me away from duh 'netz...

Unexpected Consequences of Self Driving Cars
[1] Interestingly, many pedestrians reward good behavior by drivers. Getting on the main street or off of the main street from or onto a small side street can often be tricky for a driver. There are often so many people on the sidewalks that there is a constant flow of foot traffic crossing the exits or entrances of the side streets. Drivers have to be patient and ready for a long wait to find a break. Often pedestrians who have seen how patient a driver is being will voluntarily not step into the cross walk, and either with a head or hand signal indicate to a driver that they should head through the crossing. And if the driver doesn’t respond they make the signal again–the pedestrian has given the turn to the driver and expects them to take it.
...
[2] This is one for the two (autonomous) car family. Suppose someone is going to an event in the evening and there is not much parking nearby. And suppose autonomous cars are now always prowling neighborhoods waiting for their owners to summon them, so it takes a while for any particular car to get through the traffic to the pick up location. Then the two car family may resort to a new trick so that they don’t have to wait quite so long as others for their cars to get to the front door pick up at the conclusion of the big social event. They send one of their cars earlier in the day to find the closest parking spot that it can, and it settles in for a long wait. They use their second car to drop them at the event and send it home immediately. When the event is over their first autonomous car is right there waiting for them–the cost to the commons was a parking spot occupied all day by one of their cars.
How Google fought back against a crippling IoT-powered botnet and won
In September, KrebsOnSecurity—arguably the Internet's most intrepid source of security news—was on the receiving end of some of the biggest distributed denial-of-service attacks ever recorded. The site soon went dark after Akamai said it would no longer provide the site with free protection, and no other DDoS mitigation services came forward to volunteer their services. A Google-operated service called Project Shield ultimately brought KrebsOnSecurity back online and has been protecting the site ever since.
At the Enigma security conference on Wednesday, a Google security engineer described some of the behind-the-scenes events that occurred shortly after Krebs asked the service for help, and in the months since, they said yes.
How Is 'Non-Literally Copying' Code Still Copyright Infringement?
The notion of "non-literal copying" as applied to code is a weird one, and casts a light on how weird code copyright is to begin with. If copyright isn't supposed to cover functional choices, how can it be infringing to create new code that accomplishes the same function in a slightly different way? Are juries supposed to determine which "non-literally copied" aspects of the code were aesthetic, and which were purely functional? This sort of idea-expression divide question is muddy in the worlds of art and literature, but it should be simple in the world of code: what a program does is not covered by copyright, nor are any purely functional elements of how it achieves that.
The Zenimax vs Oculus trial is over
There are objective measures of code similarity that can be quoted, like the edit distance between abstract syntax trees, but here the expert hand identified the abstract steps that the code fragments were performing, made slides that nobody in the courtroom could actually read, filled with colored boxes outlining the purportedly analogous code in each case. In some cases, the abstractions he came up with were longer than the actual code they were supposed to be abstracting.
It was ridiculous. Even without being able to read the code on the slides, you could tell the steps varied widely in operation count, were often split up and in different order, and just looked different.
The following week, our side’s code expert basically just took the same slides their expert produced (the judge had to order them to be turned over) and blew each of them up across several slides so you could actually read them. I had hoped that would have demolished the credibility of the testimony, but I guess I overestimated the impact.
Notably, I wasn’t allowed to read the full expert report, only listen to him in trial, and even his expert testimony in trial is under seal, rather than in the public record. This is surely intentional -- if the code examples were released publicly, the internet would have viciously mocked the analysis. I still have a level of morbid curiosity about the several hundred-page report.
Back-to-Basics Weekend Reading - Bloom Filters
Bloom Filters, conceived by Burton Bloom in 1970, are probabilistic data structures to test whether an item is in a set. False positives are possible, but false negatives are not. Meaning, if a bit in the filter is not set, you can be sure the item is not in the set. If it is in the set, the mapped item may be in the set.
This is a hugely important technique if you need to process and track massive amounts of unique data units, as it is very space-efficient. From Dynamo and Postgresql, to HBase and Bitcoin, Bloom Filters are used in almost all modern distributed systems.
Addressing 2016
Obviously, the device population of the Internet continues to grow but it appears that most of the growth of the network is occurring behind various forms of IPv4 Network Address Translators (NATs). These devices are then largely invisible to the public network, so efforts to track their population are challenging. The deployment of these devices behind NATS places very little in the way of pressures on address consumption. While the Internet may have absorbed in 2016 a production quantity of some 270 million personal computers, 1.8 billion smart phones and a further 1.8 billion connected devices, that does not mean that there has been a demand for some 4 billion additional IP addresses. Part of this volume has replaced older equipment, and almost all these additional devices find themselves positioned behind NATs, making only minor demands on the overall address structure. The total drain on the remaining unallocated IPv4 address pool was just 22 million addresses for 2016.
This was the issue that IPv6 was primarily intended to solve. The copious volumes of address space were intended to allow us to uniquely assign a public IPv6 address to every such device, no matter how small, or in what volume they might be deployed. Why this has not happened so far, and why we are still concentrating a significant proportion of our efforts on stretching IPv4 to encompass ever larger population of attached devices is a critical question.
Top ten pull request review mistakes
Diffs are really great for showing you what has changed. But that’s the thing! By definition they don’t show you what hasn’t changed. Be on the lookout for changes which should have been applied more widely, like a find/replace that maybe didn’t cover the entire codebase.
Or a change that only hit a subset of the components it should have.
About <Programming>
The International Conference on the Art, Science, and Engineering of Programming is a new conference focused on everything to do with programming including the experience of programming. We’ve named it <Programming> for short. Papers are welcome from any part of the programming research lifecycle, as are papers on programming practice and experience.
Things Every Hacker Once Knew
This document is a collection of facts about ASCII and related technologies, notably hardware terminals and RS-232 and modems. This is lore that was at one time near-universal and is no longer. It’s not likely to be directly useful today - until you trip over some piece of still-functioning technology where it’s relevant (like a GPS puck), or it makes sense of some old-fart war story. Even so, it’s good to know anyway, for cultural-literacy reasons.
What's with the "programming test"!?
knowing how to implement a file copy function without an understanding of why you would implement a file copy function is where the red ink is set free. I’m appalled that companies still see the “programming test” as a viable way to vet talent. In fact, unless the job you’re going for involves writing software that directly talks to the file system as part of its core capabilities, then the correct response from a candidate should be, “Um, I’d just use a framework”. That’s why development communities far and wide have frameworks; it’s so you don’t have to give a toss about System.IO.File blah blah blah during a job interview!
Stupid Interview Questions
Look, I’m sorry if you feel put-upon here, please don’t get hostile. I’m just trying to get a clear picture of the specs I need here. Obviously if I’m going to write a file copy method, instead of using one of the many extant file copy routines in various libraries and frameworks, it’s going to be fulfilling a specialized set of requirements, and I’m going to need to have good answers for these questions. If you want, I can hack something together in a minute, but I’d have to note that there were many unresolved issues as to requirements and purposes.
The Trillion Internet Observations Showing How Global Sleep Patterns Are Changing
The new database also allowed the team to study global sleep patterns. They did this by assuming that the switch from a device being online to offline corresponds with a person going to sleep (and vice versa). “The association need not be exact, instead a systematically leading or lagging relationship carries the required information,” say Ackermann and co. They then crunch the data for people in more than 600 cities around the world (having calibrated it against data gathered by the American Time Use Survey).
The result is the first global estimate of overnight sleep duration in 645 cities over seven years, and it makes for interesting reading. “In general, major cities tend to have longer sleeping times compared to surrounding satellite cities,” say the team.
But they say there is evidence that sleep patterns are changing, perhaps due to technology use. “Whilst North America has remained largely static over the study window, Europe sleep duration has declined, and East Asian sleep duration has grown,” they say. By this reckoning, global sleep patterns are converging. Exactly why is a fascinating open question.
Sinking Millennium Tower safe to live in, city report concludes
The report, based on inspections on Dec. 2, 2016, and Jan. 11, found “visual evidence of the effect of settlement on some electrical wiring support systems.”
The report noted that electrical “raceways” — large tubes that serve as conduits for wires and cables — are under stress in an area of the basement where the 58-story high-rise part of the structure meets the adjacent 12-story podium. Inspectors also found “evidence of water intrusion” in the fifth level of the basement “affecting installed electrical wiring and electrical components.”
The report concluded that electrical systems were “working adequately” and that the deficiencies were “not inherent to the electrical systems installed but represent the need for an engineering solution to the building settlement issues.”
Canada Will Not Go Gently
None of this is new to us. Racial intolerance and ethnic hatred lies at the very foundation of Canadian democracy. We are not strangers to it. Far from being untouched by it, its scars run deep and are still raw. Our nation was founded by white men of property who restricted the franchise to others like them. In the early years of this province, these same men kept themselves in power through violence and fear. They jailed dissenters, hanged traitors, and worked to exclude others from the protection of the law. As explored in this recent piece in The Walrus, as the nation grew up, the same brutal, racist tendencies that motivated Canada to engage in a cross-generational genocide of its First Nations peoples visited indignities on the Irish, blacks, Eastern Europeans, Chinese, Japanese, Sikhs, Jews, and Tamils who tried to settle here. We had our share of racial fear-mongering and riots, of racist bans and head-taxes. We worked to keep the others out. We made it hard for immigrants to succeed and thrive.
But, having just opened my third pint, I am now going to self-righteously assert that this has always been a country that was founded on a set of core legal principles, on strong institutions and a deep and stubborn commitment to the rule of law. Maybe it is the cold, but unlike the warmer climes to the south, we’ve tended to draw together to survive, to work to heal our wounds over time. And while our courts, our governments, our civic institutions, our public schools and our citizens, have had their weak moments, they’ve come to lead the charge against intolerance. They’ve been at the frontlines of fighting racism. I’m not saying its perfect here. I’m not saying we don’t have issues with racism and intolerance. I’m not saying there do not remain some painful and unresolved legacies. But we’ve developed the habit of working through them. Slowly, sometimes too slowly, but we work through them.

And, finally: alas, Schrodinger's Cat Carrier is sold out.

Hazard wondergoal

If you have no idea who Eden Hazard is, well: pay attention, as the young 26-year-old Belgian footballer is surely one of the greatest athletes you can watch, in any sport, these days.

His recent goal v. Arsenal is astonishing!

He takes the ball in his own half, and single-handedly races half the length of the field to score unassisted.

Moreover: for nearly the entire duration of his run, the camera shows nothing else but six Arsenal defenders plus Peter Cech (who's as much a spectator in this whole thing as me).

Not a single other Chelsea teammate is able even to be visible in the camera's lens until after the ball has already left Hazard's foot on the final strike.

That view is amazing: a sea of red, but no fear from Hazard. An ordinary player would surely have aborted the run and taken the ball to the side, but no: Hazard takes on two defenders simultaneously in the penalty box, calmly feints one to the left and the other to the right, then blasts the ball past Mr. Rooted-in-the-ground Cech, and it's over.

I love the part when someone (I think it's maybe Coquelin, #34?) climbs on Hazard's back and goes for about a 5-meter ride, then is shrugged to the side as Hazard plows on.

Superb!

If Chelsea continue to play at this level for the remainder of the season, their final title will be a well-deserved one.

Wednesday, February 1, 2017

Ordinary Patriots: a very short review

Historical fiction is an odd sort of thing, for it's neither history, nor fiction, but somewhere in between.

Handled well, it can be better than either history or fiction.

Handled poorly, it can act as a crutch, a barrier, or a hobble, leading to a result that is lacking both as history and as fiction.

So it must be with some trepidation that an author who has previously demonstrated success with both history and fiction, separately, tries her hand at historical fiction.

I'm pleased to report that Ordinary Patriots: A Novel of the Eastern Sierra Nevada, the second in the (ongoing, I hope?) saga of the Richardson family of Owens Valley, strikes me as a resounding success.

In this case, although there is certainly history (World War II, in particular, sets the stage and frames all the action), what really appeals to me about Ordinary Patriots is its vivid sense of place.

Some geographies are, all on their own, so vivid and dramatic as to be story-worthy by themselves, and California's Eastern Sierra is certainly such a place: remote, harsh, almost unlivable, it is one of those landscapes that just forces adventure to emerge, literally out of the thin desert air.

But to write about a place, you have to have lived in that place; you can't just conjure up a sense of place out of your imagination.

So, here:

Spring in the Owens Valley is the most challenging time of the year. Temperature obeys no laws -- hot, dry days may be followed by gray days leading to showers or actual rain storms, before the endless sunshine of the desert summer begins. The winds can be ferocious. Sometimes it blows so strongly that a grown man must stop walking and simply lean into it before proceeding. Sometimes it will pick up sand and scour cars and windows and skin.

And yet also, here:

Spring is the best season for people living in the desert. The sun warms the sand and melts the little patches of snow and ice remaining in sheltered canyon corners and under creosote bushes and evergreens. Wildflowers begin to burst out, yellow and white and purple each on their own timetable. The desert tortoises can be seen occasionally, tiptoeing across the hot sand. The ranchers begin to move their animals in the direction of summer pastures, but at a leisurely pace. And school is out, for at least a week and two weekends, so the main street in Independence is occupied by boys on bicycles.

You take the bad with the good; it can't be otherwise, if it's a real place.

Surely, most of Ordinary Patriots is a story about people, both real people (Father Crowley, Jacqueline Cochran, Ansel Adams) and fictional ones, like the Richardson clan and their neighbors and friends.

And it's a story about events, which, quite rightly for this sort of story, are real events (mining accidents in the mountains, the establishment of the Manzanar internment camp, the changes wrought by technology upon the world and its people) primarily, although sprinkled with a fair share of imaginative filling-in-of-the-blanks as well.

But my favorite aspect of Ordinary Patriots was how it made me feel like I was there: getting a pastry at the local bakery, driving the dusty desert roads, looking up (straight up!) at the majestic mountaintops, hunching my shoulders as I walked along in the howling desert gale.

If you've ever thought you might be a desert rat, or might want to know what desert rats are like, give Ordinary Patriots a try: it's quite a success and certainly worth your time.

One good thing about commuting on public transit.

Startlingly, I realized the other day that it had been more than a decade since I had been routinely commuting on public transit.

And one thing that I had forgotten about commuting on public transit is how wonderful it is to have a significant, regular, uninterrupted span of time to just sit and read a book.

Or, for those times when I'm on BART, to just stand and read a book.

I used to read books ALL the time on public transit, but then, as I say, I went for more than a decade without doing that.

And I rather stopped reading. Well, not entirely, but I definitely spent a lot less time reading.

So it is WONDERFUL to be back in a situation where I am regularly reading again.

Heck, it's not uncommon for me to make it through an entire book in a week!

Now, there are some downsides to public transit, of course.

But reading a book is, for certain, a very big upside.

Journal of a Programmer