Sunday, October 26, 2014

And some software engineering things, too

Because, you know, that's just who I am.

  • FIT : Failure Injection Testing
    Simulating failure starts when the FIT service pushes failure simulation metadata to Zuul. Requests matching the failure scope at Zuul are decorated with failure. This may be an added delay to a service call, or failure in reaching the persistence layer. Each injection point touched checks the request context to determine if there is a failure for that specific component. If found, the injection point simulates that failure appropriately. Below is an outline of a simulated failure, demonstrating some of the inflection points in which failure can be injected.
  • Ice Cream and Distributed Systems
    Mary, Mom and Dad sat down and tried to figure out how to all agree on the problem with the fewest number of messages. Mary invented a simple scheme: when I asked her if I could have some ice cream, she messaged both my mom and dad and asked for their opinion, while asking that they didn't change their opinion until hearing back from her. If they both agreed, she'd go ahead and let them know she was going to serve dessert. If either said no, she let them know that the bowl would remain empty. The protocol, which they called two-phase commit after the frozen and liquid phases of ice cream, took four messages to complete.
  • Cuckoo Filters
    If you're going to use multiple choice hashing schemes, though, you should think about using cuckoo hashing. The ability to move keys around means you should get better space utilization; for example, even with 2 choices, if your buckets can hold 4 items, cuckoo hashing can get you about 95% space utilization. The problem with cuckoo hashing in this setting is that, for a Bloom filter, you want to just keep fingerprints of keys, not the keys themselves. So, when you want to move the key, how do you figure out where to move it to -- you no longer have the key to hash?
  • Instant Loading for Main Memory Databases
    While hardware limitations for fast loading have disappeared, current approaches for main memory databases fail to saturate the now available wire speeds of tens of Gbit / s. With Instant Loading, we contribute a novel CSV loading approach that allows scalable bulk loading at wire speed. This is achieved by optimizing all phases of loading for modern super-scalar multi-core CPUs.
  • Message Systems in Programming: Callbacks, Events, Pub Sub, Promises, and Streams
    Messaging systems are used to communicate in larger code bases by helping decouple classes that need to know about changes or happenings in certain areas of the code. One of Object Oriented Programming’s core concepts is encapsulation. How you decide to allow objects to talk to each other has pros and cons for each method and it’s good to know your options as you can use many together in effective hybrid approaches.

    This article will cover the 5 common ones you’ll often encounter.

  • Amazon Kinesis and Apache Storm: Building a Real-Time Sliding-Window Dashboard over Streaming Data
    In this whitepaper, we propose a reference architecture for ingesting, analyzing, and processing vast amounts of clickstream data generated at very high rates in a smart and cost-efficient way using Amazon Kinesis with Apache Storm. We also explore the use of Amazon ElastiCache (Redis) as an in-memory data store for aggregated counters and use of its Pub/Sub facility to publish the counters on a simple dashboard.
  • Avoiding the tragedy of the anticommons
    In his white paper for the Bio-Commons, Rüdiger Trojok writes about a significantly more ambitious vision for open biology: a bio-commons that holds biological intellectual property in trust for the good of all. He also articulates the tragedy of the anticommons, the nightmarish opposite of a bio-commons in which progress is difficult or impossible because “ambiguous and competing intellectual property claims…deter sharing and weaken investment incentives.” Each individual piece of intellectual property is carefully groomed and preserved, but it’s impossible to combine the elements; it’s like a jigsaw puzzle, in which every piece is locked in a separate safe.
  • Which Online Discussion Archetype Are You?
    What Mike created is a brilliant deconstruction of the various archetypes you'll encounter in any long running discussion group
  • 10 Tricks to Appear Smart During Meetings
    Opinions and data and milestones are being thrown around and you don’t know your CTA from your OTA. This is a great point to go, “Guys, guys, guys, can we take a step back here?” Everyone will turn their heads toward you, amazed at your ability to silence the fray. Follow it up with a quick, “What problem are we really trying to solve?” and, boom! You’ve bought yourself another hour of looking smart.
  • 15 Tricks to Appear Smart in Emails
    Whenever something good happens, always be the first to respond and always reply all. This will make you seem like a highly engaged team player.
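
A footnote on the cuckoo filter question above (how do you move a fingerprint when you no longer have the key to hash?). As I understand it, the usual answer is "partial-key" cuckoo hashing: the second bucket is computed from the first bucket and the fingerprint alone, by XORing the bucket index with a hash of the fingerprint. Here is a minimal Python sketch of that idea. The sizes, the hash function, and all of the names are my own choices for illustration; none of them come from the linked post.

```python
import hashlib
import random

NUM_BUCKETS = 1 << 10   # power of two, so the XOR trick stays in range
BUCKET_SIZE = 4         # 4 fingerprints per bucket, as in the quote above
MAX_KICKS = 500         # give up after this many evictions


def _hash(data: bytes) -> int:
    """64-bit hash built from SHA-256 (an arbitrary choice for this sketch)."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def fingerprint(key: str) -> int:
    """Small fingerprint in the range 1..255; this is all the filter stores."""
    return _hash(b"fp:" + key.encode()) % 255 + 1


def index1(key: str) -> int:
    """First candidate bucket, computed from the full key."""
    return _hash(key.encode()) % NUM_BUCKETS


def alt_index(index: int, fp: int) -> int:
    """The other candidate bucket, computed WITHOUT the key.

    Because XOR is its own inverse, alt_index(alt_index(i, fp), fp) == i,
    so a stored fingerprint can always be moved to its other bucket.
    """
    return index ^ (_hash(bytes([fp])) & (NUM_BUCKETS - 1))


class CuckooFilter:
    def __init__(self):
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def insert(self, key: str) -> bool:
        fp = fingerprint(key)
        i1 = index1(key)
        i2 = alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < BUCKET_SIZE:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = random.choice((i1, i2))
        for _ in range(MAX_KICKS):
            slot = random.randrange(BUCKET_SIZE)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = alt_index(i, fp)  # only the fingerprint is needed here
            if len(self.buckets[i]) < BUCKET_SIZE:
                self.buckets[i].append(fp)
                return True
        # Filter is effectively full. In this toy version the last evicted
        # fingerprint is simply dropped.
        return False

    def contains(self, key: str) -> bool:
        fp = fingerprint(key)
        i1 = index1(key)
        return fp in self.buckets[i1] or fp in self.buckets[alt_index(i1, fp)]
```

Like a Bloom filter, this can report false positives (two keys can share a fingerprint and a bucket), but as long as every insert succeeded it should not report false negatives.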

Completely non-software-engineering things I'm reading

It rained a little bit yesterday. Nothing like it's been raining in Oregon and Washington, but maybe it's a start. I can see from the chart that Lake Shasta is still falling, but yesterday, for the first day in a long time, inflow exceeded outflow.

  • The Astonishing Story of the Federal Reserve on 9-11
    I had planned to spend this week on the thrilling topic of the discount window. It was plain old curiosity that took me to the internet to find out what the Federal Reserve did on 9-11. As it turns out, it was not an easy story to unravel and between late Sunday night when I first started reading and Tuesday night when I started writing I read several hundred pages of reports as well as the tiny amount of media reporting available. Here’s the thing I didn’t know and I’ll bet you a wheelbarrow of carrots you didn’t either, on 9-11 and the days which immediately followed, a relatively small number of people did some genuinely, physically heroic things in order to keep the economy from going off the rails and none of them were named Alan Greenspan.
  • “Mount Thoreau” and the naming of things in the wilderness.
    And across Piute Canyon from it there stands another big peak, unnamed. On the maps it’s marked 12,691. If named after Thoreau, the two peaks would then form a gateway, like Scylla and Charybdis, through which hundreds of hikers would pass every year. Peak 12,691 is somewhat lower than Mount Emerson, but much more gnarly and interesting; the two peaks have much the same relationship that Emerson and Thoreau had, not just in size and aspect but in position, being close to each other but separated by a huge gulf of air. It was just like that in Concord.
  • Expert Critique of Burmese Cat Project
    However, keeping a colony of 40 cats is a vastly different proposition from keeping two or three cats in a home environment. With such a large colony, it is vitally important from a health perspective that cats are kept in a fresh, breezy environment at all times. I indicated that the solution would be to build an enclosure that surrounded Heritage House from water level to tree top and a shade cloth roof to provide some shade and protection from the rain.
  • Starship Size Comparison Chart
    Scale: 1 pixel = 10 meters
  • How Rebounds Work
    Much has been made about the player-tracking revolution in the NBA and how it will advance the state of basketball analytics. This is truly a brave new world; to date, a vast majority of the energy spent researching advancements has been aimed at developing richer characterizations of player performance and constructing newfangled scouting reports. That makes sense, but basketball is bigger than any one player or team, and it’s also important to realize that the same data set that tells us Chandler Parsons and Jimmy Butler ran a lot, or Patty Mills runs the fastest, also holds incredible information about how basketball works. This goes beyond properly evaluating individuals; we are on our way to being able to map basketball itself. This work will eventually help coaches, players, and press more elegantly understand ball movement, defensive positioning, offensive architecture, and, yes, rebounding.
  • What A Former Olympian And NFL Player Can Teach Us About Advertising And Marketing
    I’ve seen firsthand in football and business how victims can bring down the morale of an entire team. It’s impossible to build anything with a victim mentality.
  • FORGET VIDEO GAMES: Here's What It's Like To Put On A Costume And Go Live-Action Role Playing
    Live-action role-playing (or LARPing) was born on the fringes of American pop culture, a descendant of much-maligned hobbies like Dungeons and Dragons and other table games.

    In LARPing, players spend their weekend dressing up in costumes, adopting elaborate personae, and inhabiting a complex imagined world.

I don't have a Halloween costume this year. Maybe I'll go as Programmer of a Certain Age.

Saturday, October 25, 2014

From the Repertoire

I've been thoroughly enjoying my second taste of the online music classes developed by the Curtis Institute of Music.

Some time ago, I followed Jonathan Biss's delightful Exploring Beethoven’s Piano Sonatas.

This fall, I've been taking Jonathan Coopersmith's superb From the Repertoire: Western Music History through Performance.

I like the way that this class moves through a selection of different music from different time periods and schools, so that the class is always varied and never dull.

I also really like that Coopersmith's class is illustrated by performances by the Curtis Institute students themselves, which makes the music feel much more alive than watching some much older video-taped presentation, great though those older performances may be.

The classes are certainly aimed at a much more serious student of music than I am, but they are at a level where even a casual listener such as myself can enjoy them and learn.

Meanwhile, since I do know next to nothing about music theory, I'm happy to have stumbled upon Toby Rush's wonderful Music Theory for Musicians and Normal People. Rush's presentation style is delightful, the poster format works very well (for me, at least), and the individual lessons are presented in small digestible amounts, which fits my stupidly busy schedule.

I still don't understand why Coursera insists on operating these classes on a fixed schedule; it seems like a student such as myself, who has the time only to watch the videos, read the background materials, and listen to the performances, should be able to start such a class at any time. The computers don't care, after all; they have no notion of what day or month or year it is.

But no, currently you can only take Coopersmith's class, not Biss's class nor Steinhardt's.

Oh well, some mysteries are not to be solved, and I have more of Coopersmith to listen to now.

Sunday, October 19, 2014

A day at ARK 2000

We had the opportunity to spend a glorious day at ARK 2000, which is one of the facilities of a rather unusual organization called the Performing Animal Welfare Society.

Through the generosity of friends, we found ourselves with a pair of tickets to one of PAWS's annual fund-raisers, the "Elephant Grape Stomp." This event is sort of an open house to visit the sanctuary, which is located in the Sierra Nevada foothills, about 2 hours from our house.

During the event, we were able to visit three parts of the sanctuary:

  • The cats and bears area, which holds Siberian Tigers, African Lions, and American Black Bears, as well as at least one leopard (who was feeling unsocial so we didn't see her).
  • Bull Mountain, where PAWS has a facility for two male Asian Elephants (held separately, but adjacently)
  • The Asian and African Elephant compound, where about 10 female elephants are living in two separate areas.

At all three locations, booths were set up with information, local restaurants were serving delicious food, and local wineries (from the thriving Murphys wine region) were pouring scrumptious Sierra Nevada wines.

Visiting ARK 2000 is sort of an unusual experience.

It is not a zoo, and the animals are not there to entertain you.

And it's not a breeding facility; they aren't trying to produce more of these animals here.

I would say it's more like a senior citizen facility for animals who have been taken from rather difficult circumstances and given a dramatically more humane situation in which to live out their lives.

Still, it was very nice and peaceful. The weather was superb, and we had all the time we wanted to stand quietly and watch the animals as they relaxed, contentedly, in their space.

Several of the staff were on hand, including the primary elephant keeper and the primary bear keeper, to answer questions and help explain what we were seeing and why.

And some of the sights were indeed unusual, such as the three custom transport containers that they use to move the elephants long distances (most recently used to bring three elephants from Toronto to California). This is not the sort of item you can get at your local hardware store!

For example, keeping bull elephants is rather different than keeping female elephants. The extraordinary strength and aggressive tendencies of the bull elephants mean that they must be located in a particular situation, with a pen of fantastic strength. In some of the pictures, you can see, I think, the difference in the containment fences for the male elephant as opposed to those for the females. (Of course, the females are plenty strong enough; apparently they like to uproot the oak trees just for fun, and so the facility has built massive protective cages around some of the trees to try to keep the ladies from clearing them out entirely.)

I think the highlight, for me, was the 4 Siberian Tigers, absolutely majestic animals, who were all together in one pen and were particularly active, bounding around their space, playing together, alertly aware of everything and everyone that was around them.

There's lots of information about PAWS on their website. It's not obvious what is going to come of the organization now that its founder, Pat Derby, has passed on. Still, from all evidence they are still going strong, and hopefully they will find a new generation to continue their excellent work.

Thursday, October 16, 2014

He had me at "the Largest Ship in the World"

Don't miss Alastair Philip Wiper's photo-journalism essay about the building of the new Maersk Triple-E container vessels: Building the Largest Ship In the World, South Korea

The Daewoo Shipbuilding and Marine Engineering (DSME) shipyard in South Korea is the second largest shipbuilder in the world and one of the “Big Three” shipyards of South Korea, along with the Hyundai and Samsung shipyards. The shipyard, about an hour from Busan in the south of the country, employs about 46,000 people, and could reasonably be described as the world’s biggest Legoland. Smiling workers cycle around the huge shipyard as massive, abstractly over-proportioned chunks of ships are craned around and set into place: the Triple E is just one small part of the output of the shipyard, as around 100 other vessels including oil rigs are in various stages of completion at any time.

Wednesday, October 15, 2014

Stuff I'm reading, mid-October edition

There was wind last night, but no rain.

Rain to the north, they say.

But not here.

  • Harvest and Yield: Not A Natural Cure for Tradeoff Confusion
    Yield is the availability metric that most practitioners end up working with, and it's worth noting that it's different from CAP's A. The authors don't define it formally, but treat it as a long-term probability of response rather than the probability of a response conditioned on there being a failure. That's a good common-sense definition, and one that fits well with the way that most practitioners think about availability.
  • Apple's "Warrant-Proof" Encryption
    Code is often buggy and insecure; the more code a system has, the less likely it is to be secure. This is an argument that has been made many times in this very context, ranging from debates over the Clipper Chip and key escrow in the 1990s to a recent paper by myself, Matt, Susan Landau, and Sandy Clark. The number of failures in such systems has been considerable; while it is certainly possible to write more secure code, there's no reason to think that Apple has done so here. (There's a brand-new report of a serious security hole in iOS.) Writing secure code is hard. The existence of the back door, then, enables certain crimes: computer crimes. Add to that the fact that the new version of iOS will include payment mechanisms and we see the risk of financial crimes as well.
  • Keyless SSL: The Nitty Gritty Technical Details
    Extending the TLS handshake in this way required changes to the NGINX server and OpenSSL to make the private key operation both remote and non-blocking (so NGINX can continue with other requests while waiting for the key server). Both the NGINX/OpenSSL changes and the protocol between CloudFlare’s server and the key server were audited by iSEC Partners and Matasano Security. They found the security of Keyless SSL equivalent to on-premise SSL. Keyless SSL has also been studied by academic researchers from both provable security and performance angles.
  • Intel® SGX for Dummies (Intel® SGX Design Objectives)
    At its root, Intel® SGX is a set of new CPU instructions that can be used by applications to set aside private regions of code and data. But looking at the technology upward from the instructions is analogous to trying to describe an animal by examining its DNA chain. In this short post I will try to uplevel things a bit by outlining the objectives that guided the design of Intel® SGX and provide some more detail on two of the objectives.
  • Ads Don't Work That Way
    The key differentiating factor between the two mechanisms (inception and imprinting) is how conspicuous the ad needs to be. Insofar as an ad works by inception, its effect takes place entirely between the ad and an individual viewer; the ad doesn't need to be conspicuous at all. On the other hand, for an ad to work by cultural imprinting, it needs to be placed in a conspicuous location, where viewers will see it and know that others are seeing it too.
  • The ultimate weapon against GamerGate time-wasters: a 1960s chat bot that wastes their time
    Alan Turing proposed that an artificial intelligence qualified as capable of thought if a human subject, in conversation with it and another human, cannot tell them apart; the strange thing about the Eliza Twitter bot is it doesn't come across as any more like a machine than those who keep repeating their points over and over and over, ad nauseam. It's difficult to decide who's failed the Turing test here.
  • Gabriel Knight’s Creator Releases Incredible 20th Anniversary Remake
    Staring at the remake version brings all those old memories of DOS mouse drivers and command prompts flooding back. Gazing at protagonist Gabriel Knight’s dazzling, polychromatic bookstore (your base of operations in New Orleans as the game begins) is like seeing the mental interpolation your brain made of the original pixelated wash beautifully, if weirdly, reified.
  • Bridge Troll
    I know this sounds a bit crazy, but trust me, there’s a troll up there! He or she, it’s tough to tell the gender of trolls, is approximately two feet tall, made of steel, and perched atop the southern end of the transverse concrete beam where the eastern cable makes contact with the road deck. The troll cannot be seen by car or from the bike path next to the bridge—you need to be underneath the bridge, on a boat to actually see the bridge troll.
  • Don’t Mourn the Passing of the New York Times Chess Column
    If those who know enough about the game to understand the diagrams in a newspaper chess column can access thousands of times more information, free and instantly, than a weekly column could possibly provide, then why run one at all? The answer is that most weekly newspaper chess columns don’t need to exist and won’t in the near future. The one exception: when there’s an excellent writer and chess professional at the helm, someone like Robert Byrne.
  • Serbia vs. Albania in Belgrade brings their troubled history to the fore
    But even if football takes the headlines, there is still the sense that Tuesday night might be an opportunity missed. On October 22, Albanian Prime Minister Edi Rama will visit Belgrade to discuss bilateral relations with his Serbian counterpart, Aleksandar Vukic. No Albanian leader has visited Belgrade since Enver Hoxha in 1946.

    It is significant, and maybe it brings a glimmer of hope that a repeat of Tuesday's fixture might one day be all about the game instead. Having a harmonious football match to oil the conversation would have done little harm, but the anticipation of that noticeable absence inside Partizan Stadium stands as a reminder that sport does not always have the power to untangle wider complexities.

  • In Transition
    We picked 10 of the most progressive skaters to choose one location each and film a full part.
  • Things I Won't Work With: Dioxygen Difluoride
    The paper goes on to react FOOF with everything else you wouldn't react it with: ammonia ("vigorous", this at 100K), water ice (explosion, natch), chlorine ("violent explosion", so he added it more slowly the second time), red phosphorus (not good), bromine fluoride, chlorine trifluoride (say what?), perchloryl fluoride (!), tetrafluorohydrazine (how on Earth. . .), and on, and on. If the paper weren't laid out in complete grammatical sentences and published in JACS, you'd swear it was the work of a violent lunatic. I ran out of vulgar expletives after the second page. A. G. Streng, folks, absolutely takes the corrosive exploding cake, and I have to tip my asbestos-lined titanium hat to him.
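
Going back to the harvest-and-yield item at the top of this list: the distinction between "long-term probability of response" and "probability of a response conditioned on there being a failure" is easier for me to see with numbers. The little Python calculation below uses entirely made-up figures (request rate, hours of failure, and fraction answered during failure are all my assumptions), and the function name is mine; it is only meant to show how far apart the two numbers can be.

```python
def fraction_answered(answered: int, offered: int) -> float:
    """Fraction of offered requests that received a response."""
    return answered / offered


# Hypothetical service: all numbers below are invented for illustration.
HOURS_PER_YEAR = 24 * 365          # 8760
REQUESTS_PER_HOUR = 1000
FAILURE_HOURS = 10                 # hours per year spent in some failure
ANSWERED_PER_FAILURE_HOUR = 600    # requests still answered in such an hour

healthy_hours = HOURS_PER_YEAR - FAILURE_HOURS

# Long-term view: every request over the whole year counts.
long_term_yield = fraction_answered(
    healthy_hours * REQUESTS_PER_HOUR
    + FAILURE_HOURS * ANSWERED_PER_FAILURE_HOUR,
    HOURS_PER_YEAR * REQUESTS_PER_HOUR,
)

# Conditional view: only requests made while a failure is in progress count.
yield_given_failure = fraction_answered(
    FAILURE_HOURS * ANSWERED_PER_FAILURE_HOUR,
    FAILURE_HOURS * REQUESTS_PER_HOUR,
)

print(f"long-term yield:     {long_term_yield:.6f}")
print(f"yield given failure: {yield_given_failure:.2f}")
```

With these invented numbers the service looks like "three nines and change" over the year, while still dropping 40% of requests whenever something is actually broken, which is, I think, the gap the quoted passage is pointing at.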

Sunday, October 12, 2014

Link Clearance

Man it was hot today. Doesn't fall mean that it starts to cool down?

  • The Physics of Doing an Ollie on a Skateboard, or, the Science of Why I Can’t Skate
    So here’s a thought – maybe I can use physics to learn how to do an ollie. Here’s the plan. I’m going to open up the above video of skateboarder Adam Shomsky doing an ollie, filmed in glorious 1000 frames-per-second slow motion, and analyze it in the open source physics video analysis tool Tracker.
  • 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI '14).
    As part of our commitment to open access, the proceedings from the Symposium are now free and openly accessible via the technical sessions Web page.
  • The Horror of a 'Secure Golden Key'
    A “golden key” is just another, more pleasant, word for a backdoor—something that allows people access to your data without going through you directly. This backdoor would, by design, allow Apple and Google to view your password-protected files if they received a subpoena or some other government directive. You'd pick your own password for when you needed your data, but the companies would also get one, of their choosing. With it, they could open any of your docs: your photos, your messages, your diary, whatever.
  • Malware needs to know if it's in the Matrix
    In a presentation, UCSB professor Giovanni Vigna (who runs the Center for CyberSecurity and Seclab) says he's seeing more and more malware that keeps its head down on new infection sites, cautiously probing the operating system to try and determine if it's running on a real computer or if it's a head in a jar, deploying all kinds of tricks to get there.
  • 44 engineering management lessons
    30. Most conflict happens because people don’t feel heard. Sit down with each person and ask them how they feel. Listen carefully. Then ask again. And again. Then summarize what they said back to them. Most of the time that will solve the problem.
  • Unlocked 10Gbps TX wirespeed smallest packet single core
    The single core 14.8Mpps performance number is an artificial benchmark performed with pktgen, which besides spinning the same packet (skb), now also notifies the NIC hardware after populating its TX ring buffer with a "burst" of packets.
  • Redis cluster, no longer vaporware.
    The consistency model is the famous “eventual consistency” model. Basically if nodes get desynchronized because of partitions, it is guaranteed that when the partition heals, all the nodes serving a given key will agree about its value.

    However the merge strategy is “last failover wins”, so writes received during network partitions can be lost. A common example is what happens if a master is partitioned into a minority partition with clients trying to write to it. If when the partition heals, in the majority side of the partition a slave was promoted to replace this master, the writes received by the old master are lost.

  • Using Git Hunks
    Many of the git subcommands can be passed --patch or -p for short. When used with git add, we can compose a commit with exactly the changes we want, instead of just adding whole files. Once you hit enter, you get an interactive prompt where you're presented with a diff and a set of options.
  • Slasher Ghost, and Other Developments in Proof of Stake
    The fundamental problem that consensus protocols try to solve is that of creating a mechanism for growing a blockchain over time in a decentralized way that cannot easily be subverted by attackers. If a blockchain does not use a consensus protocol to regulate block creation, and simply allows anyone to add a block at any time, then an attacker or botnet with very many IP addresses could flood the network with blocks, and particularly they can use their power to perform double-spend attacks – sending a payment for a product, waiting for the payment to be confirmed in the blockchain, and then starting their own “fork” of the blockchain, substituting the payment that they made earlier with a payment to a different account controlled by themselves, and growing it longer than the original so everyone accepts this new blockchain without the payment as truth.
  • Economies of Scale in Peer-to-Peer Networks
    I've been working on P2P technology for more than 16 years, and although I believe it can be very useful in some specific cases, I'm far less enthusiastic about its potential to take over the Internet.

    Below the fold I look at some of the fundamental problems standing in the way of a P2P revolution, and in particular at the issue of economies of scale.

  • A Scalability Roadmap
    You might be surprised that old blocks aren’t needed to validate new transactions. Pieter Wuille re-architected Bitcoin Core a few releases ago so that all of the data needed to validate transactions is kept in a “UTXO” (unspent transaction output) database. The amount of historical data needed that absolutely must be stored depends on the plausible depth of a blockchain reorganization. The longest reorganization ever experienced on the main network was 24 blocks during the infamous March 11, 2013 chain fork.
  • Why the Trolls Will Always Win
    But here’s the key: it turned out he wasn’t outraged about my work. His rage was because, in his mind, my work didn’t deserve the attention. Spoiler alert: “deserve” and “attention” are at the heart.
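
One more note on the Redis Cluster item above, because "last failover wins" took me a minute to picture. Below is a toy Python model of the scenario the quote describes: a master stuck in a minority partition keeps accepting writes, a replica is promoted on the majority side, and when the partition heals the old master's writes are gone. This is only my own sketch of the idea; the `Node` class, the `failover` and `heal` functions, and the single integer epoch are simplifications I made up, not how Redis is implemented.

```python
class Node:
    """A toy cluster node holding one copy of the dataset."""

    def __init__(self, name: str, role: str):
        self.name = name
        self.role = role      # "master" or "replica"
        self.epoch = 0        # bumped on each failover (simplified)
        self.data = {}


def failover(replica: Node, new_epoch: int) -> None:
    """The majority side promotes a replica to master."""
    replica.role = "master"
    replica.epoch = new_epoch


def heal(a: Node, b: Node) -> Node:
    """Partition heals: the node with the newer epoch wins outright.

    The loser becomes a replica and adopts the winner's dataset, so
    anything only the loser had accepted is discarded.
    """
    winner, loser = (a, b) if a.epoch > b.epoch else (b, a)
    loser.role = "replica"
    loser.epoch = winner.epoch
    loser.data = dict(winner.data)
    return winner


def simulate():
    old_master = Node("A", "master")
    replica = Node("B", "replica")

    # Normal operation: a write is replicated to both nodes.
    old_master.data["x"] = 1
    replica.data["x"] = 1

    # Partition: A is isolated in the minority with a client still writing.
    old_master.data["x"] = 2
    old_master.data["y"] = 3

    # Meanwhile the majority side promotes B, which takes new writes.
    failover(replica, new_epoch=1)
    replica.data["z"] = 4

    winner = heal(old_master, replica)
    return old_master, replica, winner


old_master, replica, winner = simulate()
print(winner.name, old_master.data)
```

After the heal, both nodes agree (that's the "eventual consistency" part), but the writes to `x` and `y` that node A accepted during the partition have vanished.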

Academic research on VCS approaches

I've been spending my time recently reading some interesting academic research papers regarding the different workflows and behaviors that arise in DVCS systems vs CVCS systems, and thought I'd share some links.

I'm not tremendously impressed with the level of sophistication of academic research into VCS functionality, but it does seem to be slowly improving and these recent papers have some interesting observations.

The best of the bunch, I think, are the papers from Christian Bird of Microsoft, which is perhaps no surprise because the industrial side of Microsoft has been doing some of the best commercial work in VCS systems recently, and Microsoft certainly has experience dealing with the issues that matter to software developers.

  • Work Practices and Challenges in Pull-Based Development: The Integrator’s Perspective
    In the pull-based development model, the integrator has the crucial role of managing and integrating contributions. This work focuses on the role of the integrator and investigates working habits and challenges alike. We set up an exploratory qualitative study involving a large-scale survey involving 749 integrators, to which we add quantitative data from the integrator’s project. Our results provide insights into the factors they consider in their decision making process to accept or reject a contribution.
  • Will My Patch Make It? And How Fast? : Case Study on the Linux Kernel
    The Linux kernel follows an extremely distributed reviewing and integration process supported by 130 developer mailing lists and a hierarchy of dozens of Git repositories for version control. Since not every patch can make it and of those that do, some patches require a lot more reviewing and integration effort than others, developers, reviewers and integrators need support for estimating which patches are worthwhile to spend effort on and which ones do not stand a chance.
  • Social Coding in GitHub: Transparency and Collaboration in an Open Software Repository
    Based on a series of in-depth interviews with central and peripheral GitHub users, we examined the value of transparency for large-scale distributed collaborations and communities of practice. We find that people make a surprisingly rich set of social inferences from the networked activity information in GitHub, such as inferring someone else’s technical goals and vision when they edit code, or guessing which of several similar projects has the best chance of thriving in the long term.
  • How Do Centralized and Distributed Version Control Systems Impact Software Changes?
    In this paper we present the first in-depth, large scale empirical study that looks at the influence of DVCS on the practice of splitting, grouping, and committing changes. We recruited 820 participants for a survey that sheds light into the practice of using DVCS.
  • Cohesive and Isolated Development with Branches
    The adoption of distributed version control (DVC), such as Git and Mercurial, in open-source software (OSS) projects has been explosive. Why is this and how are projects using DVC? This new generation of version control supports two important new features: distributed repositories and histories that preserve branches and merges. Through interviews with lead developers in OSS projects and a quantitative analysis of mined data from the histories of sixty projects, we find that the vast majority of the projects now using DVC continue to use a centralized model of code sharing, while using branching much more extensively than before their transition to DVC.
  • Expectations, Outcomes, and Challenges Of Modern Code Review
    We empirically explore the motivations, challenges, and outcomes of tool-based code reviews. We observed, interviewed, and surveyed developers and managers and manually classified hundreds of review comments across diverse teams at Microsoft. Our study reveals that while finding defects remains the main motivation for review, reviews are less about defects than expected and instead provide additional benefits such as knowledge transfer, increased team awareness, and creation of alternative solutions to problems.
  • Collaboration in Software Engineering: A Roadmap
    Software engineering projects are inherently cooperative, requiring many software engineers to coordinate their efforts to produce a large software system. Integral to this effort is developing shared understanding surrounding multiple artifacts, each artifact embodying its own model, over the entire development process.
  • Is It Dangerous to Use Version Control Histories to Study Source Code Evolution?
    This allows us to answer: How much code evolution data is not stored in VCS? How much do developers intersperse refactorings and edits in the same commit? How frequently do developers fix failing tests by changing the test itself? How many changes are committed to VCS without being tested? What is the temporal and spacial locality of changes?
  • The Secret Life of Patches: A Firefox Case Study
    In this paper, we study the patch lifecycle of the Mozilla Firefox project. The model of a patch lifecycle was extracted from both the qualitative evidence of the individual processes (interviews and discussions with developers), and the quantitative assessment of the Mozilla process and practice. We contrast the lifecycle of a patch in pre- and post-rapid release development.
  • Towards a taxonomy of software change
    Previous taxonomies of software change have focused on the purpose of the change (i.e. the why) rather than the underlying mechanisms. This paper proposes a taxonomy of software change based on characterizing the mechanisms of change and the factors that influence these mechanisms.

Saturday, October 11, 2014

The Age of Miracles: a very short review

A traveling friend, passing through, gave us her copy of Karen Thompson Walker's The Age of Miracles: A Novel when she left.

The book is narrated by Julia, an 11-year-old girl in sixth grade.

That's a rough time for a child: social issues; puberty and adolescence; the realization that you're not a child anymore.

When you are 11 years old, and in sixth grade, it seems like everything is changing; it seems like the world as you know it is ending.

But what if, in fact, everything is changing?

And the world as you know it is ending?

I enjoyed Walker's book and zipped right through it, and am now giving it to someone else to enjoy.

Thursday, October 9, 2014

Derby development community lunch, 2014

It was a blast to make a short trip the other day to have lunch with a number of the leading Derby developers.

We're nearing the 20th anniversary of the Derby software; I believe the earliest code (it was then known as Cloudscape) was written sometime in 1996.

Here's a fun picture I took with the Derby developers who were able to attend:

Tuesday, October 7, 2014

Linearizable Boogaloo

This might be the nerdiest video I've ever watched: Jepsen II: Linearizable Boogaloo (although that's a high bar; I watch some very nerdy videos...).

Kyle Kingsbury is known online as "Aphyr", which is also the name of his superb website.

I got hooked on Kingsbury's writing nearly 18 months ago, when he published a superb essay with a rather tongue-in-cheek title: Call me maybe: Carly Rae Jepsen and the perils of network partitions

I don't mind the cheekiness, for Kingsbury's analysis and, even more importantly, his descriptive skills are first-rate.

And a certain amount of snark helps the medicine go down, to mis-quote P.L. Travers.

The bottom line is: if you consider yourself to be a server software engineer, or a system programmer, you will find Kingsbury's work both entertaining and educational.

Read his essays; watch his videos; get smarter.

Sunday, October 5, 2014

Derby NetworkServer SocketPermission

Last winter, there was a fairly large and complex Java update: Java™ SE Development Kit 7, Update 51 (JDK 7u51).

There's lots to read in that announcement, but this part particularly affects users of the Derby Network Server:

Change in Default Socket Permissions

The default socket permissions assigned to all code including untrusted code have been changed in this release. Previously, all code was able to bind any socket type to any port number greater than or equal to 1024. It is still possible to bind sockets to the ephemeral port range on each system. The exact range of ephemeral ports varies from one operating system to another, but it is typically in the high range (such as from 49152 to 65535). The new restriction is that binding sockets outside of the ephemeral range now requires an explicit permission in the system security policy.

Most applications using client tcp sockets and a security manager will not see any problem, as these typically bind to ephemeral ports anyway. Applications using datagram sockets or server tcp sockets (and a security manager) may encounter security exceptions where none were seen before. If this occurs, users should review whether the port number being requested is expected, and if this is the case, a socket permission grant can be added to the local security policy, to resolve the issue.

See 8011786 (not public).

For users of Derby, this causes the symptoms described by DERBY-6438.

There is an (Oracle, and picked up by IBM) JVM security change that requests or suggests removal or limitation of the 'range of ports' on which JVMs by default grant the "listen" permission. I cannot find details about this JVM change, but as a result of it, users that have (unknowingly) relied on this in the past will now have to modify their policy files, or Network Server will no longer work.

Happily, it's not terribly hard to modify your Java security policy to allow Derby to run again: Unable to start derby database from Netbeans 7.4

Because java.policy is a Unix-style, read-only file, I opened and edited it with Notepad++ run as administrator (under the same Java home):

C:\Program Files\Java\jdk1.7.0_51\jre\lib\security\java.policy
Add only these lines into the file after the first grant:

grant {
    permission java.net.SocketPermission "localhost:1527", "listen";
};
Save the file, which is a little tricky because of the file permissions. But if you run Notepad++ or any other editor as administrator, you can solve the problem.

In older versions of Java, that security file tended to read something like:


 // allows anyone to listen on un-privileged ports
 permission java.net.SocketPermission "localhost:1024-", "listen";

And that's why Derby used to run successfully with those older Java versions.
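If you want to see why the old wildcard entry covered Derby's default port while a narrower grant may not, here is a minimal sketch using the standard java.net.SocketPermission class and its implies() method. The class name ListenPermissionCheck and the helper covers() are hypothetical names made up for this illustration; they are not part of Derby or the JDK. The sketch only compares permission objects; it does not read your actual java.policy file or install a security manager.

```java
import java.net.SocketPermission;

public class ListenPermissionCheck {

    // Hypothetical helper: returns true if a policy grant with the given
    // target (host:portrange) would cover a "listen" request on the
    // requested target.
    static boolean covers(String grantedTarget, String requestedTarget) {
        SocketPermission granted = new SocketPermission(grantedTarget, "listen");
        SocketPermission requested = new SocketPermission(requestedTarget, "listen");
        return granted.implies(requested);
    }

    public static void main(String[] args) {
        // 1527 is the port used in the policy grant quoted above.
        String derby = "localhost:1527";

        // Old-style default entry: every port from 1024 upward.
        System.out.println("localhost:1024- covers 1527: "
                + covers("localhost:1024-", derby));

        // The explicit grant suggested above.
        System.out.println("localhost:1527 covers 1527: "
                + covers("localhost:1527", derby));

        // A grant for one port does not extend to its neighbor.
        System.out.println("localhost:1527 covers 1528: "
                + covers("localhost:1527", "localhost:1528"));
    }
}
```

If you run Network Server on a port other than 1527, the grant in java.policy needs to name that port instead.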

Saturday, October 4, 2014

Summertime reading

Boy, I'm all over the place recently.

Must be the 90 degree temperatures.

Or all the candy corn I've been eating...

  • Play the Backstay
    Simply put, the backstay can bend the mast when tightened and straighten the mast when loose. Looking specifically at the mainsail, this can affect the fullness or draft as well as the twist of the mainsail.
  • PCC: Re-architecting Congestion Control for Consistent High Performance
    The design rationale behind TCP’s hardwired mapping is to make assumptions about the packet-level events. When seeing a packet-level event, TCP assumes the network is in a certain state and thus tries to optimize the performance by triggering a predefined control behavior as response to that assumed network state. However, in real network, the observed packet-level events are often not a result of the assumed network condition. When this assumed link breaks, TCP still mechanically carries out the mismatched control response and severely degraded performance follows.
  • That's not an unreasonable approach.
    When I came up with the original approach to congestion control in TCP 30 years ago (see Internet RFCs 896 and 970), TCP behavior under load was so awful that connections would stall out and fail. The ad-hoc solutions we put in back then at least made TCP well behaved under load.
  • Why is 0x00400000 the default base address for an executable?
    In order to make context switching fast, the Windows 3.1 virtual machine manager "rounds up" the per-VM context to 4MB. It does this so that a memory context switch can be performed by simply updating a single 32-bit value in the page directory.
  • 'Bloodletting' at Downtown Project with Massive Layoffs
    For a time, it has seemed as if the growth would never stop in Downtown Las Vegas: land purchases, new businesses and development at every turn, most of it driven by Downtown Project, the redevelopment group funded by a $350 million investment from Zappos CEO Tony Hsieh.
  • Founder Suicides
    And yesterday, the suicide article - The Downtown Project Suicides: Can the Pursuit of Happiness Kill You? - appeared. It’s a rough one that talks about three suicides – Jody Sherman (4/13), Ovik Banerjee (1/14), and Matt Berman (4/14) – all people involved in the Vegas Tech phenomenon.
  • A State of Xen - Chaos Monkey & Cassandra
    “When we got the news about the emergency EC2 reboots, our jaws dropped. When we got the list of how many Cassandra nodes would be affected, I felt ill. Then I remembered all the Chaos Monkey exercises we’ve gone through. My reaction was, “Bring it on!”.” - Christos Kalantzis - Engineering Manager, Cloud Database Engineering
  • Apache Derby 10.11.1.1 released
    The Apache Derby project is pleased to announce feature release 10.11.1.1.
  • After raising $50M, Reddit forces all remote workers to relocate to SF
    Wong chimed in on Twitter with confirmation of the new employee policy, which he said was decided independent of the new investment. “Intention is to get whole team under one roof for optimal teamwork. Our goal is to retain 100 percent of the team,” he said.
  • reddit
    it’s always bothered me that users create so much of the value of sites like reddit but don’t own any of it. So, the Series B Investors are giving 10% of our shares in this round to the people in the reddit community, and I hope we increase community ownership over time. We have some creative thoughts about the mechanics of this, but it’ll take us awhile to sort through all the issues. If it works as we hope, it’s going to be really cool and hopefully a new way to think about community ownership.
  • Before the Startup
    Startups are very counterintuitive. I'm not sure why. Maybe it's just because knowledge about them hasn't permeated our culture yet. But whatever the reason, starting a startup is a task where you can't always trust your instincts.
  • HFT In My Backyard: The Office
    The Wavre tower is the third tallest structure in Belgium. Once again, I parked my car away from the tower and walked through the fields. Some techs were working at the top of the 250 meter tower, but they looked tiny from my vantage point on the ground. Unlike the Houtem tower, which needs guy wires to remain erect, the Wavre tower is a beautiful “standing structure”
  • A technological solution to best execution and excessive market complexity
    We believe there is a way for regulation to be simplified and made more powerful at the same time. Trade publication standards can be created to support improved customer choice and to simplify and strengthen the market place. This would replace the need for more complex and costly regulation. Our suggestions apply to both the USA and to Europe but this paper will concentrate on the unique market structure of the U.S. equity markets after Regulation NMS (Reg NMS).
  • How does SQLite work? Part 1: pages!
    Modifying open source programs to print out debug information to understand their internals better: SO FUN.
  • Peter Thiel's Zero to One Might Be the Best Business Book I've Read
    Thiel, a founder of PayPal and the data analytics firm Palantir, might be best known for his idiosyncrasies, which helped inspire the character of Peter Gregory in the HBO series Silicon Valley. Indeed, the recipients of Thiel's donations seem torn from the pages of a Philip K. Dick novel: an anti-aging biotech firm, an organization dedicated to building ocean communities underwater, and a foundation that pays teenagers to drop out of college and start new companies. Say what you want about the Thielian future of cyborg teenagers living for 200 years in pressurized cabins under the Caribbean; this is not a man to be faulted for thinking too small.
  • The NSA And Me
    More than three decades later, the NSA, like a mom-and-pop operation that has exploded into a global industry, now employs sweeping powers of surveillance that Frank Church could scarcely have imagined in the days of wired phones and clunky typewriters. At the same time, the Senate intelligence committee he once chaired has done an about face, protecting the agencies from the public rather than the public from the agencies.
  • “Not” Neutrality?
    the reason the interconnect utilization between Level 3 and LEC1 and LEC3 improved is that these LECs forced Netflix to pay them to interconnect directly with them. And as Netflix CEO Reed Hastings has pointed out several times, Netflix didn’t do that because they were taking advantage of a highly competitive Internet marketplace. They did it because they had no choice: all third-party content that LEC broadband users want to see eventually has to go through LEC interconnection points. When the LEC tries to turn these interconnection points into Internet tollbooths there is no alternate path for the content to take to reach the consumers.
  • Thirteen Ways of Looking at Greg Maddux: A World Series Requiem
    I can't think of Jason today, or of the days leading up to his death, without thinking of Greg Maddux. And I can't think of Maddux without thinking of Verducci. With Maddux's Hall of Fame induction this summer, after having not opened the trunk in years, I cracked it open, brushed the dust off Verducci's article, and found the movie of my memory experiencing technical difficulties.

Thursday, October 2, 2014

There's just not enough time

I need some rainy days; when it's sunny and beautiful outside I find excuses not to read ...

  • Visual Explanations - Tufte's best book
    Printed with love, including pages with pasted in cutouts, this timeless book will never go out of date, and is likely to be passed on to future generations.
  • More on Facebook's "Cold Storage"
    It is this careful, organized scheduling of the system's activities at data center scale that enables the synergistic cost reductions of cheap power and space. It is, or at least may be, true that the Blu-ray disks have a 50-year lifetime but this isn't what matters. No-one expects the racks to sit in the data center for 50 years, at some point before then they will be obsoleted by some unknown new, much denser and more power-efficient cold storage medium (perhaps DNA).
  • Inside the New York Fed: Secret Recordings and a Culture Clash
    Segarra ultimately recorded about 46 hours of meetings and conversations with her colleagues. Many of these events document key moments leading to her firing. But against the backdrop of the Beim report, they also offer an intimate study of the New York Fed's culture at a pivotal moment in its effort to become a more forceful financial supervisor.
  • Microsoft Closes SVC
    Microsoft is being pressed by the shift from PC’s as the main platform—where they had an almost monopoly on the OS—to a place where there are many players in the mobile world. This means that they are less able to support research of an open kind.
  • A Perspective on Computing Research Management
    When I came to Microsoft in 2001, I had the opportunity to apply them in building the Silicon Valley research lab, although many of the same principles had already characterized Microsoft Research since its founding in 1991.
  • Loyalty Nearly Killed My Beehive
    Then, this past spring, disaster struck. The queen wasn’t laying fertilized eggs, and if I didn’t act quickly, the hive would be dead by the end of summer. Thus began a months-long struggle that I only later realized was really about loyalty: mine to the hive, and the hive’s to its queen.
  • The Mysteries of BCL Time Zone Data
    In other words, it’s always the time of day that would have occurred locally if there wasn’t a transition – in IANA time zone language, this is a “wall mode” transition, as it tells you the time you’d see on a wall clock exactly when you need to adjust it.
  • Fun (?) with GnuPG
    If the holder of the key does not do anything, the key becomes expired, and the signatures in the signed tags stop validating. Luckily, the validity of a key can be extended by the holder of the key, and once it is done, the signatures made before the key's original expiration date will continue to validate fine.
  • Eight Epic Failures of Regulating Cryptography
    If this sounds familiar, it's because regulating encryption was a monstrous proposal officially declared dead in 2001 after threatening Americans' privacy, free speech rights, and innovation for nearly a decade. But like a zombie, it's now rising from the grave, bringing the same disastrous flaws with it.
  • Report offers ideas for a Boston beset by rising seas
    A report scheduled to be released Tuesday about preparing Boston for climate change suggests that building canals through the Back Bay neighborhood would help it withstand water levels that could rise as much as 7 feet by 2100. Some roads and public alleys, such as Clarendon Street, could be turned into narrow waterways, the report suggests, allowing the neighborhood to absorb the rising sea with clever engineering projects that double as public amenities.
  • Making better use of dice in games
    In 2005, Queen Games published Roma from German game designer Stefan Feld. In this two-player game, players assign actions to die faces, and can only activate those actions by spending a die of the matching value.

    Since then, Feld has embarked on a personal crusade to make dice more interesting.

    “I really like dice,” Feld said, but he wanted players to have control of the game. He didn’t want them to win or lose based on simple luck. In most classic games like Monopoly and Risk, that’s exactly what can happen, and often does.

Wednesday, October 1, 2014

This is what the Internet was made for

Really, if you don't thoroughly enjoy Derek Low's wonderful photo-essay: What It's like to Fly the $23,000 Singapore Airlines Suites Class, you're probably just not Right For The Internet.