Monday, September 29, 2014

A fairly random collection of links on Merkle Trees

Just sitting around, hanging out, musing about Merkle Trees...

  • Merkle tree
    Hash trees can be used to verify any kind of data stored, handled and transferred in and between computers. Currently the main use of hash trees is to make sure that data blocks received from other peers in a peer-to-peer network are received undamaged and unaltered, and even to check that the other peers do not lie and send fake blocks.
  • A Certified Digital Signature
    The method is called tree authentication because the computation of H(1,n,Y) forms a binary tree of recursive calls. Authenticating a particular leaf Y(i) in the tree requires only those values of H() starting from the leaf and progressing to the root, i.e., from H(i,i,Y) to H(1,n,Y).
  • Recent Improvements in the Efficient Use of Merkle Trees: Additional Options for the Long Term
    Fractal Merkle Tree Representation and Traversal, shows how to modify Merkle’s scheduling algorithm to achieve a space-time trade-off. This paper was presented at the Cryptographer’s Track, RSA Conference 2003 (May 2003). This construction roughly speeds up the signing operation inherent in Merkle’s algorithm by an arbitrary factor of T, (less than H), at a cost of requiring more space: (2^T times the space).
  • Merkle Signature Schemes, Merkle Trees and Their Cryptanalysis
    The big advantage of the Merkle Signature Scheme is, that the security does not rely on the difficulty of any mathematic problem. The security of the Merkle Signature Scheme depends on the availability of a secure hash function and a secure one-time digital signature. Even if a one-time signature or a hash function becomes insecure, it can be easily exchanged. This makes it very likely that the Merkle Signature Scheme stays secure even if the conventional signature schemes become insecure.
  • Caches and Merkle Trees for Efficient Memory Authentication
    Our work addresses the issues in implementing hash tree machinery in hardware and integrating this machinery with an on-chip cache to reduce the log N memory bandwidth overhead.
  • Protocol specification
    Merkle trees are binary trees of hashes. Merkle trees in bitcoin use a double SHA-256, the SHA-256 hash of the SHA-256 hash of something.

    If, when forming a row in the tree (other than the root of the tree), it would have an odd number of elements, the final double-hash is duplicated to ensure that the row has an even number of hashes.

    First form the bottom row of the tree with the ordered double-SHA-256 hashes of the byte streams of the transactions in the block.

    Then the row above it consists of half that number of hashes. Each entry is the double-SHA-256 of the 64-byte concatenation of the corresponding two hashes below it in the tree.

    This procedure repeats recursively until we reach a row consisting of just a single double-hash. This is the Merkle root of the tree.

  • Amazon's Dynamo
    Merkle trees help in reducing the amount of data that needs to be transferred while checking for inconsistencies among replicas. For instance, if the hash values of the root of two trees are equal, then the values of the leaf nodes in the tree are equal and the nodes require no synchronization. If not, it implies that the values of some replicas are different. In such cases, the nodes may exchange the hash values of children and the process continues until it reaches the leaves of the trees, at which point the hosts can identify the keys that are “out of sync”. Merkle trees minimize the amount of data that needs to be transferred for synchronization and reduce the number of disk reads performed during the anti-entropy process.
  • Cassandra: Using Merkle trees to detect inconsistencies in data
    A repair coordinator node requests Merkle tree from each replica for a specific token range to compare them. Each replica builds a Merkle tree by scanning the data stored locally in the requested token range. The repair coordinator node compares the Merkle trees and finds all the sub token ranges that differ between the replicas and repairs data in those ranges.
  • The Dangers of Rebasing A Branch
    Personally, having studied Merkle Trees and discussed a possible use-case for using git/Merkle Trees as a caching solution, I view git as a entirely immutable structure of your code. Rebases break this immutability of commits.
  • Sigh. "grow-only", "rebase is dangerous", "detached head state is dangerous". STOP. Stop it now.
    git is a bag of commits organized into a tree (a tree of Merkle hash chains). Branches and tags are symbolic names for these. Think of it this way and there's no danger.

    ...

    I didn't need a local branch crutch to find my way around because I know the model: a tree of commits.

    Understanding the model is the key.

    There are other VCSes that also use Merkle hash trees. Internally they have the power that git has.

  • Google's end-to-end key distribution proposal
    Smells like a mixed blockchain/git type approach - which is a good thing. The "super-compressed" version of the log tip sounds like git revision hash. The append-only, globally distributed log is pretty much like a blockchain.
  • Ticking time bomb
    Given only one verified hash in such a system, no part of the data, nor its history of mutation can be forged. "History" can mean which software runs on your computer (TPM), which transactions are valid (Bitcoin), or which commits have been done in a SCM (git, mercurial).

    So git is not magical, it is just a practical implementation of something that works. Any other *general* solution will be based on similar basic principles. Mercurial does this and there is a GPG extension for it.

Wow, that's a lot.

I shall have to find more time to read...

Saturday, September 27, 2014

What I'm reading, Aurora ATC edition

Amazingly, some of our relatives in Illinois, even those living in Downers Grove, hadn't heard about the events in Aurora. I think it's one of those bizarre events that affected a handful of people very significantly, some of those people many thousands of miles away from the incident, but didn't affect others even one whit.

Anyway...

  • FAA contractor charged with fire that halted Chicago flights
    "This is a nightmare scenario when we thought systems were in place to prevent it," said aviation analyst Joseph Schwieterman of DePaul University in Chicago. "Technology is advancing so fast that ... there's less of a need for air traffic control to be so geographically oriented. I think the FAA's going to find itself under a microscope."
  • Everything you need to know about the Shellshock Bash bug
    The risk centres around the ability to arbitrarily define environment variables within a Bash shell which specify a function definition. The trouble begins when Bash continues to process shell commands after the function definition resulting in what we’d classify as a “code injection attack”.
  • MySQL 5.7.5: GROUP BY respects functional dependencies!
    Most RDBMS-es implement the SQL92 behavior, and generate an error in case a non-aggregate expression appears in the SELECT-list but not the GROUP BY-clause. Because MySQL would not generate an error at all and instead would simply allow such queries while silently producing a non-deterministic result for such expressions, many users got bitten and frustrated.
  • Scaling NoSQL databases: 5 tips for increasing performance
    Virtualization can be great. It provides the flexibility to use a single machine for multiple purposes, with reasonable security for non-critical business data. Unfortunately, it also influences memory access speed, which is critical for some NoSQL databases. Depending on the hypervisor and the underlying hardware capabilities, it can add 20-200% penalty in accessing memory for NoSQL workloads. I have witnessed this in testing, and it is also documented by a number of industry benchmarks. Newer generation hardware helps with better support for hardware assisted memory management unit (MMU), but there is still a significant impact for workloads that generate a lot of Translation Lookaside Buffer (TLB) misses (as NoSQL databases are wont to do).
  • Why Scrum Should Basically Just Die In A Fire
    Of course, musing, considering, mulling things over, and coming to realizations all constitute a significant amount of the actual work in programming. It is impossible to track whether these realizations occur in the office or in the shower. Anecdotally, it's usually the shower. Story points, meanwhile, are completely made-up numbers designed to capture off-the-cuff estimates of relative difficulty. Developers are explicitly encouraged to think of story points as non-binding numbers, yet velocity turns those non-binding estimates into a number they can be held accountable for, and which managers often treat as a synonym for productivity. "Agile" software exists to track velocity, as if it were a meaningful metric, and to compare the relative velocity of different teams within the same organization.

    This is an actual thing which sober adults do, on purpose, for a living.

    "Velocity" is really too stupid to examine in much further detail, because it very obviously disregards this whole notion of "working software as a measure of progress" in favor of completely unreliable numbers based on almost nothing.

    ...

    Sacrificing "working software as a measure of progress" to meaningless numbers that your MBAs can track for no good reason is a pretty serious flaw in Scrum. It implies that Scrum's loyalty is not to the Agile Manifesto, nor to working software, nor high-quality software, nor even the success of the overall team or organization. Scrum's loyalty, at least as it pertains to this design decision, is to MBAs who want to point at numbers on a chart, whether those numbers mean anything or not.

  • Relativistic hash tables, part 1: Algorithms
    One might wonder whether the resizing of hash tables is common enough to be worth optimizing. As it turns out, picking the correct size for a hash table is not easy; the kernel has many tables whose size is determined at system initialization time with a combination of heuristics and simple guesswork. But even if the initial guess is perfect, workloads can vary over time. A table that was optimally sized may, after a change, end up too small (and thus perform badly) or too big (wasting memory). Resizing the table would fix these problems, but, since that is hard to do without blocking access to the table, it tends not to happen. The longer-term performance gains are just not seen to be worth the short-term latency caused by shutting down access to the table while it is resized.
  • How to Squeeze a Huge Ship Down a Tiny River
    Everything aligned on Monday, and the six captains set to work in the afternoon. Given the intense concentration needed to do the job, they worked in pairs for 90 minute shifts. One captain steered the bow, the other guided the stern. The unusual maneuvering system helps a ship this big precisely navigate tight turns and narrow squeezes, much like a tiller driver helps the driver of a hook-and-ladder firetruck navigate city streets.

    Spectators lining the river could be forgiven for thinking Quantum was headed upriver, given that it went downriver backward. Using the propellers to pull from the front offers better control than pushing from the back (the same is true for front wheel-drive cars). Tug boats, attached directly, rather than by a cable, to the bow and stern of the ship, provided extra control.

  • File systems, Data Loss and ZFS
    ZFS is able to protect all of these things between the disks and system memory, with each change protected after an IO operation (e.g. a write) returns from the kernel. This protection is enabled by ZFS’ disk format, which places all data into a Merkle tree that stores 256-bit checksums and is changed atomically via a two-stage transaction commit. No other major filesystem provides such guarantees.

    However, that is not to say that ZFS cannot lose data.

  • Another Patent Troll Slain. You Are Now Free To Rotate Your Smartphone.
    Rotatable owned a patent that it claimed covers the screen rotation technology that comes standard in just about every smartphone. You know, when you flip your device sideways and the screen shifts orientation from portrait mode to landscape mode? Like nearly all the apps in the Apple and Android app stores, Rackspace uses standard functionality provided by Apple’s libraries and Android open source software to provide this display feature in our mobile cloud applications.
  • The Dropbox terabyte conundrum
    When you move items to Transporter Library, they are moved off of your device and into the Transporter’s personal cloud. You are offloading files from your system, for hosting elsewhere. Accessing them is slow, of course, because they’re not actually on your device. But they’re also not taking up space on that tiny solid-state drive of yours.

    It’s a clever approach, and one that I hope Dropbox adopts—but I’m a little concerned that Dropbox is so committed to its metaphor that it won’t want to complicate it like this. Allowing direct disk-like access to Dropbox is very different than syncing files, so it might require major changes to Dropbox’s infrastructure. I can see how it might not be a development priority.

  • Liverpool beat Middlesbrough after 30-penalty Capital One Cup shoot-out
    It was the longest penalty shoot-out in the history of the League Cup, the previous record set at 9-8 on three occasions, and more extensive than the FA Cup’s highest total when Macclesfield beat Forest Green 11-10 in 2001. Of major English competitions only the Football League Trophy can equal it, also boasting a 14-13 shoot-out.

In other news, I made Hacker News!

For an essay I wrote last January.

But it was cool nonetheless.

And thanks, Kale, for including me in your newsletter!

Friday, September 26, 2014

Well, that was a bust.

Surely we weren't the only people affected by today's air traffic control snafu.

Our 6 day trip to Michigan won't happen, sadly; we declined the suggestion that we fly to "Baltimore or maybe Saint Louis, and then drive from there."

Instead, we'll plan another trip, perhaps this winter (we're already planning to go for Thanksgiving, too.)

The world can throw you a curveball, you just have to deal with it.

Thursday, September 25, 2014

Rain! Actual rain!

OK, it was only for a few hours.

But it was actual rain!

Wow that felt good.

Sunday, September 21, 2014

What I'm reading, autumnal equinox edition

Confucius says: he who attends all-week company conference while battling nasty head cold sleeps 10 hours / night over next weekend.

At any rate, I'm feeling somewhat better now, and am back to surfing the nets.

  • Guide: Writing Testable Code
    To keep our code at Google in the best possible shape we provided our software engineers with these constant reminders. Now, we are happy to share them with the world.
  • Why DO Computers Fail?
    The obvious question when reading a text this old is just how relevant the information is for the world of computing today. I see no reason to think that the fundamental principles have changed, even though all numbers in the report are way off compared to current technology. For example, in a description of a restart scenario, he cites a time of about 90 minutes from start to a live system. Today, most systems would come up much faster than that. The nature of networking has changed dramatically from 1985 in terms of speed and latencies and robustness. Even so, it is still true that communications links are normally the weakest link in any distributed system.
  • High Performance SSH/SCP - HPN-SSH
    SCP and the underlying SSH2 protocol implementation in OpenSSH is network performance limited by statically defined internal flow control buffers. These buffers often end up acting as a bottleneck for network throughput of SCP, especially on long and high bandwith network links. Modifying the ssh code to allow the buffers to be defined at run time eliminates this bottleneck. We have created a patch that will remove the bottlenecks in OpenSSH and is fully interoperable with other servers and clients. In addition HPN clients will be able to download faster from non HPN servers, and HPN servers will be able to receive uploads faster from non HPN clients. However, the host receiving the data must have a properly tuned TCP/IP stack. Please refer to this tuning page for more information.
  • Memory Management Reference
    This is a resource for programmers and computer scientists interested in memory management and garbage collection.
  • Resource management in Docker
    Docker uses cgroups to group processes running in the container. This allows you to manage the resources of a group of processes, which is very valuable, as you can imagine.
  • What is linux-gate.so.1?
    From time to time this is a cause of befuddlement and frustration for users as they go searching for a non-existent system file. You can confidently tell users on this futile quest that there's not supposed to be a linux-gate.so.1 file present anywhere on the file system; it's a virtual DSO, a shared object exposed by the kernel at a fixed address in every process' memory
  • I’m leaving Mojang
    I was at home with a bad cold a couple of weeks ago when the internet exploded with hate against me over some kind of EULA situation that I had nothing to do with. I was confused. I didn’t understand. I tweeted this in frustration. Later on, I watched the This is Phil Fish video on YouTube and started to realize I didn’t have the connection to my fans I thought I had. I’ve become a symbol. I don’t want to be a symbol, responsible for something huge that I don’t understand, that I don’t want to work on, that keeps coming back to me. I’m not an entrepreneur. I’m not a CEO. I’m a nerdy computer programmer who likes to have opinions on Twitter.
  • Why Free Marketeers Want To Regulate the Internet
    it seems odd for a conservative – whether an old-guard big-business Bush-era conservative or a new-guard Paulite libertarian conservative – to support Net Neutrality.

    Except I do Internet for a living, and I am one of the lucky ones who actually knows what Net Neutrality means and what it’s responding to. And, folks, I’m afraid that, while L. Gordon Crovitz and Rich Lowry are great pundits with a clear understanding of how Washington and the economy work, they don’t seem to understand how the Internet works, which has led them to some wrong conclusions.

  • Eleventh Grade Tech Trends
    So while the anecdotes of a sixteen year old, public high school student, from Los Angeles are not representative, and will not help you find the next Facebook, I do think the practice of pausing, asking questions, and listening to other people is a good one that should be practiced much more.
  • How Gangs Took Over Prisons
    This past summer, however, a 32-year-old academic named David Skarbek published The Social Order of the Underworld, his first book, which is the best attempt in a long while to explain the intricate organizational systems that make the gangs so formidable. His focus is the California prison system, which houses the second-largest inmate population in the country
  • Here Comes Habitat
    The MADE, a few months ago, quietly began working on reviving the game Habitat for preservation. The period we have thus far completed has mostly been focused on gathering human resources, hardware resources, and assessing the extent to which we can preserve the first Massively Multiplayer game. At this point, we are ready to announce that we feel we have a very good chance of bringing Habitat online, in its original form, for play online with Commodore 64 emulators as the client.
  • iOS 8, thoroughly reviewed
    In this review, we'll be talking mostly about features available to all iOS 8 devices, and to hardware that you already have in your hands right now. Several software features in the new operating system are exclusive to the new iPhone 6 and iPhone 6 Plus, and discussion of those features will wait until we review those devices.
  • What every computer programmer should know about floating point, part 1
    There is an existing article called What Every Computer Scientist Should Know About Floating-Point Arithmetic, but it is very math-heavy and focuses on subtle issues that face data scientists and CPU designers. This article ("What every computer programmer should know...) is aimed at the general population of programmers. I'm focusing on simple and practical results that you can use to build your intuition for how to think about floating-point numbers.
  • Dread Pirate Sunk By Leaky CAPTCHA
    The IP address leak we discovered came from the Silk Road user login interface. Upon examining the individual packets of data being sent back from the website, we noticed that the headers of some of the packets reflected a certain IP address not associated with any known Tor node as the source of the packets. This IP address (the “Subject IP Address”) was the only non-Tor source IP address reflected in the traffic we examined
  • NOAA team reveals forgotten ghost ships off Golden Gate
    A team of NOAA researchers today confirmed the discovery just outside San Francisco’s Golden Gate strait of the 1910 shipwreck SS Selja and an unidentified early steam tugboat wreck tagged the “mystery wreck.” The researchers also located the 1863 wreck of the clipper ship Noonday, currently obscured by mud and silt on the ocean floor.

    These and other shipwreck investigations mark the first mission of a two-year project to locate, identify and better understand some of the estimated 300 wrecks in Gulf of the Farallones National Marine Sanctuary, and the adjacent Golden Gate National Recreation Area.

Oh, and yes, the rumors are true: I now use an iPhone.

Monday, September 15, 2014

MRAP-a-tat-tat

What.

Just what.

The world is insane.

  • Bay Area police departments got millions in military surplus, records show
    Asked why Antioch police needed a mine-resistant personnel carrier weighing more than 30,000 pounds, Antioch police Captain Leonard Orman said the vehicle was critical to the department's ability to protect officers during a natural disaster or in incidents that require a SWAT team.

    "It's a defensive vehicle that provides the ability to be protected from gunfire, including high-powered rifles," Orman said. "If someone is barricaded in a home and there is an injured person on the ground, we can use it to rescue the person without exposing ourselves to fire."

    ...

    UC Berkeley's police department used the 1033 program to request about a dozen M-16 rifles, which it said would give officers firepower equal to that held by some of the criminals they encounter, said Lt. Eric Tejada, a department spokesman.

    "We feel that those specialists need to have a rifle that's capable of dealing with some incidents that can involve the modern-day weapons that you see now," Tejada said. "It's smart for us to utilize the resources that you can get for free."

  • Davis acquires mine-resistant war vehicle while some complain of militarization of police
    Chief Landy Black of the Davis police defended the acquisition of the MRAP, saying in a statement that its heavy armor “makes it the perfect platform to perform rescues of victims and potential victims during … active-shooter incidents, and to more safely deliver officers into an active-shooter incident.”

    ...

    “I can’t imagine why Davis needs a tank,” Davis Mayor Dan Wolk said Wednesday. “It’s in a city garage and I hope it stays there.”

  • Dozens of police departments suspended for losing US military-grade weaponry
    According to the media outlet Fusion, its independent investigation into the Pentagon’s “1033 program,” which equips state and local police departments across the US with excess military equipment, turned up an alarming trend: Not only did many law enforcement agencies fail to comply with the program’s guidelines, they routinely lost dangerous weaponry.

    Already, the investigation has found that police departments in Arizona, California, Mississippi, Missouri, Georgia, and others have lost or cannot account for various types of weapons. This list includes M14 and M16 assault rifles, .45-caliber pistols, shotguns, and even vehicles.

  • The Pentagon Is Giving Grenade Launchers To Campus Police
    David Perry, the president of the International Association of Campus Law Enforcement Administrators, told Politico that 1033 mostly funnels “small items” to college police forces for daily use. These could be anything from office supplies or uniforms or car parts, but it’s probably not all that tame. Campus Safety magazine recommends that universities take part in the 1033 program to cover a range of needs from storage units to grenade launchers. That is, after all, what the program was designed to achieve.
  • Finding Funds for Your Equipment, Programs and People (Part 2 of 2)
    Military surplus and contractors via the 1033 program, however, can be excellent sources of used equipment.

    The 1033 program (formerly the 1208 program) permits the secretary of defense to transfer excess U.S. Department of Defense personal property (supplies and equipment) to state and local law enforcement agencies. Anything from used grenade launchers (for the deployment of less lethal weapons) to trucks to boats to storage units may be available for a significantly reduced cost.

  • Ferguson aftermath: California city tells cops to get rid of armored vehicle
    Davis Police Chief Landry Black made the case for keeping the MRAP, saying the police department had confiscated much high-power weaponry in the last year. He said there were specific guidelines for its use, and that it is a necessary piece of safety equipment for the city.
  • Police Armored Vehicle Is Unwelcome in California College Town
    Sheriff Brown of Santa Barbara County said there had been “a lot of misunderstanding about the program — in some quarters, even hysteria.”

    “The reality is that this is a great program,” he said. “It provides law enforcement with a lot of very valuable equipment that in many instances — in fact, most instances — could not be obtained or afforded, and allows us to do a better job of protecting our citizens and our own public safety personnel.”

  • Commentary: A militarized police force may see its citizens as the ‘enemy’
    Even college security forces are getting their share: A sidebar noted that nine out of 10 universities employ armed officers authorized to use deadly force.

    And in 2013, “the campus police at The Ohio State University procured a Mine-Resistant Ambush-Protected vehicle (MRAP), according to the Daily Caller website. The vehicle, which school officials noted was ‘acquired at no cost from military surplus,’ has a gun turret on the roof and is designed to stave off ambushes and roll over improvised explosive devices. OSU was also the first agency in the state to acquire an MRAP at the time.

"Campus Safety Magazine"? There is such a thing?

My head hurts. I am sad.

Saturday, September 13, 2014

Trinity: a very short review

Leon Uris's Trinity is generally considered to be one of the author's best works. For thematic reasons, it was one of my vacation readings.

Trinity is 900 pages long. The first 600 pages are superb, and just flew by. This is the portion of the book which covers the period, roughly, from the Great Famine through the Industrial Revolution, say, late 1830's through late 1880's. The characters were compelling, the storytelling was both exciting and colorful, and the book managed to be somehow intimate and sweepingly epic at once.

The genius of this part of this book is the way that Uris helps you understand why people behave the way they do, not by explaining it to you, but by showing you how it happens. People don't just wake up one day and do something dreadful to each other, it happens over a period of time, through untold zillions of tiny step by step actions and individual decisions, each of which seems simple and obvious and inevitable but they all add up.

The last 300 pages, covering roughly the first quarter of the 20th century, just didn't work for me.