Actually, that's not true. I went to the beach with the cousins on Saturday afternoon.
- Lessons Learned From Reading Post Mortems
One of the things I find to be curious about these failure modes is that when I talked about what I found with other folks, at least one person told me that each process issue I found was obvious. But these “obvious” things still cause a lot of failures. In one case, someone told me that what I was telling them was obvious at pretty much the same time their company was having a global outage of a multi-billion dollar service, caused by the exact thing we were talking about. Just because something is obvious doesn’t mean it’s being done.
- How does a relational database work
In this simple example, I end up with many possibilities. But a real query can have other relational operators like OUTER JOIN, CROSS JOIN, GROUP BY, ORDER BY, PROJECTION, UNION, INTERSECT, DISTINCT … which means even more possibilities.
So, how does a database do it?
Dynamic programming, greedy algorithm and heuristic
A relational database tries the multiple approaches I’ve just described. The real job of an optimizer is to find a good solution in a limited amount of time.
Most of the time an optimizer doesn’t find the best solution but a “good” one.
For small queries, doing a brute force approach is possible. But there is a way to avoid unnecessary computations so that even medium queries can use the brute force approach. This is called dynamic programming.
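To make the idea concrete, here is a minimal sketch of dynamic programming over join orders. It is not taken from the article or from any real optimizer: the function name `best_join_plan`, the cost model (sum of intermediate result sizes), and the independent per-pair selectivities are all assumptions chosen for illustration. The point is only that the best plan for each subset of tables is computed once and reused, instead of being recomputed for every full ordering.

```python
from itertools import combinations


def best_join_plan(cards, selectivity):
    """Return (cost, rows, plan) for joining every table in `cards`.

    cards:       dict of table name -> row count (hypothetical statistics)
    selectivity: dict of frozenset({a, b}) -> fraction of row pairs kept
                 when joining a and b; missing pairs default to 1.0
    Cost model (an assumption): sum of all intermediate result sizes.
    """
    names = sorted(cards)
    # Base case: a single table costs nothing to "join".
    best = {frozenset([n]): (0.0, float(cards[n]), n) for n in names}

    # Build up from small subsets to large ones, so that every sub-plan
    # is already memoized in `best` by the time it is needed.
    for size in range(2, len(names) + 1):
        for combo in combinations(names, size):
            subset = frozenset(combo)
            for k in range(1, size):
                for left_combo in combinations(combo, k):
                    left = frozenset(left_combo)
                    right = subset - left
                    lcost, lrows, lplan = best[left]
                    rcost, rrows, rplan = best[right]
                    # Estimated output size of joining the two sides.
                    rows = lrows * rrows
                    for a in left:
                        for b in right:
                            rows *= selectivity.get(frozenset((a, b)), 1.0)
                    cost = lcost + rcost + rows
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, rows, (lplan, rplan))
    return best[frozenset(names)]


cards = {"A": 1000, "B": 10, "C": 50}
sel = {frozenset("AB"): 0.01, frozenset("BC"): 0.1}
cost, rows, plan = best_join_plan(cards, sel)
print(cost, rows, plan)
```

With these made-up statistics the cheapest plan joins B and C first, because that keeps the intermediate result small. The sketch still examines every subset, so it only postpones the combinatorial blow-up; that is why the excerpt limits it to small and medium queries.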
- bcachefs - a general purpose COW filesystem
For those who haven't kept up with bcache, the bcache codebase has been evolving/metastasizing into a full blown, general purpose posix filesystem - a modern COW filesystem with checksumming, compression, multiple devices, caching, and eventually snapshots and all kinds of other nifty features.
- The Programmer's Guide to bcache
At a high level, bcache's btree is a copy on write b+ tree. The main difference between bcache's b+ tree and others is the nodes are very large (256k is typical) and log structured. Like other COW b+ trees, updating a node may require recursively rewriting every node up to the root; however, most updates (to both leaf nodes and interior nodes) can be done with only an append, until we've written to the full amount of space we originally reserved for the node.
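As a rough illustration of the "append until the reserved space is full, then rewrite" behaviour described above, here is a toy model of a single log-structured node. It is not bcache's code: the class `LogNode`, its capacity counted in entries rather than bytes, and the compaction policy are all assumptions, and the sketch ignores sorting within the log, splitting, and the parent pointers a real tree would update.

```python
class LogNode:
    """Toy log-structured node: updates are appended, never edited in place."""

    def __init__(self, capacity):
        self.capacity = capacity  # reserved space, counted in entries (assumption)
        self.log = []             # (key, value) pairs in append order

    def lookup(self, key):
        # The newest entry for a key wins, so scan the log backwards.
        for k, v in reversed(self.log):
            if k == key:
                return v
        return None

    def insert(self, key, value):
        """Return the node holding the update.

        While reserved space remains this is `self` (a plain append).
        Once it is full, a rewritten copy is returned instead, which is
        the point where a real COW tree would also rewrite the parent.
        """
        if len(self.log) < self.capacity:
            self.log.append((key, value))
            return self
        # Full: copy on write. Keep only the newest value for each key.
        live = {}
        for k, v in self.log:
            live[k] = v
        fresh = LogNode(self.capacity)
        fresh.log = sorted(live.items())
        if len(fresh.log) >= fresh.capacity:
            # A real b+ tree would split the node here; out of scope.
            raise OverflowError("node still full after compaction")
        fresh.log.append((key, value))
        return fresh


node = LogNode(capacity=3)
node.insert("a", 1)
node.insert("a", 2)
node.insert("b", 3)           # three appends, no rewrite yet
newer = node.insert("c", 4)   # reserved space exhausted: rewritten copy
print(newer is node, newer.lookup("a"), node.lookup("c"))
```

The original node is left untouched by the rewrite, which is what makes the scheme copy on write: readers holding the old node still see a consistent view.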
- Cake Technical Information
Cake instead schedules packets based on time deficits. If no deficit exists when a packet is requested, it can be sent immediately. The transmit time of the following packet is then calculated, and until that time the shaper is placed in deficit mode. While in deficit mode, packets are scheduled using a watchdog timer whenever a request arrives too soon, and transmission times are calculated for a continuous packet train. This continues until the queue drains; if a packet is requested, but none are available and the next transmission time has been reached, the shaper returns to the quiescent state in which the next packet can be sent immediately.
Deficit mode makes the burst size dependent only on hardware and kernel latency (including timer resolution), and minimises bursts without requiring manual tuning. Cake's shaper can therefore be set much closer to the actual link speed without jeopardising latency performance. Modern hardware can achieve sub-millisecond bursts in most cases.
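The state machine in that description can be sketched in a few lines. This is a simplified model, not Cake's implementation: the class `DeficitShaper`, the return values, and the use of plain floating-point seconds are assumptions, and the caller is expected to play the role of the watchdog timer by calling `dequeue` again at the returned time.

```python
from collections import deque


class DeficitShaper:
    """Toy time-deficit shaper with a quiescent state and a deficit mode."""

    def __init__(self, rate_bytes_per_sec):
        self.rate = rate_bytes_per_sec
        self.queue = deque()   # packet sizes in bytes
        self.next_tx = None    # None means quiescent (no deficit)

    def enqueue(self, size):
        self.queue.append(size)

    def dequeue(self, now):
        """Return ("send", size), ("wait", time) or ("idle", None)."""
        if self.next_tx is not None and now < self.next_tx:
            # Request arrived too soon: the caller should arm a timer.
            return ("wait", self.next_tx)
        if not self.queue:
            # Queue drained and the transmit time has been reached:
            # return to the quiescent state.
            self.next_tx = None
            return ("idle", None)
        size = self.queue.popleft()
        # In deficit mode, schedule from the previous transmit time so the
        # packets form a continuous train; when quiescent, start from now.
        start = now if self.next_tx is None else self.next_tx
        self.next_tx = start + size / self.rate
        return ("send", size)


shaper = DeficitShaper(rate_bytes_per_sec=1000)
shaper.enqueue(125)
shaper.enqueue(125)
print(shaper.dequeue(0.0))   # no deficit, so the packet goes out at once
print(shaper.dequeue(0.05))  # too soon, so the shaper asks the caller to wait
print(shaper.dequeue(0.15))  # second packet of the train
print(shaper.dequeue(0.30))  # queue empty and time reached
```

Because the next transmit time is derived from the previous one rather than from the moment the request happened to arrive, late timer wake-ups do not accumulate into a lower average rate. That is the property that lets the shaper sit close to the real link speed.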
- The impact of fast networks on graph analytics, part 1
tl;dr: A recent NSDI paper argued that data analytics stacks don’t get much faster at tasks like PageRank when given better networking, but this is likely just a property of the stack they evaluated (Spark and GraphX) rather than generally true. A different framework (timely dataflow) goes 6x faster than GraphX on a 1G network, which improves by 3x to 15-17x faster than GraphX on a 10G network.
- Multi-million operations per second on a single Google Compute Engine instance
Oftentimes technology vendors advertise scale-out as a way to achieve high performance. It is a proven approach, but it is often used to mask single node inefficiencies. Without a system where CPU, memory, network, and local storage are properly balanced, this is simply what we call “throwing hardware at the problem”. Hardware that, virtual or not, customers pay for.
To demonstrate this, we decided to check Helium’s performance on a single node on Google Cloud Platform with a workload similar to the one previously used to showcase Aerospike and Cassandra (200 byte objects and 100 million operations). With Cassandra, the data store contained 3 billion indices.
- The most timeless songs, measured using play counts on Spotify
Until recently, it was impossible to measure the popularity of older music. Billboard charts and album sales only tell us about a song’s popularity at the time of its release.
But now we have Spotify, a buffet of all of music, new and old. Tracks with fewer plays are fading into obscurity. And those with more plays are remaining in the cultural ether.
- Designing And Building Stockfighter, Our Programming Game
Why? Well, a lot of the fun engineering problems in trading are caused by it actually not being reliably the case that you send in an order and it gets unproblematically matched at “the price.” Markets are distributed systems. The exchange’s view of reality and your trading system’s view of reality are, by necessity, separated by the great firewall known as “physics.” For maximum possible results, you have to be able to do things like accurately predict what the future state of the exchange is, because the order you’re composing right now will arrive in the future not the present, while being cognizant that your present view of the exchange’s state is actually the exchange’s past.