I want more, more, more!
Monday, April 30, 2012
I appear to have successfully upgraded to Ubuntu 12.04. Nothing special to report, so far, all has been fine.
I've got at least 3 other machines to upgrade, so I may have more experiences to report later.
But, for now, it's onward and upward with Linux.
Friday, April 27, 2012
Johan Peitz's SuperMario Summary game is more interesting as art than as a game, but it's also an amazingly impressive game for a 48-hour development cycle.
My favorite part, though, is the post-mortem he wrote up for the contest site about his experiences.
Thursday, April 26, 2012
I'm not really sure where the best place is to follow the trial. There is a lot of coverage, but in some ways it's surprising how little information is being released, as Paul Thurrott points out:
It's a case in which Google has been charged with stealing the intellectual property of Java, which is now owned by Oracle. A case that has incriminating email. A case in which Google has clearly done exactly as charged. And a case in which Google CEO Larry Page appeared on the stand in court. And actually, you’ll still pretty much have to imagine it. Because even though it really happened, it’s somehow not the biggest tech story on Earth this week. Inconceivable!
There are a few sites, however, which seem to be doing a pretty good job of covering the trial. Wired, for example, has been featuring a number of articles.
And Florian Mueller has been covering the trial on his FOSS Patents blog.
Even just given these few sources, it's interesting how hard it is to make sense of the trial. Just consider:
- Open-sourcing of Java and API copyrightability are entirely unrelated issues
Sun's publication of certain Java software under an open source license has nothing -- absolutely nothing -- to do with the question of whether the structure, sequence and organization of API packages is protected by copyright law.
There's no way that Google's use of the structure, sequence and organization of the asserted API packages can be justified by claiming that Google simply chose to benefit from the availability of Java software published by Oracle/Sun on free and open licensing terms.
- Ex-Sun Boss Defends Google’s Right To Java on Android
Taking the stand during the ongoing court battle between Google and Oracle over the use of the Java programming language on Google’s Android mobile operating system, Jonathan Schwartz — the former CEO of Sun Microsystems, the creator of Java — said that Java has always been free to use and that although Sun didn’t necessarily like the way Android used Java, it had no intention of stopping it.
Schwartz acknowledged that he didn’t like what Google was doing at the time. But he maintained that Google was free to do this.
It's a good thing I wasn't on the jury for this case, as I can't imagine making heads or tails of situations like this.
Wednesday, April 25, 2012
Here is a bit of oddly fascinating art: Descriptive Camera, by Matt Richardson.
It lives somewhere in the swamp between technology, pop culture, and art. I think it's actually quite clever, although I suspect it doesn't have much lasting value. It captures a moment in time (i.e., 2012).
I like the way it inverts the relationship between humans and computers. The "gadget" (the camera) emits something that looks like it is computational (the textual description), yet that is pretty much the only part that is human-generated (via the Mechanical Turk).
What is data, what is meta-data, what is mechanical, and what is creative? It's a fun exploration of the subject.
Of course, speaking of artists, Randall Munroe is thinking about this, too.
I love this breathless re-telling of the story of Jordan Mechner recovering the original Apple II version of the Prince of Persia source from ancient archived floppies: The Geeks Who Saved Prince of Persia’s Source Code From Digital Death.
Having safely obtained the Quadris data, Tony Diaz hands the disk off to Jason Scott. Whereas some might be happy just to have played a long-lost videogame, Scott has posterity to think of. A full magnetic read of the floppy will guarantee that there’s nothing else on there, no deleted data from another lost project.
“We’re like Indiana Jones,” Mechner says. “We just ran into the tomb, grabbed the idol and now we’re running out. We got the golden idol!”
Mechner keeps handing disks to Diaz, and more golden idols keep spilling out. Like many computer kids of his day, Mechner used old floppies and rewrote them with new data. One of them used to hold part of Roberta Williams’ early Sierra On-Line adventure game Time Zone. Now, Mechner says, it holds the only copy of one of his earliest games.
I remember those days. Only, for me, those old floppies were the Fred Fish disks, and it was an Amiga, not an Apple II.
Tuesday, April 24, 2012
Monday, April 23, 2012
Everybody knows I dig Valve.
After all, for 6 months I couldn't talk about anything but the game.
But they're a quite interesting place for a number of reasons, and recently they've been in the news for several of those reasons.
Firstly, there's this great blog entry by the ultra-fascinating Michael Abrash: Valve: How I Got Here, What It’s Like, and What I’m Doing. Of course, I like that he gave a nice plug to my day job (we're wicked pleased to see our tools in use!), but there was much else to absorb in Abrash's fine essay:
If most of the value is now in the initial creative act, there’s little benefit to traditional hierarchical organization that’s designed to deliver the same thing over and over, making only incremental changes over time. What matters is being first and bootstrapping your product into a positive feedback spiral with a constant stream of creative innovation. Hierarchical management doesn’t help with that, because it bottlenecks innovation through the people at the top of the hierarchy, and there’s no reason to expect that those people would be particularly creative about coming up with new products that are dramatically different from existing ones – quite the opposite, in fact. So Valve was designed as a company that would attract the sort of people capable of taking the initial creative step, leave them free to do creative work, and make them want to stay.
Secondly, somebody somewhere apparently got hold of the Valve employee handbook and posted it on the net: Valve's 'Handbook for New Employees' leaked, hilarious illustrations included.
This book isn’t about fringe benefits or how to set up your workstation or where to find source code. Valve works in ways that might seem counterintuitive at first. This handbook is about the choices you’re going to be making and how to think about them. Mainly, it’s about how not to freak out now that you’re here.
Lastly, as the fine folks at Techdirt have been pointing out, Valve has been experimenting with some very innovative and intriguing ways to connect with their customers: Valve Tries To Charge People Based On How Likable They Are: Trolls Pay Full Price. As Mike Masnick points out, just thinking about what this might mean is fascinating:
I think there are lots of community-based properties would love to be able to charge trolls more. However, this could be really, really difficult to work in practice, and create some problems, depending on what the overall goals are. It would be nice, of course, if you could come up with a perfect system to get rid of trolls, but distinguishing true trolls can often be much more difficult in practice than in theory.
Way to go, Valve: you continue to be an interesting place, which is a great thing in these Internet times.
Sunday, April 22, 2012
Depending on who you listen to, or sometimes just on what time of day you listen to them, Software-Defined Networking is either the biggest thing to hit the computing industry in years, or just the latest trendy fad.
Either way, it's clearly generating a lot of activity right now, culminating in last week's Open Networking Summit, ONS 2012.
A lot of the coverage has depicted the new technology as still in its infancy, but Google's Urs Hölzle suggested at the summit that OpenFlow networking is much more mature than you might think:
In a separate conversation after the keynote, Holzle said Google does not expect to buy OpenFlow systems this year as it focuses on finishing the implementation of its current G-Scale network. However he opened the door to purchases in 2013 and beyond, probably looking for 40G systems supporting as many as 1,000 ports.
If there's just way too much to read about these topics right now, here are a couple of nice summaries that can help guide your way:
Even if you're not in the networking world specifically (and I'm generally not), it's worth trying to keep up with what's going on, and there's plenty here to keep you busy.
I'm taken by the clever website http://www.moviemimic.com.
I'm not sure who the fellow is behind it, but he's got a very clever idea: as he travels around the world, he takes the opportunity to have his picture taken, posing in a scene from a movie he enjoys.
As he puts it on the front page of the site:
This is how I combine my loves of film and travel. I call it movie mimicking.
It's just a simple idea, well-executed, that leads to some creative and fun artwork.
Saturday, April 21, 2012
On our walks with the dog, my wife thinks that I spend rather a lot of time obsessing about the details of the various waterbirds that frequent the shoreline.
She thinks that I am overly concerned about the classification of the different species into finer sub-groupings.
That's right, she says that I'm splitting herons.
Thursday, April 19, 2012
I'm quite happy to see that Perforce have released version 2012.1 of the core product. Here's more information.
I was privileged to be able to contribute to several substantial areas of this release, and I'm really looking forward to seeing it used. I think Perforce users will find all sorts of new and interesting ways to take advantage of the new features.
Interested? Download it and give it a try!
More random things I found interesting recently:
- Hawley Channels His Inner Schneier
This is the fundamental political problem of airport security: it's in nobody's self-interest to take a stand for what might appear to be reduced security. Imagine that the TSA management announces a new rule that box cutters are now okay, and that they respond to critics by explaining that the current risks to airplanes don't warrant prohibiting them. Even if they're right, they're open to attacks from political opponents that they're not taking terrorism seriously enough. And if they're wrong, their careers are over.
- Valve: How I Got Here, What It’s Like, and What I’m Doing
Valve was designed as a company that would attract the sort of people capable of taking the initial creative step, leave them free to do creative work, and make them want to stay. Consequently, Valve has no formal management or hierarchy at all.
- Battle for the internet
The Guardian is taking stock of the new battlegrounds for the internet. From states stifling dissent to the new cyberwar front line, we look at the challenges facing the dream of an open internet.
- Cybercrime And Bad Statistics
The problem is that the cybercrime estimates are based on surveys of individuals, surveys that ask essentially “what did cybercrime cost you this year?” There may be other ways to get this information, but companies are unlikely to answer questions about cybercrime. Hence even the Federal Trade Commission (FTC) must rely on surveys.
- The Lost Steve Jobs Tapes
The lessons are powerful: Jobs matured as a manager and a boss; learned how to make the most of partnerships; found a way to turn his native stubbornness into a productive perseverance. He became a corporate architect, coming to appreciate the scaffolding of a business just as much as the skeletons of real buildings, which always fascinated him. He mastered the art of negotiation by immersing himself in Hollywood, and learned how to successfully manage creative talent, namely the artists at Pixar. Perhaps most important, he developed an astonishing adaptability that was critical to the hit-after-hit-after-hit climb of Apple's last decade. All this, during a time many remember as his most disappointing.
- Fark's Drew Curtis Explains How To Beat A Patent Troll (And Live To Tell The Tale)
The key lesson: don't negotiate with terrorists. As he points out, patent trolls have cost the US economy significantly more than terrorist attacks.
- Oracle v. Google trial: evidence of willful infringement outweighs claims of approved use
Presumably the parties wanted to show the best evidence right at the start, hoping to shape the way jurors are going to look at the tons of information they will receive in the coming weeks. Google's lawyers undoubtedly made the most out of the evidence they found in favor of their equitable defenses, but there is only so much that presentation can do when substance is lacking.
- Memeorandum Colors 2012: Visualizing Bias on Political Blogs
This automated analysis is not a commentary on the personal opinions and beliefs of any blogger -- no amount of linear algebra can prove that. What this shows is the biases in their linking behavior: the stories that each site chooses to cover, or not cover, and their similarity to others like them.
Read anything interesting lately? Let me know!
Tuesday, April 17, 2012
Wired magazine has coverage of the early events:
Monday, April 16, 2012
You're not just sitting around doing nothing, right? You're reading all these fascinating documents on the Internet, aren't you?
- Last week was the EuroSys 2012 conference in Bern, Switzerland. Happily, the folks at the University of Cambridge Computer Laboratory's Systems Research Group put together some great summaries of the research that was presented:
- Day One
There is also some stuff to do with constrained execution, to avoid variable-length instruction sequences being misinterpreted and doing dangerous things (I don’t fully understand why this is relevant, but I missed a bit of the talk being distracted).
- Day Two
To reduce DRAM latency, they constructed a lock-free, unbalanced 4-way tree with the same concurrency properties as a binary tree, but is only half as deep for the same amount of data -- plus each node fits into a cache line.
- Day Three
Of course, the migration business is somewhat challenging with regards to atomicity and consistency. They address this by moving to a fine-grainedly locked (per row) hash table for requests, which is only 2% slower than the coarse-grained version in stock Linux.
- Phrack 68 is out, and XORL, a blogger from Greece, gives his usual great synopsis of the contents:
We all know that such rootkits are backdoring Androids in the wild for quite sometime and h0h0 has even made a presentation on it at DefCon in 2010, but it is always good to have some technical documentation to get started with.
- Scott Hanselman writes a stellar article on Facebook's complex privacy settings: Facebook's privacy settings are too complex for ANYONE to use - Change these settings today
Under tagging you can choose what happens when someone tags you and tags that friends add to your own posts or photos. You can also control tag suggestions.
- Kenneth Iverson's book on the APL programming language: A Programming Language, has reached its 50th anniversary, and the book is online.
The systematic treatment of complex algorithms requires a suitable programming language for their description, and such a programming language should be concise, precise, consistent over a wide area of application, mnemonic, and economical of symbols; it should exhibit clearly the constraints on the sequence in which operations are performed; and it should permit the description of a process to be independent of the particular representation chosen for the data.
Sunday, April 15, 2012
But that's not the sort of "security games" that I've been thinking about lately.
We're solidly into the 5th module of Dan Boneh's excellent online cryptography course, which means that, by some measure, I'm about halfway through with the class. I'm starting to feel comfortable with the material presented in the class; more encouragingly, I'm starting to feel confident about broadening and deepening my studies in this area beyond the material in the class.
The online cryptography class is aiming for a substantial degree of rigor and precision. One of the topics that comes up routinely in the class involves proofs of various results.
The notion of "proof" in modern cryptography is somewhat complex. The types of cryptographic algorithms that we are discussing and studying are random, probabilistic algorithms, so the proofs have a lot to do with analysis of probability.
That is, we are often trying to make a rigorous and exact assessment of just how likely a particular event is.
For example, the event might be a collision in a hash algorithm, guessing a key in a symmetric encryption algorithm, predicting the next value of a random generator algorithm, etc.
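For example, the likelihood of a hash collision is governed by the classic birthday bound. Here's a back-of-the-envelope estimate in Python, using the standard approximation (this is my own sketch, not anything specific to the course):

```python
import math

def collision_probability(n, bits):
    # P(at least one collision among n samples from a 2**bits space)
    # is approximately 1 - exp(-n*(n-1) / (2 * 2**bits))
    return 1 - math.exp(-n * (n - 1) / (2.0 * 2 ** bits))

# For a 64-bit hash, roughly 2**32 samples already gives a ~39% chance
p = collision_probability(2 ** 32, 64)
```

This is why a 64-bit hash is considered far too small for collision resistance: the attack cost grows with the square root of the space, not the space itself.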
The proof technique that Professor Boneh uses for analyzing these probability distributions and forming conclusions about the behavior of algorithms is structured around the notion of a "security game". One of the clearest descriptions of this proof technique can be found in Victor Shoup's paper: Sequences of games: a tool for taming complexity in security proofs. From the introduction:
Security for cryptographic primitives is typically defined as an attack game played between an adversary and some benign entity, which we call the challenger. Both adversary and challenger are probabilistic processes that communicate with each other, and so we can model the game as a probability space. Typically, the definition of security is tied to some particular event S. Security means that for every “efficient” adversary, the probability that event S occurs is “very close to” some specified “target probability”: typically, either 0, 1/2, or the probability of some event T in some other game in which the same adversary is interacting with a different challenger.
The popularization of this proof technique, as far as I can tell, is credited to Phillip Rogaway, and in particular to the paper he wrote with Joe Kilian: How to Protect DES Against Exhaustive Key Search (An Analysis of DESX). In the paper, the authors analyze the advantage that a security attacker might be able to gain against the DESX algorithm by constructing a series of games that model the attacks that the adversary might choose.
(As an unrelated aside, my good friend John Black, who is now Professor of Computer Science at the University of Colorado at Boulder, was one of Professor Rogaway's early students. Hi John!)
Although I was initially uncomfortable with the security games technique, Professor Boneh's use of the approach is very clear, and after several times seeing these proofs applied, I have become much more comfortable with how they work. I think that there is just a fundamental complexity to constructing the analysis of a probabilistic random algorithm, and I've come to feel that the security games approach for working with these algorithms is an excellent way to illustrate their properties.
If, like me, your computer science background was primarily in deterministic algorithms, and modern cryptography is one of your first exposures to random and probabilistic behaviors, hopefully following some of these links and reading Professor Shoup's tutorial on the games approach will help you get more comfortable with these algorithms and techniques.
Friday, April 13, 2012
I don't want you to be bored this weekend, so I thought I'd pass along some articles you might find interesting. If not, hopefully you still won't be bored this weekend :)
- Hell Phone: Is there any way to stop the scourge of text message spam?
But there’s also a possibility the problem will get much worse before it gets better. For a grim picture of the future, one has only to look to China, where unlimited text plans have been widely available much longer. By some estimates, a third of all text messages in China today are spam.
- Netflix Recommendations: Beyond the 5 stars (Part 1)
To put these algorithms to use, we had to work to overcome some limitations, for instance that they were built to handle 100 million ratings, instead of the more than 5 billion that we have, and that they were not built to adapt as members added more ratings. But once we overcame those challenges, we put the two algorithms into production, where they are still used as part of our recommendation engine.
- Why Netflix Never Implemented The Algorithm That Won The Netflix $1 Million Challenge
And, people tend to have a more... optimistic viewpoint of their future selves. That is, they may be willing to rent, say, an "artsy" movie that won't show up for a few days, feeling that they'll be in the mood to watch it a few days (weeks?) in the future, knowing they're not in the mood immediately. But when the choice is immediate, they deal with their present selves, and that choice can be quite different.
- Raise the Crime Rate
Statistics are notoriously slippery, but the figures that suggest that violence has been disappearing in the United States contain a blind spot so large that to cite them uncritically, as the major papers do, is to collude in an epic con. Uncounted in the official tallies are the hundreds of thousands of crimes that take place in the country’s prison system, a vast and growing residential network whose forsaken tenants increasingly bear the brunt of America’s propensity for anger and violence.
- Harms of Post-9/11 Airline Security
The humiliation, the dehumanisation and the privacy violations are also harms. That Mr Hawley dismisses these as mere “costs in convenience” demonstrates how out-of-touch the TSA is from the people it claims to be protecting.
- Instagram as an island economy
The situation of Instagram is that of an isolated island economy, separate from the outside world, being linked to the global economy. How do we figure out what it's worth to the global economy? How do you value a closed system?
- Facebook and Instagram: When Your Favorite App Sells Out
Then along comes Facebook, the great alien presence that just hovers over our cities, year after year, as we wait and fear. You turn on the television and there it is, right above the Empire State Building, humming.
- ACID in HBase
It is important to realize that this only works if transactions are committed strictly serially; otherwise an earlier uncommitted transaction could become visible when one that started later commits first. In HBase transactions are typically short, so this is not a problem.
HBase does exactly that: All transactions are committed serially.
- MySQL at Twitter
MySQL is the persistent storage technology behind most Twitter data: the interest graph, timelines, user data and the Tweets themselves.
Thursday, April 12, 2012
I never learned my Civil War history very well, but I did notice that this week was the 150th anniversary of the Battle of Pittsburg Landing.
The Wikipedia article sums the two-day battle up as follows:
The two-day battle of Shiloh, the costliest in American history up to that time, resulted in the defeat of the Confederate army and frustration of Johnston's plans to prevent the joining of the two Union armies in Tennessee. Union casualties were 13,047 (1,754 killed, 8,408 wounded, and 2,885 missing); Grant's army bore the brunt of the fighting over the two days, with casualties of 1,513 killed, 6,601 wounded, and 2,830 missing or captured. Confederate casualties were 10,699 (1,728 killed, 8,012 wounded, and 959 missing or captured). The dead included the Confederate army's commander, Albert Sidney Johnston; the highest ranking Union general killed was W. H. L. Wallace. Both sides were shocked at the carnage. None suspected that three more years of such bloodshed remained in the war and that eight larger and bloodier battles were yet to come.
On his wonderful Up and Down California blog, Tom Hilton relates how the event reached California.
It's hard to comprehend nearly 24,000 casualties in a two-day pitched battle on the shore of the Tennessee River.
My first blog article on this blog was in April, 2009, for what that's worth.
It's moderately interesting to consider why blogs come and go. Here's Dave Kellogg, announcing that he's done blogging.
I thought this comment from Kellogg was intriguing:
A mixed operating/blogger role made people uncomfortable. I’d often find myself in meetings where people weren’t sure whether to treat me as supplier, partner, candidate, co-worker, adviser, or journalist. The simple fact is that most bloggers are either professional journalists or analysts/consultants. The rare combination created intermittent problems for me along the way. Shutting down the blog will simplify my life in this regard.
Have we reached the point where "those who can, do; those who can't, blog"?
Wednesday, April 11, 2012
Tuesday, April 10, 2012
“When I use a word,” Humpty Dumpty said, in rather a scornful tone, “it means just what I choose it to mean—neither more nor less.” -- Lewis Carroll's Through the Looking Glass
I've recently been trying to understand more about these "NoSQL" systems, and how they work.
One interesting question is what they mean by "consistency". There is lots of talk about consistency, and eventual consistency, and the CAP theorem, and things like that.
And it's all very vague.
If you search online posts related to HBase and Cassandra comparisons, you will regularly find the HBase community explaining that they have chosen CP, while Cassandra has chosen AP – no doubt mindful of the fact that most developers need consistency (the C) at some level.
Indeed, HBase's own documentation says:
Strongly consistent reads/writes: HBase is not an "eventually consistent" DataStore. This makes it very suitable for tasks such as high-speed counter aggregation.
So I guess that the HBase development team is choosing to define "strongly consistent" as "not 'eventually consistent'". Which isn't very much of a definition, in my opinion.
If you search still more, you'll find more detailed information, such as this HBase page on ACID semantics, which admits that:
HBase is not an ACID compliant database.
and then proceeds to completely re-define the famous ACID properties that Jim Gray set forth nearly 35 years ago.
It's very instructive to compare the original relational database definitions of the ACID properties versus the HBase definitions.
First, here are the classic relational DBMS definitions, from the above Wikipedia article:
Atomicity requires that each transaction is "all or nothing": if one part of the transaction fails, the entire transaction fails, and the database state is left unchanged. An atomic system must guarantee atomicity in each and every situation, including power failures, errors, and crashes.
The consistency property ensures that any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to all defined rules, including but not limited to constraints, cascades, triggers, and any combination thereof.
Isolation refers to the requirement that no transaction should be able to interfere with another transaction. One way of achieving this is to ensure that no transactions that affect the same rows can run concurrently, since their sequence, and hence the outcome, might be unpredictable. This property of ACID is often partly relaxed due to the huge speed decrease this type of concurrency management entails.
Durability means that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors. In a relational database, for instance, once a group of SQL statements execute, the results need to be stored permanently. If the database crashes immediately thereafter, it should be possible to restore the database to the state after the last transaction committed.
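Atomicity, in particular, is easy to demonstrate. Here's a minimal sketch using SQLite (the table and values are hypothetical, just to show the all-or-nothing behavior):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

try:
    with conn:  # commits on success, rolls back on any exception
        conn.execute("UPDATE accounts SET balance = balance - 50 WHERE name = 'alice'")
        # This violates the primary key, so the whole transaction fails...
        conn.execute("INSERT INTO accounts VALUES ('alice', 999)")
except sqlite3.IntegrityError:
    pass

# ...and the earlier UPDATE was rolled back with it: alice still has 100
balance = conn.execute(
    "SELECT balance FROM accounts WHERE name = 'alice'").fetchone()[0]
```

Even though the UPDATE succeeded on its own, the failed INSERT caused the entire transaction to be undone, leaving the database in its prior valid state.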
Now, here's the HBase definitions, from the HBase ACID semantics page:
For the sake of common vocabulary, we define the following terms:
- Atomicity: an operation is atomic if it either completes entirely or not at all
- Consistency: all actions cause the table to transition from one valid state directly to another (eg a row will not disappear during an update, etc)
- Isolation: an operation is isolated if it appears to complete independently of any other concurrent transaction
- Durability: any update that reports "successful" to the client will not be lost
- Visibility: an update is considered visible if any subsequent read will see the update as having been committed
These aren't even remotely close to the same definitions!
It's not at all clear what the NoSQL community is trying to do by re-defining all these words, and it's doubly not clear why the entire computing industry appears to be going along with it.
Why not define new terminology? Why change the meanings of words that have had precise definitions for about as long as general purpose computers have been in use?
Monday, April 9, 2012
To reinforce the material, the homework for week 4 included a practical question involving padding oracle exploits: a hypothetical web server request log was provided, containing evidence of an actual padding oracle attack.
The request log recorded the systematic requests that the attacker made of the web server, together with the web server's responses (HTTP 403 if the padding was wrong, HTTP 404 if the padding was correct).
The task posed by the question was: given this evidence, decrypt the original message!
As I worked through the problem, I kept getting the wrong answer, and finally realized my essential mistake: I thought that, once the attacker had brute-forced a single byte of the message by finding the corresponding IV byte that made the pad value correct, the attacker had learned the corresponding byte of the plaintext.
But that is wrong: what the attacker has learned at this point is the corresponding byte of the decrypted ciphertext, which must then be XOR'd with the previous block's ciphertext (or with the actual Initialization Vector, if this is the first block) in order to find the actual plaintext.
So I was finding the intermediate values correctly, but then there was one more step, to XOR them with the correct values from the originally captured ciphertext, to recover the plaintext.
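That final step can be sketched in a few lines of Python (all byte values here are hypothetical, chosen just to show the XOR relationship):

```python
# The last step of a padding-oracle attack: the oracle queries recover
# D(c1) -- the raw block-cipher decryption of ciphertext block c1 -- and
# the plaintext is D(c1) XOR c0, where c0 is the previous ciphertext
# block (or the IV, for the first block).

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

intermediate = bytes([0x64, 0x45, 0x43, 0x34])  # D(c1), recovered byte by byte
c0 = bytes([0x10, 0x20, 0x30, 0x40])            # previous ciphertext block (or IV)

plaintext = xor_bytes(intermediate, c0)  # b'test'
```

Stopping at `intermediate` and forgetting the final XOR against `c0` was exactly my mistake.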
While working my way through this, I came across this quite nice writeup by Brian Holyfield of Gotham Digital Science: Automated Padding Oracle Attacks With PadBuster.
If you are studying padding oracle attacks, and are looking for a clear description of how they work, with nice diagrams and examples, give Holyfield's article a try and see if it makes the technique a bit more clear.
... well, to be precise, "Bryan" the Nord is now a Level 40 character in Skyrim.
I've been playing for 3 full months now; approximately 250 hours of real time, according to the game stats. I think I've discovered more than half of the locations on the map, and have completed many quests. My highest attributes are pushing level 80 now, and I don't die nearly as much as I used to (especially since I hand-smithed my full set of Glass armor and improved it to Epic level quality).
I'm playing on the PS3, and happily haven't hit most of the issues that have required multiple patches. Or maybe I have hit those issues, but just didn't notice?
Even though I'm still having a blast playing the game, I can sense that I'm on the downward slope. I think I'll probably play the game a few more weeks, maybe through the end of April, but then the intense addiction will start to fade...
Sunday, April 8, 2012
After some dithering and uncertainty, I've once again signed up for this year's Google Summer of Code. It's hard to find time for this in my busy schedule, but it's a great program and I've enjoyed participating in it in the past.
We have four fine proposals for Derby, and only two mentors (counting me).
So at least two students are bound to be disappointed.
And, of course, there's no certainty that Derby will even get two students.
In past years, the Derby participation in GSoC has been great, both for the students and for the Derby project. The students I have worked with have gone on to successful careers, and I believe they have benefited from their time working with the Derby community.
Hopefully we'll have another good summer this year!
Wednesday, April 4, 2012
Oh dear. This experiment, although not really surprising, is still terribly disappointing.
Symantec researchers intentionally lost 50 smartphones in cities around the U.S. and in Canada. They were left on newspaper boxes, park benches, elevators and other places that passers-by would quickly spot them. But these weren't just any phones -- they were loaded with tracking and logging software so Symantec employees could physically track them and keep track of everything the finders did with the gadgets.
As the report suggests, the most important need here is to educate people and make them aware of the risk:
Maybe before they had a smartphone, losing an old cell phone was devastating but there wasn't much information on it. Maybe it’s like the frog in a pot of cold water that’s eventually boiled – it wasn’t that bad losing their old phone, so people haven't thought through how much information is now on their smart phones and what could happen if they lost it. We hope this research shows what could happen and sticks out in people's minds.
... in practice, however, theory and practice differ.
Or so goes the old saying.
Modern "web scale" distributed systems are full of some pretty neat theory, e.g.,
- Distributed consistent hash algorithms, for example, Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications, or Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web.
- Quorum replication storage systems, for example Dynamo: Amazon’s Highly Available Key-value Store, or Cassandra - A Decentralized Structured Storage System.
Although the theory is fascinating, and well worth reading, an interesting question in both of these cases involves how the algorithms are configured in practice. Both the DHT algorithms and the Quorum algorithms have various parameters involving: the total number of nodes in the system, and the rates at which these nodes arrive and leave the system.
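To make the first idea concrete, here is a minimal consistent-hash ring in Python. This is just the basic technique underlying Chord-style DHTs, not the Chord protocol itself (no finger tables, no churn handling), and the node names are made up for the example:

```python
import bisect
import hashlib

# Minimal consistent-hash ring sketch: nodes and keys hash onto the same
# ring, and a key is owned by the first node clockwise from its hash.

def ring_hash(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes):
        self.ring = sorted((ring_hash(n), n) for n in nodes)

    def owner(self, key):
        h = ring_hash(key)
        idx = bisect.bisect(self.ring, (h, ""))
        return self.ring[idx % len(self.ring)][1]  # wrap around the ring

ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.owner(k) for k in ("alpha", "beta", "gamma", "delta")}

# Adding a node only remaps the keys that fall in the new node's arc;
# every other key keeps its old owner -- the point of consistent hashing.
ring2 = HashRing(["node-a", "node-b", "node-c", "node-d"])
after = {k: ring2.owner(k) for k in before}
moved = [k for k in before if before[k] != after[k]]
```

Even this toy version shows why the configuration questions matter: how many nodes are on the ring, and how often they join and leave, directly determines how much data moves around.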
Two recent papers explore these configuration choices in more detail:
- A performance vs. cost framework for evaluating DHT design tradeoffs under churn
- Probabilistically Bounded Staleness for Practical Partial Quorums
Both of these papers are strongly tilted toward the "practice" end of the theory/practice continuum, and for that I welcome and appreciate them.
The DHT paper explores the performance of several Distributed Hash Table algorithms in network scenarios that involve dynamic group membership changes:
DHTs incorporate many features to improve lookup performance at extra communication cost in the face of churn. It is misleading to evaluate the performance benefits of an individual design choice alone because other competing choices can be more efficient at using bandwidth. PVC presents designers with a methodology to determine the relative importance of tuning different protocol parameters under different workloads and network conditions. As parameters often control the extent to which a given protocol feature is enabled, PVC allows designers to judge whether a protocol feature is more efficient at using additional bandwidth than others via the analysis of the corresponding protocol parameters.
The Staleness paper explores the behavior of systems that choose to forego the strong consistency guarantees of using overlapping read and write replica sets, and instead use partial quorum configurations:
Employing partial or non-strict quorums lowers operation latency in quorum replication. With partial quorums, sets of replicas written to and read from are not guaranteed to overlap: given N replicas and read and write quorum sizes R and W, partial quorums imply R+W <= N. Modern quorum-based data systems such as Dynamo and its open source descendants Apache Cassandra, Basho Riak, and Project Voldemort offer a choice between these two modes of quorum replication: overlapping quorums, providing strong consistency, and partial quorums, providing eventual consistency.
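The R+W <= N condition is easy to play with numerically. The PBS paper's model is much richer than this, but as a back-of-envelope sketch: if the R replicas read are chosen uniformly at random, the chance that a read misses all W freshly written replicas is C(N-W, R) / C(N, R):

```python
from math import comb

# Back-of-envelope version of the question the PBS paper studies: with N
# replicas, and read/write quorums of R and W replicas chosen uniformly
# at random, what is the chance a read sees none of the written replicas?
# (The paper's model is far richer; this is just the simplest case.)

def p_stale(n, r, w):
    """Probability that a random R-replica read misses all W written replicas."""
    if r + w > n:
        return 0.0          # quorums must overlap: strong consistency
    return comb(n - w, r) / comb(n, r)

# Dynamo-style N=3: R=W=2 always overlaps, while R=W=1 can return
# stale data two times out of three in this simple model.
print(p_stale(3, 2, 2))     # 0.0
print(p_stale(3, 1, 1))     # 0.666...
```

This is exactly the theory/practice tension of the post: the strict-quorum math is clean, but practitioners routinely pick R+W <= N for latency and then need tools like PBS to quantify what they gave up.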
Neither of these is an easy paper; to understand them, you first have to understand the systems and algorithms they study.
However, if you're interested in the implementation choices faced by those who are building these modern web-scale systems, both these papers offer great insight and a host of new tools for studying and understanding system behavior.
I thought the movie adaptation of The Hunger Games was excellent.
Making a movie out of a book is always tricky business. There always turn out to be two audiences: some have already read the book, while others have not. You can't make the movie only for one of those audiences, or you'll lose the other.
I thought the makers of the Hunger Games movie did a wonderful job of straddling that fence. The movie explained enough of the details of the book so that the important parts of the story came through clearly, but it also didn't try to explain every detail, assuming (correctly) that people who had already read the book knew how to fill those parts in themselves, and would be pleased to explain any confusing aspects to their friends afterwards.
Collins was credited with contributing to the screenplay. I don't know how much of a contribution she made, in terms of hours and word-count, but it was quite clear that her presence ensured that the movie retained the same overall "feel" of the books.
I thought that the characters were fleshed out wonderfully. Rue was just as I had imagined her, as was Effie Trinket. I had trouble envisioning Haymitch Abernathy while reading the books, but the movie draws him just right! Cinna was a treat, even if I didn't imagine him like that at all. And Caesar Flickerman! Wow! Yes!
The books, of course, already have an enormous audience, but I think it is quite likely that the movie brought enough additional attention to attract a whole new set of readers, who delighted in the movie and now want to dig into the books and read them all. I hope this happens; the books are fabulous and deserve whatever attention they are getting.
As for me, FWIW, I'm on Team Peeta; I suppose I identify with him more.