Monday, November 29, 2010

Two tidbits of computer security news today

The New York Times has been digging into the WikiLeaks cable traffic, and reports that the leaked documents appear to confirm that China's Politburo ordered the Google hacking intrusions:

China’s Politburo directed the intrusion into Google’s computer systems in that country, a Chinese contact told the American Embassy in Beijing in January, one cable reported. The Google hacking was part of a coordinated campaign of computer sabotage carried out by government operatives, private security experts and Internet outlaws recruited by the Chinese government.


Meanwhile, Wired Magazine's Threat Level blog is reporting today that Iranian President Mahmoud Ahmadinejad appears to be confirming that the Stuxnet virus did in fact affect the operation of the nuclear-enrichment centrifuges at Iran's Natanz facility:

Frequency-converter drives are used to control the speed of a device. Although it’s not known what device Stuxnet aimed to control, it was designed to vary the speed of the device wildly but intermittently over a span of weeks, suggesting the aim was subtle sabotage meant to ruin a process over time but not in a way that would attract suspicion.

“Using nuclear enrichment as an example, the centrifuges need to spin at a precise speed for long periods of time in order to extract the pure uranium,” Symantec’s Liam O Murchu told Threat Level earlier this month. “If those centrifuges stop to spin at that high speed, then it can disrupt the process of isolating the heavier isotopes in those centrifuges … and the final grade of uranium you would get out would be a lower quality.”


The entire Threat Level report is fascinating; it reads like a movie script, but apparently it's real life.

TritonSort benchmark for Indy GraySort

25 years ago, Jim Gray started benchmarking sort performance, and the efforts continue, as sort performance is a wonderful tool for incrementally advancing the state of the art of systems performance. You can read more about the overall sort benchmarking world at sortbenchmark.org.

One of the 2010 winners, the TritonSort team at UCSD, have posted an overview article about their work on this year's benchmark. Although it doesn't include the complete technical details, the article is still quite informative and worth reading.

One of the particular aspects they studied was "per-server efficiency", arguing that in the modern world of staggeringly scalable systems, it's interesting to ensure that you aren't wasting resources, but rather are carefully using the resources in an efficient manner:

Recently, setting the sort record has largely been a test of how much computing resources an organization could throw at the problem, often sacrificing on per-server efficiency. For example, Yahoo’s record for Gray sort used an impressive 3452 servers to sort 100 TB of data in less than 3 hours. However, per server throughput worked out to less than 3 MB/s, a factor of 30 less bandwidth than available even from a single disk. Large-scale data sorting involves carefully balancing all per-server resources (CPU, memory capacity, disk capacity, disk I/O, and network I/O), all while maintaining overall system scale. We wanted to determine the limits of a scalable and efficient data processing system. Given current commodity server capacity, is it feasible to run at 30 MB/s or 300 MB/s per server? That is, could we reduce the required number of machines for sorting 100 TB of data by a factor of 10 or even 100?
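
The per-server arithmetic in that quote is easy to check for yourself. Here's a tiny back-of-the-envelope sketch of my own (not TritonSort's code; the single-disk figure is just the one implied by the quote's "factor of 30"):

    /* Back-of-the-envelope check of the per-server throughput quoted above. */
    #include <stdio.h>

    int main(void)
    {
        double data_tb = 100.0;     /* total data sorted, in terabytes       */
        double servers = 3452.0;    /* size of Yahoo's Gray sort cluster     */
        double hours   = 3.0;       /* elapsed time (actually a bit less)    */

        double data_mb = data_tb * 1000.0 * 1000.0;   /* TB -> MB            */
        double seconds = hours * 3600.0;
        double per_server = data_mb / servers / seconds;

        /* Prints roughly 2.7 MB/s per server -- about 30x less than the
           ~80 MB/s a single commodity disk of the era could stream, which
           is exactly the gap the TritonSort team set out to close.          */
        printf("per-server throughput: %.1f MB/s\n", per_server);
        return 0;
    }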


Their article goes on to describe the complexities of balancing the configuration of the four basic system resources: CPU, memory, disk, and network, and how there continues to be no simple technique that makes this complex problem tractable:

We had to revise, redesign, and fine tune both our architecture and implementation multiple times. There is no one right architecture because the right technique varies with evolving hardware capabilities and balance.


I hope that the TritonSort team will take the time to write up more of their findings and their lessons learned, as I think many people, myself included, can learn a lot from their experiences.

Saturday, November 27, 2010

Paul Randal's amazing Bald Eagle pictures from Alaska

Paul Randal, best known as one of the finest writers and teachers on SQL Server, is also a tremendous photographer, and he recently posted some pictures from his trip to Alaska.

You have to see these pictures; they are simply superb:



Not only are the pictures gorgeous, but Paul also includes some great notes about the process of learning to take pictures like these:

You don't need camoflauge clothing and lens wraps to get good wildlife shots (I think that stuff looks daft), you just need patience and an understanding of the wildlife behavior. We sat in the same small area for 6 hours a day and waited for the eagles to come to us.


Many thanks for sharing these pictures, Paul; I enjoyed them very much!

Mobile broadband for a small team

Suppose you have 2 (or 3 or 4) people who want to travel together, working on a fairly large project (sufficiently large that you want local data and computing power, not just "the cloud"), and you want that team to be able to quickly and reliably set up and operate a small local area network for team computing, anywhere in the U.S. or Canada. You'll often be in suburbs or rural locations rather than right downtown at the fanciest modern hotels, so counting on hotel broadband seems rather iffy. And it may be weeks between stops at the mothership, so you need a reliable cloud-based backup provider that can handle data volumes in the tens of gigabytes (maybe even up to 250 GB). You'll be setting up and tearing down this lab routinely, perhaps 10 times a week, so you want it to be as simple and reliable as possible.

What setup would you advise?

Here's what I've been exploring; I haven't constructed such a system, but am wondering where the holes would be. Can you poke some holes in this proposal and let me know?


  1. Two laptops, each running Windows 7, each with 500GB hard disks. Perhaps
    something in this range.


  2. Something along the lines of the Verizon MiFi for reliable broadband connectivity (at least throughout North America)


  3. Something along the lines of the Cisco Valet or the Airport Express for setting up a small internal network for file-sharing purposes.

  4. Something like Mozy for reliable cloud-based backup, though I'm a little worried that the cloud-based backup schemes can't scale to dozens or hundreds of gigabytes.

  5. To augment the cloud-based backup strategy, a couple of these stocking stuffers can be used for local standby backup purposes.



With this gear, I think you can quickly set up a small internal LAN for file-sharing support between the two laptops, and each laptop can get online at will via the MiFi.

The primary laptop, which holds the master copy of all the files, can share that folder with the secondary laptop, and I think Windows 7 file sharing is robust and reliable enough that the secondary can continue accessing those files even while the "primary" laptop is busy running large computations (or playing the occasional game of Dark Lords of Elven Magic V).

The secondary laptop is over-provisioned, but this is intentional, so that if the primary laptop fails, the secondary can take over the primary's duties (after restoring the data from a combination of the local spare backup and the Mozy data from the cloud).

Am I crazy?

Wednesday, November 24, 2010

kernel.org upgrades their master machines

I found this article about the new kernel.org "heavy lifting" machines interesting. These are the machines with which

kernel.org runs the infrastructure that the Linux Kernel community uses to develop and maintain a core piece of the operating system.


It's a good snapshot of the state-of-the-art in provisioning a pretty substantial server machine nowadays.

The unwritten parts of the recipe

As any cook knows, recipes are just a starting point, a guideline.

You have to fill in the unwritten parts yourself.

So, may I suggest the unwritten parts that go with this recipe?


  • First, between each sentence of the instructions, add:

    Drink a beer.

  • Second, at the end of the instructions, add:

    Drink two more beers while you cook the turkey. Then look at the oven and realize you forgot to turn it on. It's ok, you didn't own a meat thermometer anyway. Throw the whole mess away and call out for pizza.

Tuesday, November 23, 2010

Ken Johnson's Exposition of Thread-Local Storage on Win32

The always-worth-reading Raymond Chen happens to be talking about Thread-Local Storage on Windows this week. In his essay, he references Ken Johnson's eight-part description of how Thread-Local Storage works, under the covers, using support from compiler, linker/loader, and the operating system, a description which is so good that it's worth linking to all of Johnson's articles right now:


There is no such thing as "too much information" when it comes to topics like "how does the magic behind __declspec(thread) actually work?" Johnson's in-depth explanations do the world a tremendous favor. Read; learn; enjoy!
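
If you've never actually used implicit TLS, here is a minimal sketch (my own toy example, not code from Johnson's series) of the feature whose machinery those articles explain; each thread that touches a __declspec(thread) variable gets its own private copy, with the compiler, linker, and loader conspiring behind the scenes to make that work:

    /* Minimal implicit-TLS demo; build with the Microsoft compiler: cl tls_demo.c */
    #include <windows.h>
    #include <stdio.h>

    __declspec(thread) int tls_counter = 0;   /* each thread gets its own copy */

    static DWORD WINAPI worker(LPVOID arg)
    {
        int i;
        (void)arg;
        for (i = 0; i < 5; i++)
            tls_counter++;                    /* touches only this thread's copy */
        printf("thread %lu: tls_counter = %d\n",
               GetCurrentThreadId(), tls_counter);
        return 0;
    }

    int main(void)
    {
        HANDLE threads[2];
        int i;
        for (i = 0; i < 2; i++)
            threads[i] = CreateThread(NULL, 0, worker, NULL, 0, NULL);
        WaitForMultipleObjects(2, threads, TRUE, INFINITE);
        printf("main thread: tls_counter = %d\n", tls_counter);  /* still 0 here */
        return 0;
    }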

Monday, November 22, 2010

memcpy, memmove, and overlapping regions

The C runtime library provides two similar-but-different functions: memcpy and memmove.


The primary distinction between these two functions is that memmove handles situations where the two memory regions may overlap, while memcpy does not.

And the memcpy manual pages describe this quite clearly:

If copying takes place between objects that overlap, the behaviour is undefined.


You might not think that is a very strong statement, but in C programming, when somebody says "the behavior is undefined", that is an extremely strong thing to say; here are a few articles that try to explain what we mean by "the behavior is undefined", but the bottom line is: if you program in C, you need to sensitize yourself to the phrase "undefined behavior" and not write programs which perform an action that has undefined behavior.
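
To make the distinction concrete, here's a little sketch of my own (not taken from the manual pages): shifting a string two bytes to the right within the same buffer means the source and destination overlap, which memmove is specified to handle, and which memcpy is not:

    /* Overlapping-copy demo: memmove is defined here, memcpy would not be. */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char buf[16] = "abcdefgh";

        /* Shift the 8-character string (plus its NUL) right by two bytes.
           Source and destination overlap, so memmove is the correct call.  */
        memmove(buf + 2, buf, 9);
        printf("%s\n", buf);              /* prints "ababcdefgh"            */

        /* Writing memcpy(buf + 2, buf, 9) instead would be undefined
           behavior: it might appear to work, or silently corrupt the data,
           depending on the implementation -- which is exactly the trap
           described below.                                                 */
        return 0;
    }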

However, it so happens that, although the memcpy documentation has contained this statement forever (well, in programming terms at least!), one of the most common memcpy implementations has, until recently, been implemented in a way which made it safe, in practice, to call memcpy with overlapping memory regions.

But things have changed: see https://bugzilla.redhat.com/show_bug.cgi?id=638477. There's a lot of interesting discussion in that bug report, but let me particularly draw your attention to Comments 31, 38, and 46. Suffice it to say that the author of those comments knows a little bit about writing system software, and probably has some useful suggestions to offer :)

Happily, as several of the other comments note, the wonderful (though oddly-named) valgrind tool does a quite reasonable job of detecting invalid uses of memcpy, so if you're concerned that you might be encountering this problem, let valgrind have a spin over your code and see what it finds.

Sunday, November 21, 2010

Unidirectional Football

In American Tackle Football, each team defends its end zone, and attacks toward the other team's end zone, scoring a touchdown when it carries or passes the football across the goal line. At the end of each quarter of play, the teams swap end zones and face the other direction, to more-or-less equalize the advantages conveyed by one direction or the other.

Traditionally, that is how it is done.

Yesterday, though, the University of Illinois played Northwestern University in a Big Ten showdown, and the teams chose to play in Chicago's Wrigley Field.

Wrigley Field, of course, is a baseball field, the famous home of the Chicago Cubs; it is named for William Wrigley, the chewing gum magnate, who was part of the syndicate that brought the Cubs to Chicago and who owned the team in the 1920's.

It's not that unheard-of to hold a football game on a baseball field; for example, Notre Dame played Army yesterday in (the new) Yankee Stadium.

But it had been 40 years since a football game was played in Wrigley Field, and now we know why: the field was too small. After laying out the standard-sized football field on the grounds, there was no extra space left around the east end zone, and the playing field terminated with only 6 inches to spare before the brick wall that marks right field. After reviewing the layout:

the Big Ten said that the layout at Wrigley was too tight to ensure safe play. The conference instructed players to run offensive plays only toward the west end zone, except in the case of interceptions.


So, each time the ball changed possession, the players switched sides.

And, the teams shared a sideline, rather than being on opposite sides of the field.

There even was an interception, run back for a touchdown, by Northwestern safety Brian Peters.

Meanwhile, yesterday was also the Big Game in these parts, as Berkeley hosted Stanford in the annual classic. Stanford won easily this year: they have a phenomenal team and should finish the season in the top 5 nationally. After next week's Berkeley vs. Washington game, Berkeley's Memorial Stadium will be closed, and the long-delayed earthquake reconstruction project will begin in earnest. Memorial Stadium, which is situated in one of the most beautiful locations in the country, also happens to be right on top of the Hayward Fault, one of the most dangerous earthquake faults in California. So the stadium will be closed, and extensively overhauled to try to make it safer.

Meanwhile, the Golden Bears will play their 2011 season across the bay, in San Francisco's AT&T Park, home of the San Francisco Giants.

That's right, they'll be playing football all year in a baseball field.

I wonder if they'll play Unidirectional Football?!

Saturday, November 20, 2010

Koobface report

I spent some time reading Nart Villeneuve's fascinating report on the Koobface botnet. The report is well-written and clear, and although it's long, it doesn't take very long to read, so if you have the time, check it out. It's a detailed and broad-ranging investigation of one of the large crimeware systems infesting the Internet.

Many malware investigations look just at technical issues: vulnerabilities, exploits, defense mechanisms, etc. I love learning about that technology, but there is a lot more to malware than just the technology: social, political, and financial aspects are all part of modern organized crime on the Internet. The Koobface study is particularly worth reading because it does a good job of exploring many of these non-computer-science aspects of the malware problem. From the report's executive summary:

The contents of these archives revealed the malware, code, and database used to maintain Koobface. It also revealed information about Koobface's affiliate programs and monetization strategies. While the technical aspects of the Koobface malware have been well-documented, this report focuses on the inner workings of the Koobface botnet with an emphasis on propagation strategies, security measures, and Koobface's business model.


Wait, botnets have a business model?

Well, of course they do.

For far too long, media and popular culture have categorized malware as originating from either:

  • A lone, socially-maladjusted, brilliant-but-deranged psychopathic individual, who for reasons of mental illness constructs damaging software and looses it upon the world, or

  • A governmentally-backed military organization, which thinks of computers, networks, and information in attack-and-defense terms, and operates computer security software for military purposes.


While both these categories do exist, a major point of the Koobface report is to show that modern organized crime is at least as important a category in the spread and operation of malware on the net, and to help us understand how those crime organizations operate malware systems for profit.

The report is divided into two major sections:

  1. The Botnet

  2. The Money



The first section deals with operational issues: "propagation strategies, command and control infrastructure, and the ways in which the Koobface operators monitor their system and employ counter-measures against the security community".

The second section explains "the ways in which the Koobface operators monetize their activities and provides an analysis of Koobface's financial records".

The report ends with some social and political analysis and offers some recommendations to law enforcement and security organizations about how they can evolve to address these evolving threats.

Let me particularly draw your attention to the second section, "The Money".

It is absolutely fascinating to understand how botnets such as these profit, by providing a business model that is almost, yet not quite, legitimate, and how close it is to the core business models that are driving the Internet:

The Koobface operators maintain a server ... [ which ] ... receives intercepted search queries from victims' computers and relays this information to Koobface's PPC [pay-per-click] affiliates. The affiliates then provide advertisement links that are sent to the user. When the user attempts to click on the search results, they are sent to one of the provided advertisement links...


That's right: Koobface operates, and makes money, by doing essentially the same things that core Internet companies such as Microsoft, Google, and Yahoo do:

  • Provide search services

  • Provide advertising services

  • Match individuals searching for items with others who are offering products



The report links to a great Trend Micro blog explaining this "stolen click" technique, also known as "browser hijacking", in more detail:

Browser hijacker Trojans refer to a family of malware that redirects their victims away from the sites they want to visit. In particular, search engine results are often hijacked by this type of malware. A search on popular search engines like Google, Yahoo!, or Bing still works as usual. However, once victims click a search result or a sponsored link, they are instead directed to a foreign site so the hijacker can monetize their clicks.


The history of organized crime is long and well-researched; I have nothing particular to contribute to this, and it's not my field. However, I find it very interesting to learn about it, and I hope that you'll find it worthwhile to follow some of these references and learn more about it, too.

Now it's time to "pop the stack" and get back to studying the changed I/O dispatching prioritization in Windows Server 2008 as compared to Windows Server 2003. Ah, yes, computer science, yummm, something I understand... :)

Wednesday, November 17, 2010

I feel the need ... for speed!

Here's a very nice writeup of a recent speed cubing event. I love the Feliks Zemdegs video, can't take my eyes off it!

I am not a very fast cuber. I can solve the standard 3x3x3 cube, but it usually takes me 2+ minutes, and more if I get distracted by my granddaughter :)

When I was (much) younger, John and I got into a spirited competition of speed Minesweeper, expert level. We set up a computer in the break area and would take turns during compile times, trying to break each other's best time record. Of course, I see that the world has progressed since then...:)

Nowadays my compile-and-test turnaround cycle at my day job is so lightning-fast, I barely have time to bounce over to my favorite Chess or Go sites before it's time for the next bit of work. That's progress!

The Java/JCP/Apache/TCK swirl continues

There's lots of activity as people try to figure out what is going on with Java, where Oracle is taking it, what is the future of the Java Community Process, etc. Here's a quick roundup of some recent chatter:

  • At eWeek, Darryl Taft reports on recent JCP election news, and includes some commentary from Forrester analysts John Rymer and Jeffrey Hammond.

    Hammond also told eWEEK:

    “Right now Oracle holds all the cards with respect to Java, and if they choose to close the platform then I don’t think there’s much anyone can do about it. Some customer might actually be more comfortable with that in the short term if it leads to renewed innovation. In the long term I think it would be counterproductive, and hasten the development of Java alternatives in the OSS community – and I think Apache would be happy to have a role in that if things continue along their current path.”

    ...

    Forrester’s Rymer also points to the business side of things when he says:

    “One thing that puzzles me is IBM’s role in this dispute. IBM has been a big backer of ASF and its Java projects, including Harmony. We think IBM turned away from Harmony in ‘renewing’ its partnership on Java with Oracle. ASF’s ultimatum to Oracle must be related to IBM’s move, we just don’t know exactly how. I expected that IBM would continue its strong support of ASF as a counterweight to Oracle."

  • On the GigaOm website, Canonical's Matt Asay posts a long in-depth analysis, including a call for Oracle to communicate its intentions widely:

    Oracle needs to head off this criticism with open, candid involvement in the Java community. It needs to communicate its plans for Java, and then listen for feedback. Oracle needs to rally the troops around an open Java flag, rather than sitting passively as Apple and others dismiss Java, which is far too easy to do when Java comes to mean “Oracle’s property” rather than community property.

  • On the Eclipse.org site, Mike Milinkovich of the Eclipse Foundation posts a hopeful view from the Eclipse perspective during the run-up to the JCP election:

    The Eclipse Foundation is committed to the success of both Java and the JCP, and we are optimistic that the JCP will remain a highly effective specification organization for the Java community and ecosystem.

    The Eclipse Foundation was one of the organizations that was re-elected to the JCP executive committee.

  • Eduardo Pelegri-Llopart, a long-time Java EE voice from Sun, posts a nice article talking about the complexities of communicating Oracle's JVM strategy, as an insider at Oracle trying to help that process occur. It's nice to see him continuing to explain Oracle's decision-making processes as they occur.

  • And Stephen Colebourne has an excellent 3-part series of articles analyzing:



So, the swirl continues. There's lots to read. The Java community continues to evolve, and software continues to get written. If you have pointers to more information about what's going on and what it all means, send them my way!

My cursor disappears when using GMail in Safari

I'm trying the workaround described by Adam Engst at the TidBITS web site; hopefully that will do it for me.

Tuesday, November 16, 2010

Google 1, Harvard 0

Don't miss this fascinating article by Matt Welsh relating his decision to leave Harvard in order to join Google.

It's well worth your time to read through the comments as well, as there are lots of interesting follow-ups and related discussions.

Update: Dean Michael Mitzenmacher wrote a follow-up essay of his own, which is also well worth reading.

Monday, November 15, 2010

CUDA in the cloud

Amazon have announced that EC2 now supports GPU clusters using CUDA programming. That might just be a bunch of gobbledygook, so let's expand a little bit:


  • EC2 is Amazon's Elastic Compute Cloud, one of the leaders of cloud computing services.

  • GPUs are Graphics Processing Units, the specialized computers that sit on the 3D video card in your computer. Your computer arranges to have video processing done by the GPU, while regular computing is performed by your machine's CPU, the Central Processing Unit. Here's an example of a GPU, the NVidia Tesla M2050.

  • CUDA is a specialized programming platform and set of C-like language extensions designed for offloading certain compute tasks from your CPU to your GPU. It originated with NVidia but is now used by a number of other GPU tools and libraries as well. Here's the starting point for learning more about CUDA.



So Amazon are announcing that their cloud infrastructure has now provisioned a substantial number of machines with high-end GPU hardware, and have enhanced their cloud software to make that hardware available to virtual machine instances on demand, using the CUDA APIs for programming access, and are ready for customers to start renting such equipment for appropriate programming tasks.
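
As a concrete (and hedged) illustration: I haven't rented one of these instances myself, but a plain C program along these lines, built against the CUDA runtime library, is roughly the first thing you'd run on such a machine to confirm that the virtualized Tesla hardware is actually visible to your code:

    /* GPU device query using the CUDA runtime API.
       Assumes the CUDA toolkit is installed; build with: nvcc devquery.c -o devquery */
    #include <stdio.h>
    #include <cuda_runtime.h>

    int main(void)
    {
        int count = 0;
        int i;

        if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
            fprintf(stderr, "no CUDA-capable devices visible\n");
            return 1;
        }
        for (i = 0; i < count; i++) {
            struct cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, i);
            printf("GPU %d: %s, %d multiprocessors, %.0f MB of device memory\n",
                   i, prop.name, prop.multiProcessorCount,
                   prop.totalGlobalMem / (1024.0 * 1024.0));
        }
        return 0;
    }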

And now you know enough to understand the first sentence of this post, and to appreciate Werner Vogels's observation that "An 8 TeraFLOPS HPC cluster of GPU-enabled nodes will now only cost you about $17 per hour." Wow! Let's see, an hour has 3600 seconds, so that's roughly 29 PetaFLOP of computation per hour; at $17 per hour, we're somewhere around 1.7 PetaFLOP per dollar. Is that right?

Sunday, November 14, 2010

Nice Netflix paper on High-Availability Storage

Sid Anand, a Netflix engineer who writes an interesting blog, recently published a short, very readable paper entitled Netflix's Transition to High-Availability Storage Systems. If you've been wondering about cloud computing, and about who uses it, and why, and how they build effective systems, you'll find this paper quite helpful.

The paper packs a lot of real-world wisdom into a very short format. I particularly liked this summary of what Anand learned about building highly-available systems while at eBay:


  • Tables were denormalized to a large extent

  • Data sets were stored redundantly, but in different index structures

  • DB transactions were forbidden with few exceptions

  • Data was sharded among physical instances

  • Joins, Group Bys, and Sorts were done in the application layer

  • Triggers and PL/SQL were essentially forbidden

  • Log-shipping-based data replication was done




It's an excellent list. At my day job, we spend a lot of time thinking about how to build highly-available, highly-reliable systems which service many thousands of users concurrently, and you'd recognize most of the above principles in the internals of our server, even though the details differ.
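
As one tiny illustration of the flavor of these principles (a toy sketch of my own, not anything from Anand's paper or from our server): doing joins, group-bys, and sorts in the application layer starts with the application itself knowing how to route each key to the right physical shard, which can be as simple as a stable hash over the key:

    /* Toy application-layer shard routing: hash the key, pick the host.
       The host names here are made up for illustration.                    */
    #include <stdio.h>

    #define NUM_SHARDS 4

    static const char *shard_hosts[NUM_SHARDS] = {
        "db-shard-0.example.com", "db-shard-1.example.com",
        "db-shard-2.example.com", "db-shard-3.example.com",
    };

    /* 32-bit FNV-1a: cheap, stable, and good enough to spread keys around. */
    static unsigned int fnv1a(const char *key)
    {
        unsigned int h = 2166136261u;
        while (*key) {
            h ^= (unsigned char)*key++;
            h *= 16777619u;
        }
        return h;
    }

    static const char *shard_for(const char *customer_id)
    {
        return shard_hosts[fnv1a(customer_id) % NUM_SHARDS];
    }

    int main(void)
    {
        printf("customer 12345 lives on %s\n", shard_for("12345"));
        printf("customer 98765 lives on %s\n", shard_for("98765"));
        return 0;
    }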

To avoid your data being your bottleneck, you have to build infrastructure which breaks that bottleneck:

  • replicate the data

  • carefully consider how you structure and access that data

  • shift work away from the most-constrained resources


It sounds very simple, but it's oh-so-hard to do it properly. That's what still makes systems like eBay, Netflix, Amazon, Google, Facebook, Twitter, etc. the exception rather than the rule.

It should be no surprise that so many different engineers arrive at broadly similar solutions; using proven basic principles and tried-and-true techniques is the essence of engineering. Although my company is much smaller than the Internet giants, we are concerned with the same problems and we, too, are sweating the details to ensure that we've built an architecture that scales. It's exciting work, and it's fun to see it bearing fruit as our customers roll out massive internal systems successfully.

I enjoyed Anand's short paper, and I've enjoyed reading his blog; I hope he continues to publish more work of this caliber.

Now, back to solving the 7-by-7 KenKen while the bacon cooks... :)

Saturday, November 13, 2010

Wang Hao apologizes for winning and moving into a tie for the lead

At the fascinating and action-packed Tal Memorial chess tournament (the official website is in Russian, natch) rising Chinese chess star Wang Hao won his game with Boris Gelfand when Gelfand unexpectedly resigned in a position that was not clearly lost.


In fact Wang Hao felt a bit strange about it. “I was very lucky,” he repeated a few times in the press room, and even started to apologize for playing on in the ending so long. Unnecessary apologies of course, if only because a draw offer isn’t allowed at this tournament anyway.


Read all about it in a great Round 7 report at ChessVibes.com, or check the game itself out at Alexandra Kosteniuk's ChessBlog.com.

Apple contributes their Java technology to OpenJDK

According to this press release, Apple will be contributing their Java technology to OpenJDK.

The Register's take is that

In effect, Oracle has recognised it needs Apple - and by extension, its Mac Apps Store - to be a friend... hence today's happy-clappy OpenJDK love-in.


MacWorld's Dan Moren observes that

Apple seems to be hoping that it's progressing to a point where Flash and Java aren't critical technologies for most Mac users--and that users who do need those technologies will be more than capable of downloading and installing them themselves


Henrik Stahl has a short blog post, encouraging interested parties to join the OpenJDK community, or to apply for a job at Oracle, and noting:

This announcement is the result of a discussion between Oracle and Apple that has been going on for some time. I understand that the uncertainty since Apple's widely circulated "deprecation" of Java has been frustrating, but due to the nature of these things we have neither wanted to or been able to communicate before. That is as it is, I'm afraid.


It's not clear that this sort of behavior from the Oracle-IBM-Apple triumvirate is going to have people chanting "who put the 'open' in 'OpenJDK'?"...

Still, it's better news for the future of Java-on-the-Mac than might have been, so fans of Java are pleased, I'm sure.

Thursday, November 11, 2010

As the Pac-10 becomes the Pac-12, what happens to the B-Ball schedules?

Currently, the Pac-10 basketball schedule contains 18 conference games: each team plays each other team twice, home-and-away.

The Pac-10 is adding two new teams: Colorado and Utah, to become the Pac-12. So how does this affect the schedule? If each team were to play each other team home-and-away each year, that would be a 22 game conference schedule!

Well, because I knew you were all fascinated by this little detail, I went and found out the answer:

The Pac-10 Conference announced today the specific details surrounding the 2011-12 Pac-12 men’s basketball schedule, as well as the 10-year rotation model for scheduling.

The conference schedule will continue to be comprised of 18 games for each institution and will maintain the travel partner in a non-divisional format. Each year, the schedule will include games against an institution’s traditional rival both home and away, which means that Cal and Stanford will continue their annual home-and-home series. In addition, each school will play six other opponents both home and away (for two consecutive years), and four opponents on a single-game basis-two at home and two away. Those single-play opponents will rotate every two years.


For the full details, you can already find next year's conference schedule here!

Meanwhile, to balance all that press-release content with some content of my own, here's my brief trip report from last night's exhibition game between California and Sonoma State:


106 points! Woo-hoo!

The good news:
- Harper Kamp is back! He seems to have lost about 25 pounds, but he still seems big enough to play inside, and he looked energetic. And he still always seems to be in the right place at the right time. I'm quite happy that he was able to recover from that injury, and he looked just overjoyed to be back out playing again.

- Jorge will be a great point guard. He is confident and doesn't get rattled, and he passes well. And he can get to the hoop when he needs to.

- Allen Crabbe has potential to be very exciting. And Gary Franklin and Richard Solomon look good, too. Crabbe and Solomon were high-school teammates.

The scary news is that the team is incredibly young and raw. They looked like deer trapped in the headlights far too often. They are going to get absolutely SHELLACKED a few times before they get some experience. With only 3 experienced players, and with Sanders-Frison still suffering from that foul trouble issue, it's going to be up to Gutierrez and Kamp to hold the team together for the first few months.

If Gutierrez can stay healthy enough to play 35 minutes a game, and if Montgomery can keep the team's morale up through a few early season whompings (Kansas 114, Cal 74 ??), it will be fun to watch coach put the new team together.

Tuesday, November 9, 2010

Lots of Oracle action at the Federal Building in downtown Oakland

All the legal action in the software industry this week is concentrated at the Federal Building in downtown Oakland, where the Oracle vs SAP lawsuit over TomorrowNow is taking place. Yesterday, Larry Ellison was "the star witness at the start of the second week of trial in Oracle's copyright infringement suit against the German software giant."

It's kind of hard to get a handle on this case. SAP attorneys say the award should be somewhere around $40 million, while Oracle attorneys contend the amount should be $2 billion, or perhaps even $4 billion. The judge, meanwhile, has already started reducing the amount that Oracle can claim.

I'm also a bit confused about which software, precisely, SAP was illegally accessing. I don't think it was the core Oracle DBMS software; maybe it was the Oracle applications suite, which contains many different packages, including software that Oracle originally wrote themselves as well as products they bought as part of buying Siebel, PeopleSoft, Hyperion, etc. This seems to be the case, according to this description of Safra Catz's testimony about how SAP was using the TomorrowNow program to try to lure customers away from Oracle:

Catz testified that she believed that the efforts to assure customers of Oracle’s level of support would keep them from fleeing Oracle when it came time to renew licenses. As for the arrival of SAP and TomorrowNow, she said, “I’d hoped and believed it would not be material.” Of the estimated 14,000 customers Oracle obtained from its acquisitions of PeopleSoft and Siebel, only 358 went to TomorrowNow, SAP’s lawyers argued.


And then there's the side theater over the subpoena of Leo Apotheker, the former SAP boss who is now the top man at HP. Is he in California? Is he in Germany? Is he in Japan? Were private eyes really hired? Has he been found?

Meanwhile, Chris O'Brien, at the San Jose Mercury News, says that this whole trial isn't even really about SAP, which is a fading figure in Oracle's rear-view mirror, but rather is about those companies that Oracle has in their headlights: HP, Google, Apple, IBM, etc.:

HP should have seen this coming from the day Redwood City-based Oracle bought Sun Microsystems and put itself in direct competition with the Palo Alto tech giant. And it should have expected nothing less from Ellison, Silicon Valley's most cunning corporate fighter, one who draws his energy and focus by creating a clearly defined enemy.


It's all riveting for those of us who watch the industry. But it isn't just entertainment: these are real adversaries, prosecuting real lawsuits for real money, and the implications are likely to fundamentally re-shape the enterprise software industry.

Monday, November 8, 2010

NYT article on Microsoft's anti-piracy team

This weekend's New York Times brought a long, detailed, and fascinating article entitled: Chasing Pirates: Inside Microsoft's War Room.

The article begins by describing a raid on a software piracy operation in Mexico, and what was discovered:

The police ... found rooms crammed with about 50 machines used to copy CDs and make counterfeit versions of software ...

The raid added to a body of evidence confirming La Familia's expansion into counterfeit software as a low-risk, high-profit complement to drugs, bribery and kidnapping.


The article describes Microsoft's extensive world-wide anti-piracy efforts:

Microsoft has demonstrated a rare ability to elicit the cooperation of law enforcement officials to go after software counterfeiters and to secure convictions -- not only in India and Mexico, but also in China, Brazil, Colombia, Belize and Russia. Countries like Malaysia, Chile and Peru have set up intellectual-property protection squads that rely on Microsoft's training and expertise to deal with software cases.


At times the article reads like a spy thriller, talking about "undercover operatives" who are training in "hand-to-hand combat", but mostly the article spends its time in the back office, describing the underlying techniques of intelligence-gathering operations and anti-piracy coding and manufacturing techniques:

Through an artificial intelligence system, Microsoft scans the Web for suspicious, popular links and then sends takedown requests to Web service providers, providing evidence of questionable activity.


"We're removing 800,000 links a month", say the Microsoft anti-piracy team. That's a lot of links! Unfortunately, the article doesn't really describe how this process works -- surely it's not feasible to individually examine 800,000 links each month in a manual fashion, but if not, then how do you know that the links are indeed illegal and deserving of such immediate action?

Later in the article, the author is perhaps being fanciful and florid, or else is describing a lot of technology that I wasn't aware yet existed:

Mr Finn talks at length about Microsoft's need to refine the industry's equivalent of fingerprinting, DNA testing and ballistics through CD and download forensics that can prove a software fake came from a particular factory or person.

Is this just metaphor? Or do "CD and download forensics" exist, providing such a capability? I could imagine that various network logging occurs along the major network paths, such as at ISP access points, at sub-net border crossings, etc. And I could imagine that various digital signature techniques, often referred to by names such as "Digital Watermarking", could identify each binary bundle uniquely. Still, it's a long way from technology like this to proof that "a software fake came from a particular factory or person."

Later in the article, a few more details are provided:

A prized object in the factory is the stamper, the master copy of a software product that takes great precision to produce. From a single stamper, Arvato can make tens of thousands of copies on large, rapid-fire presses.

Crucially for Mr. Keating, each press leaves distinct identifying markers on the disks. He spends much of his time running CDs through a glowing, briefcase-size machine -- and needs about six minutes to scan a disk and find patterns. Then he compares those markings against a database he has built of CD pressing machines worldwide.

This sounds much less like a software technique, such as Digital Watermarking, and much more like a hardware technique involving the analysis of physical properties of the CD or DVD. Indeed, the article's earlier description of "ballistics" and "forensics" seems like a valid metaphor, similar to how we hear that firearms experts can match a bullet fragment to the gun from which it was fired.

It sounds like an arms race between the software publishers and the pirates:

To make life harder for the counterfeiters, Microsoft plants messages in the security thread that goes into authenticity stickers, plays tricks with lettering on its boxes and embosses a holographic film into a layer of lacquer on the CDs.


As I said, the article is long, detailed, and contains many interesting ideas to follow up on. Besides the discussions of technology and its uses, the article talks about public policy issues, varying intellectual property attitudes, training and outreach, public relations impacts, and more.

I found the article worth the time; if you know of more resources in this area to learn from, drop me a note and let me know!

Sunday, November 7, 2010

Visitors from Lompoc!

 


My brother and his family came up from Lompoc for a rainy San Francisco weekend. Luckily, the rain held off until Sunday and we enjoyed a beautiful Saturday playing with the kids.

That's me on the bottom right holding my nephew Everett, my dad behind me. My brother is bottom left, with my niece Amelia on his shoulders. My son is bottom center, with his niece on his shoulders.

It was a great weekend, can't wait for another!

Thursday, November 4, 2010

FTC appoints Felten

I have never met Professor Felten, but I am a regular and devoted reader of his writings, and read many of his students' writings as well.

I think the FTC has made an excellent choice to help them understand technology issues.

I only hope that the professor continues to be able to be as open and free about his findings as he has been in the past.

Wednesday, November 3, 2010

Wired looks at the technology behind Kinect

On the eve of tomorrow's release of the Xbox Kinect, Wired has a nice illustrated writeup of the basic ideas behind the technology, with plenty of links to further reading.

...

(Update:) Then, the next morning, Wired follows up with a review titled "Flawed Kinect Offers Tantalizing Glimpse of Gaming's Future":

For hard-core gamers, Kinect is a box full of potential, offering tantalizing glimpses at how full-body control could be used for game designs that simply wouldn’t work any other way. But at launch, the available games get tripped up by Kinect’s limitations more than they are liberated by the control system’s abilities.

Microsoft, HTML5, and Silverlight

Last week was the 2010 edition of Microsoft's Professional Developers Conference, an always-important computer industry event which is held somewhat irregularly -- I think Microsoft's official position is that they hold it when they have something important to tell their developer partners, and when there isn't anything to say, they don't hold the event. And sometimes the conference is a full week, other times it is 3-4 days, and other times it is just 2 days long. Another interesting aspect of this year's PDC was that Microsoft held it at home, in the company's own internal conference center, rather than renting out a large commercial center in Los Angeles like they had been doing for a number of years, and encouraged developers to tune in "virtually" (by viewing real-time video coverage) if they couldn't or didn't wish to attend in person.

Anyway, they held the 2010 event last week, and while I wasn't there, I've been reading about it on the net.

The biggest murmur of excitement appeared to be related to the ongoing elevation of HTML5 and IE9 as the company's long-term web application platform of choice. Mary Jo Foley's column about the Microsoft strategy seems to have been frequently mis-read; people thought her column said things that it didn't actually say. Peter-Paul Koch has a nice writeup of his take on the matter, with pointers to several Microsoft follow-up articles from Bob Muglia and Steve Ballmer. As Koch says:

What happened is not an abandonment of Silverlight; far from it. Microsoft has big plans with it — and who knows, they might even work. What happened is that Microsoft placed HTML5 on an equal footing with Silverlight.

Oh, and this is not about desktop. Desktop is almost an afterthought. It’s about mobile.


There is a lot of activity in the mobile web application space nowadays, with Apple, Google, and Microsoft all making major pushes this year, and Oracle's Java team at least trying to stay involved as well. It's a lot for the poor developer to keep track of, especially for a guy like me, who is basically a server guy at heart, but who tries to stay up-to-date on other important technologies as much as I can.

If you haven't yet had a chance to learn about HTML 5, well, shame on you! It's long past time that you should learn about this; it's the most important thing going on in the computer world right now. Here's a great place to get started: http://slides.html5rocks.com/ Or move straight on to Mark Pilgrim's thorough and clear documentation at http://diveintohtml5.org/

OK, that's enough of that; back to working on that server resource management bug that's been eating at me for a week...

Has anyone seen the black Bishop or the white Knight?

In Puke Ariki, New Zealand, a status report on the last two missing pieces of the town's outdoor chess set.

Tuesday, November 2, 2010

Google expands its Bug Bounty program

The always interesting Brian Krebs is reporting today that Google is expanding their Bug Bounty program.

As Krebs observes, Google isn't the only organization with a Bug Bounty program, here's Mozilla's Bug Bounty page. According to this article from MaximumPC, Microsoft still isn't onboard with the idea, though.

I think that Bug Bounty programs are interesting: they're a good way to show people you care about quality, and obviously Google feel that it is an effective way both to get useful feedback and to reward people for helping Google improve their software.

Of course, this tradition has a long history: I'm reminded of the famous Knuth reward check, which I've written about before.

What other innovative ways are there by which companies are working with their customers to improve software quality? Drop me a line and let me know!