Wednesday, October 21, 2009

Data visualization and charting with YUI

(I fear that title is going to promise more than I can deliver.)

I've been working, with several colleagues, on a suite of multi-machine performance tests. These tests exercise a large and complex distributed system; the workload is carried out by routing work requests among the various servers running on those machines.

At the very beginning, the goal was just to get the suite to run. At all.

Then, we moved on to measuring the overall elapsed time of the complete benchmark.

The next step was to try to break down the behavior of the individual machines during the benchmark, so we could understand which machine was the bottleneck at various points during the test.

At this point, I started working with the Windows TypePerf tool, which is a marvelous low-level tool for gathering performance information. I enhanced our harness so that each machine has a TypePerf instance running in the background, gathering performance data and saving it to a file.

This means that each machine gathers low-level data that looks like this:


"10/20/2009 08:01:44.272","0.000000","1114927104.000000","6688.000000","0.009847","313399.348675","54.226956"
"10/20/2009 08:01:45.272","0.000000","1120014336.000000","6674.000000","0.125010","123076.715079","40.229407"
"10/20/2009 08:01:46.272","0.000000","1137823744.000000","6649.000000","0.069106","194001.835504","71.091345"


This is powerful data, but very low-level. Each line records the activity level on that machine at that time, in the areas of disk activity, memory availability, CPU usage, network I/O, etc. For the first pass, we were interested in the patterns of CPU usage on the various machines during the time of the test.
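To make the format concrete, here is a minimal JavaScript sketch of pulling one of these lines apart. It is not the original analysis code: the function name `parseTypePerfLine` and the shape of the object it returns are made up for illustration, and it assumes every field is double-quoted with no embedded quotes or commas, as in the sample lines above. It deliberately does not guess which column holds which counter, since that depends on the counters TypePerf was asked to collect.

```javascript
// Parse one TypePerf CSV data line into a timestamp and numeric counter values.
// Hypothetical helper; assumes fields look like "a","b","c" with no embedded
// quotes or commas.
function parseTypePerfLine(line) {
  // Strip the outermost quotes, then split on the "," separators.
  var fields = line.trim().replace(/^"|"$/g, "").split('","');

  // The first field is expected to look like 10/20/2009 08:01:44.272.
  var m = /^(\d+)\/(\d+)\/(\d+) (\d+):(\d+):(\d+)\.(\d+)$/.exec(fields[0]);
  if (!m) {
    return null; // header line or malformed record
  }

  return {
    timestamp: fields[0],
    // Key for the minute this sample falls in, e.g. "10/20/2009 08:01".
    minute: m[1] + "/" + m[2] + "/" + m[3] + " " + m[4] + ":" + m[5],
    // Remaining fields are the counter values, in the order they were collected.
    values: fields.slice(1).map(Number)
  };
}
```

Lines whose first field is not a timestamp, such as a header row naming the counters, come back as `null` and can simply be skipped by the caller.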

So the next thing I did was to write analysis software which took a collection of these sample files, one per machine in the test, and correlated and aggregated the data into a single HTML table, so that the table has:

  • a row for each minute

  • a column for each machine in the test

  • and a table cell for each row/column pair holding the average CPU usage on that machine during that minute
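The aggregation step can be sketched roughly as follows. This is an illustration under stated assumptions, not the original analysis software: `averageCpuPerMinute` is a hypothetical name, and the input is assumed to be already-parsed samples of the form `{ minute, cpu }`, grouped by machine name. The "Overall" column is left out because the post does not say how it was derived.

```javascript
// samplesByMachine: { Node1: [{ minute: "10/20/2009 05:26", cpu: 12.5 }, ...], ... }
// Returns one row per minute, with the average CPU value for each machine.
function averageCpuPerMinute(samplesByMachine) {
  var machines = Object.keys(samplesByMachine);
  var sums = {}; // minute -> machine -> { total, count }

  machines.forEach(function (machine) {
    samplesByMachine[machine].forEach(function (sample) {
      var perMinute = sums[sample.minute] || (sums[sample.minute] = {});
      var acc = perMinute[machine] || (perMinute[machine] = { total: 0, count: 0 });
      acc.total += sample.cpu;
      acc.count += 1;
    });
  });

  // A plain string sort orders "MM/DD/YYYY HH:MM" keys correctly only when
  // the whole run falls within a single year; good enough for a sketch.
  return Object.keys(sums).sort().map(function (minute) {
    var row = { minute: minute };
    machines.forEach(function (machine) {
      var acc = sums[minute][machine];
      row[machine] = acc ? acc.total / acc.count : null; // null: no samples
    });
    return row;
  });
}
```

Each returned row corresponds to one table row, and each machine key to one column.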



The HTML table then looked something like this:

Date              Node1  Node2  Node3  Node4  Node5  Node6  Node7  Node8  Overall
10/20/2009 05:26    0.0    0.0    0.0    5.5    0.0   30.8    0.0   36.8     23.7
10/20/2009 05:27    0.0    0.0    0.0    1.4    0.0    0.9    0.0    0.3      0.9
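A table of that shape could be emitted with something like the sketch below. The function name `renderCpuTable`, the row format (`{ minute, Node1, Node2, ... }`), and the `tableId` parameter are all assumptions for the sake of the example; no HTML escaping is done, so it assumes machine names are plain identifiers.

```javascript
// rows: [{ minute: "10/20/2009 05:26", Node1: 0.0, Node2: 5.5 }, ...]
// machines: column order, e.g. ["Node1", "Node2"]
// Returns the table markup as a string.
function renderCpuTable(rows, machines, tableId) {
  var head = "<tr><th>Date</th>" + machines.map(function (m) {
    return "<th>" + m + "</th>";
  }).join("") + "</tr>";

  var body = rows.map(function (row) {
    return "<tr><td>" + row.minute + "</td>" + machines.map(function (m) {
      var v = row[m];
      // One decimal place, as in the table above; blank when there is no data.
      return "<td>" + (typeof v === "number" ? v.toFixed(1) : "") + "</td>";
    }).join("") + "</tr>";
  }).join("\n");

  return '<table id="' + tableId + '">\n' +
    "<thead>" + head + "</thead>\n" +
    "<tbody>\n" + body + "\n</tbody>\n" +
    "</table>";
}
```

Putting the header row in a `thead` keeps the `tbody` down to data rows only, which is convenient for anything that later reads the table back.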


This is much better, but still fairly low level.

So at this point, I turned to the marvelous YUI charting tool.

The YUI charting tool, in its standard invocation, knows how to take a simple HTML table and display it as a YUI chart. Pretty much all you have to do is:

  • Define an HTML Table DataSource which points at your HTML table

  • Define a YUI Line Chart widget, and feed it your HTML DataSource
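Those two steps might look roughly like this. It is a sketch against the YUI 2 DataSource and Charts APIs as documented, not the code from the actual page: the function name `buildCpuLineChart`, the element ids, and the `machines` list are hypothetical, and the `YAHOO` namespace is passed in as a parameter purely to keep the example self-contained.

```javascript
// Assumes the YUI 2 DataSource and Charts components are loaded, and that the
// page contains a <table> with id tableId and a container element with id chartId.
function buildCpuLineChart(YAHOO, tableId, chartId, machines) {
  // Step 1: an HTML-table DataSource pointing at the existing table element.
  var dataSource = new YAHOO.util.DataSource(YAHOO.util.Dom.get(tableId));
  dataSource.responseType = YAHOO.util.DataSource.TYPE_HTMLTABLE;
  dataSource.responseSchema = {
    // One field per table column, in column order; CPU columns parsed as numbers.
    fields: [{ key: "date" }].concat(machines.map(function (m) {
      return { key: m, parser: "number" };
    }))
  };

  // Step 2: a line chart fed by that DataSource, one series per machine.
  return new YAHOO.widget.LineChart(chartId, dataSource, {
    xField: "date",
    series: machines.map(function (m) {
      return { displayName: m, yField: m };
    })
  });
}
```

On a real page this would be called as, say, `buildCpuLineChart(YAHOO, "cpuTable", "cpuChart", ["Node1", "Node2"])` once the DOM is ready. YUI 2 Charts is Flash-based, so `YAHOO.widget.Chart.SWFURL` also has to point at the `charts.swf` file before the chart is created.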



It's barely a dozen lines of JavaScript; YUI does all the heavy lifting. Chris Heilmann's blog has a great overview of the process, with clear examples to show you the results.

And lo! We have an elegant line chart which brings the patterns in the data instantly to the forefront.

It's very enjoyable to get so much value from such a small amount of code. Way to go, YUI!
