Saturday, April 25, 2009

Derby XPLAIN is in the trunk

Today I committed the first set of changes for DERBY-2487, the new XPLAIN feature.

This is a new Derby feature. Here's how I described it in the commit comment:

This feature introduces an alternate handling of runtime statistics information. Derby can now be configured so that it will capture the statistics in a machine-readable form and will store them into a set of simply-structured tables in a schema which is specified by the user. We call this behavior "XPLAIN style", and we call the tables which are used the "XPLAIN tables".

Having captured statistics about statement execution, you can then analyze the statement behavior by querying these tables. For example, you can determine how much time was taken, what resources were used, what query plan was chosen, and so on.
The evolution of this feature is a bit interesting. The primary work on it was done by Felix Beyer, a student at the Technical University of Dresden, in Germany, in 2006-2007. You can find a good summary of the initial discussion in the mailing list archive.

Felix contributed the work to the Derby community in the spring of 2007, but then it sort of fell through the cracks. Nobody picked it up right away, although some people looked at it a bit (including me). To tell the truth, it was a bit intimidating, as it was a large patch.

I came across the work again this winter, when I was thinking about possible projects for the Google Summer of Code. I decided to have a look at the work, and see what could be done with it. I thought that the quality of the code was quite reasonable, and there was clearly a lot of value in the functionality. There were two problems that had to be solved:
  1. We had to ensure that we had the legal right to incorporate this work into Derby. Although it had been posted to Derby's bug-tracking system, with the appropriate "grant" flags checked, the Derby community was a bit concerned because of the size of the work, and wanted Felix to sign the Apache Individual Contributor License Agreement, which is the standard Apache licensing document.
  2. The patch had been sitting "on the shelf" for more than 2 years, and unfortunately it no longer cleanly applied to the trunk. Derby development had moved on in the interim.
We were able to resolve both of these issues, and once I restored the patch to working order, I discovered that it functioned quite well. I spent some time writing some draft documentation, and some regression tests, and got the patch ready for the community to evaluate.

There were several significant issues raised, but the most important issue was that the patch functioned by storing query statistics information into a fixed set of system catalogs, which caused us concern for several reasons:
  • It meant that all Derby databases would incur the overhead of the new system catalogs, even if they never used this feature.
  • It meant that it was hard to isolate different uses of this feature in the same database. Although the system catalogs included information about the user, session, and transaction which captured the data, it was still complex to interpret the data when multiple sessions were captured.
  • It meant that it was hard to manage the data after it was captured. For example, it was hard to delete some, but not all, of the captured statistics.
After some discussion in the community, we decided to adjust the patch slightly so that it captured the statistics into simple user tables in a user-specified schema rather than using true system catalogs. This ended up simplifying the patch considerably, since system catalogs need a lot of special code, and seems to be working well.

The other major concern that was raised involved the details of the visitor pattern used for traversing the statistics tree and building the data to be stored in the tables. The implementation was visiting each node in a clean fashion, but the construction of the data was being done with a single complicated subroutine which used "if (instanceof)" tests to handle each node in a separate fashion. Rick Hillegas suggested a refactoring in which each node was responsible for capturing its own data to be stored in the xplain tables. This, too, ended up simplifying the patch considerably and appears to be working well, although I'm still concerned that there is too much coupling between the XPLAIN implementation and the ResultSetStatistics subclasses.

Overall, the commit seems to have gone well. The Derby community is amazing! Within a few hours of the commit, Knut noticed that I had overlooked a test which is only run in the JDK 6/JDBC 4.0 environment, and determined that a small change was needed to that test to allow for the presence of the new system procedures GET_XPLAIN_SCHEMA and GET_XPLAIN_MODE. Overnight while I was sleeping, he had already patched the test so that it passes.

2 comments:

  1. What did Felix Beyer think of the changes?

    ReplyDelete
  2. I haven't heard much from him except for the legal paperwork. I think that, for the most part, he was glad that others found his work interesting, but in the intervening years he's moved on to other work.

    ReplyDelete