Monday, October 5, 2009

On the non-scalability of the Ant fileset task

There's an old grade school joke: "Will everybody who's not here please raise your hands!"

We've been fighting an annoying problem in our build scripts: build jobs have been failing with a terse error:


BUILD FAILED
java.lang.OutOfMemoryError


Well, that didn't give us much information, so we tried running with 'ant -debug', and we got a little bit more:


BUILD FAILED
java.lang.OutOfMemoryError
    at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1225)
    at org.apache.tools.ant.Project.executeTarget(Project.java:1185)
    at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:40)
    at org.apache.tools.ant.Project.executeTargets(Project.java:1068)
    at org.apache.tools.ant.Main.runBuild(Main.java:668)
    at org.apache.tools.ant.Main.startAnt(Main.java:187)
    at org.apache.tools.ant.launch.Launcher.run(Launcher.java:246)
    at org.apache.tools.ant.launch.Launcher.main(Launcher.java:67)
Caused by: java.lang.OutOfMemoryError
--- Nested Exception ---
java.lang.OutOfMemoryError


This wasn't a lot better, unfortunately, as the code in Project.java is afflicted with one of the great sins of Java coding: wrapping the inner exception and losing the good stuff:

} catch (Throwable ex) {
    if (!(keepGoingMode)) {
        throw new BuildException(ex);
    }
    thrownException = ex;
}

All of the good information is in the inner Throwable, but Ant only reports the information from the BuildException.
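One way to get at the buried information is to walk the cause chain down to the innermost Throwable and report that one. The following is a minimal sketch, not Ant's own code: the class name RootCause is made up for illustration, RuntimeException stands in for BuildException, and it assumes the wrapper exposes the original exception through getCause().

```java
public class RootCause {

    /** Follows getCause() links until reaching the innermost Throwable. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        // Stop when there is no further cause, or when a cause points back at itself.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // RuntimeException is a stand-in here for Ant's BuildException (an assumption).
        Throwable wrapped = new RuntimeException(new OutOfMemoryError("Java heap space"));
        Throwable root = rootCause(wrapped);

        // Report the innermost exception rather than the wrapper.
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
        root.printStackTrace();
    }
}
```

Reporting the root cause next to the wrapper keeps both views: the wrapper says where the build gave up, and the root cause says what actually went wrong.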

Sigh.

Eventually, I figured out that the error was coming from what I thought was a fairly simple Ant target, to remove unwanted JUnit output from my build tree:


<target name="cleanTests">
    <delete>
        <fileset dir="${SRCROOT}" includes="**/TEST-*.txt"/>
    </delete>
</target>


At this point my colleague Tom got the great idea of re-running the Ant command with the -XX:+HeapDumpOnOutOfMemoryError flag so that we could get a memory dump when we ran out of memory.

Once we had the memory dump, a few minutes with the ever-wonderful Eclipse Memory Analyzer tool made the problem obvious.

It was clear that running this target uses memory proportional to the size of my entire source tree, not to the number of JUnit test output files in it, which is what I had expected.

Reading the heap dump, we discovered that all the memory was consumed by the 'filesNotIncluded' and 'dirsNotIncluded' Vector objects in the DirectoryScanner class.

That's right: in addition to computing the set of files that match the pattern that I specified in my fileset specification, Ant is also computing the set of files that don't match the pattern.

Will everyone who is not here please raise their hands?
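For contrast, here is a rough sketch of the behavior I expected: a scan that remembers only the files that match and forgets everything else as it goes. It does not use Ant at all; it relies on java.nio.file (Files.walk needs Java 8 or later), and the class name MatchOnlyScan and the method findMatches are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MatchOnlyScan {

    /**
     * Returns the regular files under root whose file name matches the glob.
     * Non-matching entries are discarded as the walk proceeds, so the result
     * grows with the number of matches rather than the size of the tree.
     */
    static List<Path> findMatches(Path root, String fileNameGlob) throws IOException {
        PathMatcher matcher =
            FileSystems.getDefault().getPathMatcher("glob:" + fileNameGlob);
        // Files.walk visits the tree lazily; the stream must be closed when done.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                // Matching on the file name alone approximates Ant's "**/NAME" pattern.
                .filter(p -> matcher.matches(p.getFileName()))
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Scan the directory given on the command line, or the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path match : findMatches(root, "TEST-*.txt")) {
            System.out.println(match);
        }
    }
}
```

Nothing here keeps a list of the files that failed to match, which is the part of the DirectoryScanner bookkeeping that was eating the heap.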

Is there actually a use for this information? I've never seen this behavior from the Ant fileset task, and right now it just seems annoying, but perhaps I'm just unaware of the beneficial reason for having it?
