Friday, February 11, 2011

No, Bryan, there is not a global conspiracy against you

I had a great time this morning; or, at least, about as much fun as a programmer can have.

I was debugging some new code I'd written in a networked server process, whose master loop looks more or less like the following:

... various code to set up variables ...

do {
    ... accept a new connection from a client ...
    ... fork a new process to handle that client ...
} while ( ! done );

The code is massively simplified, but for the purposes of this article it doesn't matter.

What does matter is the behavior that I saw, which was most puzzling:

I placed a printf() statement above the top of the do ... while loop, in the "set up variables" section.

That printf was executed each time I accepted a new connection!

Well, OK, I've sort of given it away, but I admit I was sorely puzzled: how could the printf statement be executed on each new connection, when that code wasn't even inside the loop?

OK, here is a reasonable place to stop and think for a second, to avoid spoiling the fun too much.

Have you figured out the answer?

Here it is: the printf statement actually wasn't executed each time through the loop. It was only executed once. But the printf statement was buffering its output, and that buffered output was sitting in the process's memory space. When the process forked a new child process, the fork system call naturally duplicated the buffered printf output along with everything else. Then, when the child process executed a completely unrelated printf call of its own and its output was flushed, the buffered text inherited from the parent was flushed too, so the parent's printf appeared to have been executed again!

So, everything was fine. It was just that the interaction between a buffered, un-flushed bit of printf output and the thorough duplication of process state by the fork API made me think that the code was being run twice, when in fact it was only the data that was being duplicated.

As the wonderful Raymond Chen says:

When something stops working, you begin developing theories for why it doesn't work, and normally, you start with simple theories that involve things close to you, and only after you exhaust those possibilities do you expand your scope. Typically, you don't consider that there is a global conspiracy against you, or at least that's not usually your first theory.

When programming and debugging, it's so easy to convince yourself that your theory is correct, and to find ways to force the evidence to match your theory. So when something seems impossible, stop and think: it's unlikely there is a global conspiracy against you. You're just looking in the wrong place!
