Monday, February 22, 2010

TCP Offload Engines

I happened across an interesting report of a problem that was ultimately determined to be due to interactions with a TCP Offload Engine.

The problem report itself was fascinating, partly because I occasionally see very odd network behaviors that I don't understand (unfortunately, I haven't had as much luck reproducing and resolving them), and partly because I had never heard of TCP Offload Engines before.

So I did a bit of searching and found some quite interesting information. A variety of sources, including some very respectable ones, describe the potential of TCP Offload in glowing terms.

However, there are also some very skeptical viewpoints, such as this article describing in detail why most Linux systems don't support TOE. I thought that the most telling point in this document was the observation-from-history about the eternal tradeoff between custom hardware solutions and general software solutions:

Each TOE NIC has a limited lifetime of usefulness, because system hardware rapidly catches up to TOE performance levels, and eventually exceeds TOE performance levels. We saw this with 10mbit TOE, 100mbit TOE, gigabit TOE, and soon with 10gig TOE.

Also, the essay that I linked to at the start of this post, which originally got me interested in TOE, refers to this support document, which mentions that:

  • TCP/IP Offload has a problem with the Window Scaling feature. This problem typically occurs when you communicate with a Windows Vista-based computer. Windows Vista uses the Window Scaling feature.

  • Some TCP/IP Offload-enabled network adapters do not send TCP keep-alive messages. However, Exchange servers use TCP keep-alive messages to clean up inactive client sessions.

  • The TCP/IP Offload-enabled network adapter may consume lots of nonpaged pool memory. This may cause other problems in the operating system.

  • In some cases, the TCP/IP Offload-enabled network adapter may request large blocks of contiguous memory. This makes the computer stop responding when it tries to free the memory.
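For readers who, like me, had to look up what Window Scaling is: the TCP header has only a 16-bit window field, so each side can announce a shift count during the handshake, and the advertised window is then multiplied by two to that power (the shift is capped at 14). The support document does not say how the offload hardware gets this wrong, so the sketch below is only an illustration of the arithmetic. The function names are my own, and the "ignores the shift" case is a hypothetical failure mode, not something the document describes.

```python
MAX_SHIFT = 14        # largest scale shift permitted by the TCP window scale option
MAX_RAW_WINDOW = 0xFFFF  # the window field in the TCP header is 16 bits wide


def effective_window(advertised: int, shift: int) -> int:
    """Return the real receive window in bytes for a 16-bit advertised value."""
    if not 0 <= advertised <= MAX_RAW_WINDOW:
        raise ValueError("advertised window must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift must be between 0 and 14")
    return advertised << shift


def window_if_shift_ignored(advertised: int, shift: int) -> int:
    """Hypothetical: what a device would assume if it dropped the scale factor."""
    return advertised


if __name__ == "__main__":
    raw, shift = 8192, 8
    print("scaled window:  ", effective_window(raw, shift))
    print("unscaled window:", window_if_shift_ignored(raw, shift))
```

If one side of a connection scaled the window and the other did not, the two ends would disagree by a factor of 2^shift about how much data may be in flight, which is at least consistent with the kind of stalls and odd throughput that show up in problem reports like this one.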

It seems like this is a fairly controversial bit of technology, with rather wide-ranging opinions on whether the technology is a benefit or a hindrance. Modern systems continue to become more complex, and evaluating their success or failure becomes more complex as well.

Many years ago, when I worked in the database world, we had a variety of partners who were trying to implement various bits of the DBMS technology in hardware, in the hopes that their hardware-augmented systems would outperform the pure software solutions. In the DBMS world, it seems, we learned this lesson a long time ago:

There was overwhelming sentiment that research on hardware data base machines was unlikely to produce significant results. Some people put it more strongly and said they did not want to read any more papers on hardware filtering or hardware sorting. The basic point they made was that general purpose CPUs are falling in price so fast that one should do DBMS functions in software on a general purpose machine rather than on custom designed hardware. Put differently, nobody was optimistic that custom hardware could compete with general purpose CPUs any time in the near future.

It will be interesting, now that I'm aware of it, to keep an eye on this TOE technology, and see if it suffers the same fate that the DBMS custom hardware technology did, 25 years ago.
