Steve Huston
Riverace Corporation
Steve's Networked Programming Newsletter
Making Nets Work
May 2009

I read an interesting article recently lambasting the Sockets API for holding back the advancement of good networked applications. Although the article raised some interesting points concerning networked application performance, I think the author missed the mark on a number of points. Lest you be led astray by such errors, I discuss these below along with what I think really holds back good networked applications.

As always, be sure to forward this note to other people you work with to be sure they know what's happening in the world of networked application development.

In This Issue
Is the Sockets API Sufficient for Today's Needs?
How Do You Evaluate ACE Support Providers?
Did Your Last Project Run Late? Want to Prevent That?
Is the Sockets API Sufficient for Today's Needs?
I recently read an interestingly titled article, "Whither Sockets?: High bandwidth, low latency, and multihoming challenge the sockets API" by George V. Neville-Neil, Consultant. The premise of the article is that shortcomings in the Sockets API design keep networked applications from taking advantage of today's advancements in networking technology and performance. Mr. Neville-Neil does a nice job of explaining some sources of performance degradation in networked applications and why they are more of an issue today than when the Sockets API was created in 1982 (remember modems and 10Mb ether?). However, I don't think he makes the case that Sockets is unduly causing problems where it doesn't need to.

Here are the main areas of performance problems noted in the article:
  • Repeated system calls cross the kernel barrier. Each system call an application makes to request something from the OS kernel must, generally speaking, end up passing arguments between user space and kernel space. This barrier is relatively expensive to cross in terms of CPU time. Since many networked applications involve loops of calls to select(), recv(), and send(), these calls become unnecessarily expensive.
Is this true? Yes and no. It is true that system calls are more expensive than non-system procedure calls. An operation that involves virtual page table manipulation and interrupts is more costly than one that doesn't. But there's no way to get around the need to request service from the kernel, and that cost is not specific to Sockets, so this is something of a red herring.

The use of select()-based loops is very expensive, but that's not primarily the result of the kernel barrier. It's because scanning and manipulating the fd_set structures is very time-consuming and, if not handled carefully, can starve some sockets. The time spent handling fd_sets far outweighs the overhead of the system calls themselves and does have very noticeable effects on networked application performance. More on this below...
  • Memory copying. The first rule of performance tuning in many applications, not just networked ones, is: don't copy data. Data copies are very expensive, and the more data that's copied, either in size or multiplicity, the worse performance gets. Mr. Neville-Neil correctly notes that there's no direct way for a network device driver to take data from, or place data in, a user-space memory area. The data path is user -> kernel -> driver and back. The driver and kernel can cooperate to avoid one copy, but the user -> kernel copy often remains out of necessity in the general case. What is the system call supposed to do with the stack-located memory area passed to a send() call, for example? It has to be copied.
Again, the need for copying is not specific to sockets. Any call that passes data to or from the kernel ends up copying. The most direct way you, as a networked application developer, can avoid performance hits from copying is to avoid copies once the data is under your control.
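
A minimal sketch of that last piece of advice, written in Python for brevity (the article and this newsletter do not prescribe a language). It receives straight into a preallocated buffer with socket.recv_into() and sends slices through a memoryview, so the application itself creates no intermediate byte strings. The helper names recv_exact_into, send_all_from, and demo are hypothetical, and the user -> kernel copy described above still happens inside each call.

```python
import socket


def recv_exact_into(sock, buf):
    """Fill buf completely from sock, writing directly into the caller's buffer."""
    view = memoryview(buf)  # slicing a memoryview does not copy the bytes
    received = 0
    while received < len(view):
        n = sock.recv_into(view[received:])
        if n == 0:
            raise ConnectionError("peer closed before the buffer was filled")
        received += n
    return received


def send_all_from(sock, buf):
    """Send all of buf, resuming after partial sends without re-slicing bytes."""
    view = memoryview(buf)
    sent = 0
    while sent < len(view):
        sent += sock.send(view[sent:])
    return sent


def demo():
    """Round-trip 4096 bytes over a local socket pair (illustrative size)."""
    a, b = socket.socketpair()
    with a, b:
        payload = bytearray(b"x" * 4096)
        send_all_from(a, payload)
        inbox = bytearray(len(payload))  # allocated once, reusable per message
        recv_exact_into(b, inbox)
        return inbox == payload
```

The design point is simply that the receive buffer is allocated once and can be reused for every message, rather than letting each recv() hand back a fresh object that then gets concatenated or sliced into yet more copies.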

To summarize my reaction, there are performance dangers lurking in networked applications based on Sockets. The ubiquity of Sockets makes it very difficult to change that in a general and portable fashion in the API. The range of systems which offer Sockets is very wide and encompasses a variety of architectures and capabilities. However, the difficulties do have solutions. There are, after all, many networked applications and systems that handle incredible amounts of traffic with high performance. The improvements have come from two directions:
  1. OS-specific extensions. Facilities such as Windows's overlapped I/O and POSIX aio make it possible to run concurrent operations on many sockets without always going through a select() loop. Also, since the memory areas involved are pre-specified to the kernel, there is more opportunity for the kernel to map the user memory to a directly accessible location and avoid a memory copy. On the event-driven approach, newer demultiplexing facilities such as kqueue, Linux epoll and the Solaris Event Completion Framework make it much easier to efficiently handle many thousands of sockets.
  2. Higher-level toolkits that embody best practices. Toolkits such as ACE make it easier to use calls such as select() as efficiently as possible in terms of both performance and fairness. Additionally, they can hide OS-specific performance enhancements and incompatible APIs (such as those in the previous paragraph) in portable and easy-to-use frameworks. More specialized tools such as Apache Qpid allow developers to gain higher performance in specialized applications without even having to care about the OS-level facilities.
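
To make the two directions above a little more concrete, here is a small sketch using Python's standard selectors module. It is only a stand-in for the idea of a portable wrapper and is not ACE or any API discussed in the article. selectors.DefaultSelector picks the most efficient demultiplexer the platform offers (epoll, kqueue, and so on, falling back to select()), and a call to its select() method returns only the sockets that are ready, so the application does not scan every descriptor itself. The names ready_labels and demo, the socket counts, and the timeout are all illustrative assumptions.

```python
import selectors
import socket


def ready_labels(sel, timeout):
    """Return the labels of the registered sockets that are readable now."""
    return sorted(key.data for key, _mask in sel.select(timeout))


def demo(total=20, active=(3, 11, 17), timeout=0.2):
    """Register `total` idle sockets, make a few readable, report which are ready."""
    sel = selectors.DefaultSelector()  # epoll/kqueue/etc., chosen per platform
    pairs = []
    try:
        for label in range(total):
            reader, writer = socket.socketpair()
            reader.setblocking(False)
            # The label travels with the registration, so no lookup table is needed.
            sel.register(reader, selectors.EVENT_READ, data=label)
            pairs.append((reader, writer))
        for label in active:
            pairs[label][1].send(b"ping")
        ready = ready_labels(sel, timeout)
        for label in ready:
            pairs[label][0].recv(16)  # drain so the socket goes idle again
        return ready
    finally:
        sel.close()
        for reader, writer in pairs:
            reader.close()
            writer.close()
```

Calling demo() should report only the handful of sockets that actually have data waiting, which is the property that lets these facilities scale to thousands of mostly idle connections.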
So, there are performance issues to be aware of when developing networked applications, and today's networking technology has pushed the application code back to the spotlight in terms of performance tuning. Although the Sockets API is old and predates much of today's network technology, it does allow access to all that's needed in a portable and ubiquitous way. Modern OS facilities provide high-performance mechanisms to access sockets in ways that mesh with the rest of the architecture, and modern toolkits resolve much of the complexity and make OS-private facilities available in a portable and easy-to-use fashion. The venerable Sockets API is definitely not dead. Use it wisely and your results will be fantastic.
How Do You Evaluate ACE Support Providers?
Have you ever wondered if you are taking best advantage of ACE's power and flexibility? Ever have a question you couldn't find the answer to? How about a bug?

Have any of these problems slowed down your project, or even delayed a deadline?

If you run into a situation in the future where you may want ACE support, it would be great if you already knew the best provider for you and your team. To do that, you have to start looking now. To evaluate the best provider for you, you can start with our new white paper "8 Essential Questions to Ask Your Potential ACE Support Service Provider". Download your copy today at http://www.riverace.com/support8qs.htm.
Do You Need Help Designing Your Next System?
Nobody has to tell you that designing a well-formed, efficient, maintainable networked application is hard. You've had to deal with it. The problem is that networking functionality is usually in a supporting role to your system's main purposes, and your skills and experience are much better used to focus on specific business and technology issues. It may make more sense to bring in seasoned expertise to help design a solid networking base in your next system.

I've helped many companies get great networked applications built - I may be able to help you as well. Let's talk and see if I can help take care of the networking, and let you focus on applying your expertise and experience to the business features that'll really help your system stand out.

Call me at 508-541-9180 or email me at shuston@riverace.com.
If you have any ideas for areas of networked programming you'd like to hear about in future issues, please email me with your suggestions. In the meantime, keep those nets working!
 
Sincerely,
 

Steve Huston
Riverace Corporation