Steve's Networked Programming Newsletter
Making Nets Work
May 2009

I read an interesting article recently lambasting the
Sockets API for holding back the advancement of good
networked applications. Although the article raised some
interesting points concerning networked application
performance, I think the author missed the mark on a
number of points. Lest you be led astray by such errors,
I discuss these below along with what I think really
holds back good networked applications.
As always, please forward this note to the people you
work with so they know what's happening in the world of
networked application development.

Is the Sockets API Sufficient for Today's Needs?

I recently read an interestingly titled article "Whither
Sockets?: High bandwidth, low latency, and multihoming
challenge the sockets API" by George V.
Neville-Neil, Consultant. The premise of the article is
that shortcomings in the Sockets API design are
hindering networked application development's ability to
take advantage of today's advancements in networking
technology and performance. Mr. Neville-Neil does a nice
job of explaining some sources of performance
degradation in networked applications and why they are
more of an issue today than when the Sockets API was
created in 1982 (remember modems and 10 Mb/s Ethernet?).
However, I don't think he makes the case that Sockets is
needlessly causing problems.
Here are the main areas of performance
problems noted in the article:
- Repeated system calls cross the kernel barrier. Each system
call an application makes to request something from
the OS kernel must, generally speaking, end up passing
arguments between user space and kernel space. This
barrier is relatively expensive to cross in terms of
CPU time. Since many networked applications involve
loops of calls to select(), recv(), and send(), these
calls become unnecessarily expensive.
Is this true? Yes and no.
It is true that system calls are more expensive than
non-system procedure calls. An operation that involves
virtual page table manipulation and interrupts is more
costly than one that doesn't. But there's no way to get
around the need to request service from the kernel, and
that need is not specific to Sockets, so this is
something of a red herring.
The use of select()-based loops is very expensive, but
not primarily because of the kernel barrier. It's because
scanning and manipulating the fd_set structures is very
time-consuming and, if not handled carefully, can starve
some sockets. The time spent handling fd_sets far
outweighs the overhead of the system calls themselves and
has a very noticeable effect on networked application
performance. More on this below...
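To make that cost concrete, here is a minimal Python sketch of one
pass of a select()-based loop. The function name serve_once and the
rotating start index are illustrative assumptions, not something taken
from the article or from any toolkit. Python's select.select() takes
lists rather than C fd_set structures, but the shape of the work is
the same: the whole set of sockets is handed over and scanned on every
pass.

```python
import select
import socket


def serve_once(socks, start=0, timeout=1.0):
    """Run one select() pass; return (handled, next_start).

    handled is a list of (socket, data) pairs in the order serviced.
    """
    if not socks:
        return [], 0
    # The complete list is passed to, and scanned by, select() on every
    # pass. This per-pass work grows with the number of sockets.
    readable, _, _ = select.select(socks, [], [], timeout)
    ready = set(readable)

    handled = []
    n = len(socks)
    # Start scanning at a rotating index so the sockets at the front of
    # the list are not always serviced first.
    for i in range(n):
        s = socks[(start + i) % n]
        if s in ready:
            # An empty result means the peer closed the connection.
            handled.append((s, s.recv(4096)))
    return handled, (start + 1) % n
```

Calling it repeatedly with the returned next_start is one simple way
to reduce the starvation risk; a real server would also handle errors,
partial reads, and closed connections.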
- Memory copying. The first rule of performance tuning
in many applications, not just networked ones, is:
don't copy data. Data copies are very expensive and
the more data that's copied, either in size or
multiplicity, the worse performance gets. Mr.
Neville-Neil correctly notes that there's no direct
way for a network device driver to take data from, or
place data in, a user-space memory area. The data path
is user -> kernel -> driver and back. The driver
and kernel can cooperate to avoid one copy, but the
user -> kernel copy often remains out of necessity
in the general case. What's the system call to do with
the stack-located memory area passed to a send() call,
for example? It has to be copied.
Again, the need for copying is not specific to Sockets.
Any call that passes data to or from the kernel ends up
copying. The most direct way you, as a networked
application developer, can avoid performance hits from
copying is to avoid further copies once the data is under
your control.
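One way to do that, sketched here in Python: receive directly into a
buffer you allocated up front and hand out views of it instead of
slices. read_exactly is a hypothetical helper name; recv_into() and
memoryview are standard library facilities. The kernel-to-user copy
still happens. What this avoids is the extra user-space copies that
come from joining and slicing byte strings.

```python
import socket


def read_exactly(sock, buf, nbytes):
    """Fill the first nbytes of buf from sock; return a view of them."""
    # Slicing a memoryview creates a new view, not a copy of the bytes.
    view = memoryview(buf)[:nbytes]
    got = 0
    while got < nbytes:
        # recv_into() writes straight into our buffer and returns the
        # number of bytes received (0 means the peer closed).
        n = sock.recv_into(view[got:])
        if n == 0:
            raise ConnectionError("peer closed before all data arrived")
        got += n
    return view
```

The returned view can be passed on to parsing code or to send()
without duplicating the payload, as long as the buffer is not reused
while the view is still in use.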
To summarize my reaction, there
are performance dangers lurking in networked
applications based on Sockets. The ubiquity of Sockets
makes it very difficult to change that in a general and
portable fashion in the API. The range of systems which
offer Sockets is very wide and encompasses a variety of
architectures and capabilities. However, the
difficulties do have solutions. There are, after all,
many networked applications and systems that handle
incredible amounts of traffic with high performance. The
improvements have come from two directions:
- OS-specific extensions. Facilities such as
Windows's overlapped I/O and POSIX aio make it
possible to run concurrent operations on many sockets
without always going through a select() loop. Also,
since the memory areas involved are pre-specified to
the kernel, there is more opportunity for the kernel
to map the user memory to a directly accessible
location and avoid a memory copy. On the event-driven
approach, newer demultiplexing facilities such as
kqueue, Linux epoll and the Solaris Event Completion
Framework make it much easier to efficiently handle
many thousands of sockets.
- Higher-level toolkits that embody best practices.
Toolkits such as ACE make it easier to use calls such
as select() as efficiently as possible in terms of both
performance and fairness. Additionally, they can hide
OS-specific performance enhancements and incompatible
APIs (such as those in the previous paragraph) in
portable and easy-to-use frameworks. More specialized
tools such as Apache Qpid allow developers to gain
higher performance in specialized applications without
even having to care about the OS-level
facilities.
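As a small illustration of both directions at once, Python's standard
selectors module is a thin portable wrapper whose DefaultSelector uses
the most efficient demultiplexer available on the platform (for
example epoll or kqueue, falling back to select()). The helper names
below are hypothetical, and this is only a sketch of the idea, not how
ACE or any other toolkit is implemented.

```python
import selectors
import socket


def make_selector(socks):
    """Register every socket once, up front, for read events."""
    sel = selectors.DefaultSelector()
    for s in socks:
        sel.register(s, selectors.EVENT_READ)
    return sel


def drain_ready(sel, timeout=1.0):
    """Wait for events and read once from each ready socket."""
    results = []
    # Only sockets that are actually ready are returned, so the
    # application does not rescan the full set on each pass.
    for key, _events in sel.select(timeout):
        results.append(key.fileobj.recv(4096))
    return results
```

The application code stays the same whichever OS facility ends up
underneath, which is the kind of portability a toolkit provides.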
So, there are performance issues to
be aware of when developing networked applications, and
today's networking technology has pushed the application
code back to the spotlight in terms of performance
tuning. Although the Sockets API is old and predates
much of today's network technology, it does allow access
to all that's needed in a portable and ubiquitous way.
Modern OS facilities provide high-performance mechanisms
to access sockets in ways that mesh with the rest of the
architecture, and modern toolkits resolve much of the
complexity and make OS-private facilities available in a
portable and easy-to-use fashion. The venerable Sockets
API is definitely not dead. Use it wisely and your
results will be fantastic.

How Do You Evaluate ACE Support Providers?

Have you ever wondered if you are taking full
advantage of ACE's power and flexibility? Ever have a
question you couldn't find the answer to? How about a
bug? Have any of these problems slowed down your
project, or even delayed a deadline? If you run
into a situation in the future where you may want ACE
support, it would be great if you already knew the best
provider for you and your team. To do that, you have to
start looking now. To evaluate the best provider for
you, you can start with our new white paper "8 Essential
Questions to Ask Your Potential ACE Support Service
Provider". Download your copy today at http://www.riverace.com/support8qs.htm.

Do You Need Help Designing Your Next System?

Nobody
has to tell you that designing a well-formed, efficient,
maintainable networked application is hard. You've had
to deal with it. The problem is that networking
functionality is usually in a supporting role to your
system's main purposes, and your skills and experience
are much better used to focus on specific business and
technology issues. It may make more sense to bring in
seasoned expertise to help design a solid networking
base in your next system.
I've helped many
companies get great networked applications built - I may
be able to help you as well. Let's talk and see if I can
help take care of the networking, and let you focus on
applying your expertise and experience to the business
features that'll really help your system stand
out.
Call me at 508-541-9180 or email me at shuston@riverace.com.
If you have any ideas for areas of networked
programming you'd like to hear about in future issues,
please email me with your suggestions. In the meantime,
keep those nets working!
Sincerely,
Steve Huston
Riverace Corporation