I'm playing with UDP a bit and have a few questions about how my program could be made faster. Obviously I'd prefer userspace changes, but I'm running low on possibilities there: my program spends 70% of its time in the kernel.

Background: SMP x86 with two processors, running a 2.4.4 kernel (yes, old, but the UDP code and its performance aren't particularly changed in 2.4.10). My test app consists of three processes on one machine: one server and two clients. The clients send ~100-byte packets to the server using sendmsg(), and the server picks them up with recvmsg() and responds immediately with same-sized packets. Each client keeps 100 requests outstanding by sending an initial flurry and then a new request each time an answer shows up.

The userspace profiles of these programs (with mcount etc. computed out) all show sendmsg() at 45% and recvmsg() at 20%. They occasionally poll; poll() is at 2%. The sockets are not connected (there are ~500 peers, and polling on a mess of sockets would be poor). The sockets have checksums turned off (local switched ethernet and loopback only).

In actual numbers, a dual P3-850 does 76000 packets per second in this scenario, i.e. 76000 each of sendmsg() and recvmsg() calls. I'm told that this is atrociously slow, but I've been unable to find comparable numbers for other systems, so I'm just pondering ways to make it faster.

Below is the top part of the kernel profile for a loopback run, converted from ticks to percents. The big mystery to me is why udp_recvmsg() is so busy. The only functions it calls are skb_free_datagram() (a real function elsewhere in the profile), skb_copy_datagram_iovec() (real), skb_recv_datagram() (real), and sock_recv_timestamp() (an inline "if (blah) { assignment }"). There's a little memset() of the zeroed part of the addr, and a few ifs and assignments. Am I missing something big in udp_recvmsg()? I'm calling it nonblocking, and there's always something to read; none of the exception cases should be happening.
It's called the same number of times as everything else here; there aren't a million EAGAINs happening or anything.

I've demonstrated a noticeable speedup using a multiframe syscall in another protocol, but I don't really want to go to all that trouble. Even if I do all the fiddles mentioned below, the net gain would be maybe 5% wall time if I'm lucky. Gratuitous rearranging and inlining might do me another 5%, but even given this unpleasantly flat profile I don't really want to go there. What can I do?

 6.15%  udp_recvmsg               # !?
 5.38%  ip_build_xmit             # inlined skb fiddling, iph csum, etc.
 4.57%  __generic_copy_to_user    # life
 4.45%  udp_rcv                   # no "most recent socket" cache
 4.32%  udp_queue_rcv_skb         # spin_lock_irqsave/restore + trivial list op
 4.31%  ip_rcv                    # iph checksum check on loopback
 3.97%  ip_route_output_key       # could cache route for unconnected sockets
 3.43%  sock_alloc_send_skb
 3.38%  dev_queue_xmit
 3.35%  udp_sendmsg
 3.13%  ip_output
 2.87%  net_rx_action
 2.76%  __generic_copy_from_user  # life
 2.62%  skb_release_data
 2.61%  __kfree_skb
 2.53%  do_gettimeofday
 2.45%  sock_def_write_space      # locks, waits, etc.
 1.94%  skb_recv_datagram
 1.91%  sock_def_readable         # locks, waits, etc.
 1.76%  system_call
 1.70%  kmalloc                   # no stack of free skb elements?
 1.67%  fget                      # is the last-used fd cached?
 1.59%  skb_copy_datagram_iovec
 1.42%  loopback_xmit
 1.30%  kfree
 1.28%  netif_rx
 1.25%  udp_v4_lookup_longway
 1.18%  sys_recvmsg
 1.05%  verify_iovec
 0.99%  alloc_skb                 # not inline?

--
Grant Taylor - http://www.picante.com/~gtaylor/