[RFC] Asynchronous IPI and e1000 Multiple Queues

With our latest submission of e1000 patches, we introduced code to
enable multiple transmit and receive queues for the 82571 adapter.  All
of the code is wrapped with CONFIG_E1000_MQ with the intention that it
not be enabled until the patchset within this email is reviewed and (in
some form) released.  So we'd like to gather some feedback on this
patchset and get an idea of whether this is the correct approach.

Multiple queues serve a couple of purposes (probably more):

  - Receive-Side Scaling: share the interrupt processing across
    multiple CPUs.  We've got hyper-threaded/multi-core processors, so
    let's use them.
  - Priority Queuing (e.g., TOS): queue 0 transmits X more/less packets
    than queue 1 due to <insert arbitration scheme here>.

With the single-queue (qdisc) implementation for transmits, multiple Tx
queues aren't all that exciting, and the arbitration scheme has to
reside in the driver, though that could change over time.  So most
benefits of multiple queues are seen on receives.  NAPI helps this
effort (with per-CPU processing), but it means netif_rx_schedule is tied
to the CPU it is called on.  So we needed a way to schedule receive
processing in the context of a specific CPU.  The way we came up with
was a new asynchronous IPI vector.  A helper function is exported to
drivers to queue up the work and then inform the other CPUs that work is
pending.
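
To make the Tx arbitration placeholder above a little more concrete,
here is a user-space sketch of one possible scheme, a credit-based
weighted round-robin between two queues.  This is not what the driver
does; the struct tx_arbiter type, both function names, and the weights
are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_TX_QUEUES 2   /* assumption: two Tx queues, for illustration */

/* Hypothetical arbiter state; not taken from the e1000 driver. */
struct tx_arbiter {
    unsigned int weight[NUM_TX_QUEUES];  /* packets allowed per round */
    unsigned int credit[NUM_TX_QUEUES];  /* packets left in this round */
    unsigned int current;                /* queue served most recently */
};

void tx_arbiter_init(struct tx_arbiter *a, unsigned int w0, unsigned int w1)
{
    assert(a != NULL);
    a->weight[0] = a->credit[0] = w0;
    a->weight[1] = a->credit[1] = w1;
    a->current = 0;
}

/*
 * Pick the queue to transmit from next.  backlog[q] is the number of
 * packets waiting on queue q.  Returns the queue index, or -1 if no
 * queue can be served.
 */
int tx_arbiter_next(struct tx_arbiter *a,
                    const unsigned int backlog[NUM_TX_QUEUES])
{
    assert(a != NULL);
    for (int pass = 0; pass < 2; pass++) {
        for (unsigned int i = 0; i < NUM_TX_QUEUES; i++) {
            unsigned int q = (a->current + i) % NUM_TX_QUEUES;

            if (backlog[q] > 0 && a->credit[q] > 0) {
                a->credit[q]--;
                a->current = q;
                return (int)q;
            }
        }
        /* Nothing eligible: start a new round and retry once. */
        for (unsigned int q = 0; q < NUM_TX_QUEUES; q++)
            a->credit[q] = a->weight[q];
        a->current = 0;
    }
    return -1;
}
```

With weights 3 and 1 and both queues backlogged, queue 0 is picked
three times for every one pick of queue 1.  If only one queue has
packets waiting, it is served regardless of leftover credit on the
other, so the link does not sit idle.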

In smp_call_async.2.6.13.patch, we create an asynchronous IPI with an
associated queue.  Drivers fill out the call_async_data_struct and call
smp_call_async_mask, a routine modeled on smp_call_function.  If the
mask contains the currently running CPU, the routine specified in the
data struct is simply called directly; otherwise the task is added to
the call_async_queue and an IPI is sent to all CPUs in the mask.  The
async interrupt handler then processes each task in the queue.
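
The dispatch logic described above can be sketched in user space
roughly as follows.  This is a simulation, not the patch itself: the
struct name comes from the description, but its fields, the per-CPU
layout of call_async_queue, the plain bitmask, the explicit this_cpu
argument and the sketch_* function names are all assumptions.  It also
takes one reading of the description, in which the local CPU runs the
routine directly and every other CPU in the mask gets the task queued.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS   4    /* assumption: small fixed CPU count */
#define QUEUE_LEN 16   /* assumption: bounded per-CPU queue */

/* Field layout is assumed; only the struct name is from the patch. */
struct call_async_data_struct {
    void (*func)(void *info);
    void *info;
    unsigned int cpumask;   /* bit n set => CPU n should run func */
};

struct async_queue {
    struct call_async_data_struct *task[QUEUE_LEN];
    size_t count;
    int ipi_pending;        /* stands in for a real IPI being raised */
};

static struct async_queue call_async_queue[NR_CPUS];

/*
 * Run func directly if this_cpu is in the mask; queue the task and
 * flag an IPI for every other CPU in the mask.  Returns 0 on success
 * or -1 if a target queue is full (earlier CPUs may already have been
 * handled at that point).
 */
int sketch_call_async_mask(int this_cpu, struct call_async_data_struct *data)
{
    assert(data != NULL && data->func != NULL);
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        struct async_queue *q = &call_async_queue[cpu];

        if (!(data->cpumask & (1u << cpu)))
            continue;
        if (cpu == this_cpu) {
            data->func(data->info);
            continue;
        }
        if (q->count == QUEUE_LEN)
            return -1;
        q->task[q->count++] = data;
        q->ipi_pending = 1;
    }
    return 0;
}

/* Stand-in for the async interrupt on cpu: run and drain its queue.
 * Returns the number of tasks that were run. */
size_t sketch_async_interrupt(int cpu)
{
    struct async_queue *q = &call_async_queue[cpu];
    size_t ran = 0;

    for (size_t i = 0; i < q->count; i++) {
        q->task[i]->func(q->task[i]->info);
        ran++;
    }
    q->count = 0;
    q->ipi_pending = 0;
    return ran;
}

/* Example callback: bump the counter that info points at. */
static void count_call(void *info)
{
    (*(int *)info)++;
}
```

A real implementation would need locking or per-CPU data with
interrupts disabled around the queue, which this sketch leaves out
entirely.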

Each CPU can now take care of its own work (essentially calling
netif_rx_schedule) without messy locks around the NAPI threads.

In e1000_mq_Kconfig.patch, we simply add the option to enable multiple
queues during kernel configuration.

Is this the right approach?  Any input, fixes and testing would be
greatly appreciated.

Thanks,
-Jeb

 <<e1000_mq_Kconfig.patch>>  <<smp_call_async.2.6.13.patch>> 

Attachment: e1000_mq_Kconfig.patch
Description: e1000_mq_Kconfig.patch

Attachment: smp_call_async.2.6.13.patch
Description: smp_call_async.2.6.13.patch

