Re: [patch,v2 00/10] make I/O path allocations more numa-friendly

On 11/06/12 16:41, Elliott, Robert (Server Storage) wrote:
> It's certainly better to tie them all to one node than let them be
> randomly scattered across nodes; your 6% observation may simply be
> from that.

> How do you think these compare, though (for structures that are per-IO)?
> - tying the structures to the node hosting the storage device
> - tying the structures to the node running the application

> The latter means that PCI Express traffic must spend more time winding
> its way through the CPU complex. For example, the Memory Writes to the
> OQ and to deliver the MSI-X interrupt take longer to reach the destination
> CPU memory, snooping the other CPUs along the way. Once there, though,
> application reads should be faster.

> We're trying to design the SCSI Express standards (SOP and PQI) to be
> non-uniform memory and non-uniform I/O friendly. Some concepts we've included:
> - one driver thread per CPU core
> - each driver thread processes IOs from application threads on that CPU core
> - each driver thread has its own inbound queue (IQ) for command submission
> - each driver thread has its own outbound queue (OQ) for status reception
> - each OQ has its own MSI-X interrupt that is directed to that CPU core

> This should work best if the application threads also run on the right
> CPU cores.  Most OSes seem to lack a way for an application to determine
> that its IOs will be heading to an I/O device on another node, and to
> request (but not demand) that its threads run on that closer node.
> Thread affinities seem to be treated as hard requirements rather than
> suggestions, which causes all applications doing IOs to converge on that
> poor node and leave the others unused.  There's a tradeoff between the
> extra latency vs. the extra CPU processing power and memory bandwidth.
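
The per-core layout listed in the quoted text (one IQ, one OQ and one
MSI-X vector per CPU core) can be modelled roughly as below. This is a
userspace sketch only, not SOP/PQI or driver code: struct queue_pair,
alloc_queue_pairs() and queue_for_cpu() are hypothetical names, and the
1:1 numbering of cores, queues and vectors is an assumption made for
illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-core queue pair; field names are illustrative only. */
struct queue_pair {
	int cpu;          /* CPU core served by this driver thread */
	int iq;           /* inbound queue index (command submission) */
	int oq;           /* outbound queue index (status reception) */
	int msix_vector;  /* MSI-X vector assumed to be directed at 'cpu' */
};

/* Build one IQ/OQ pair per core, assuming a 1:1 core-to-vector mapping. */
static struct queue_pair *alloc_queue_pairs(int ncpus)
{
	struct queue_pair *qp;
	int cpu;

	if (ncpus <= 0)
		return NULL;

	qp = calloc((size_t)ncpus, sizeof(*qp));
	if (!qp)
		return NULL;

	for (cpu = 0; cpu < ncpus; cpu++) {
		qp[cpu].cpu = cpu;
		qp[cpu].iq = cpu;
		qp[cpu].oq = cpu;
		qp[cpu].msix_vector = cpu;
	}
	return qp;
}

/* An I/O submitted on 'cpu' uses that core's own pair, or NULL if out of range. */
static const struct queue_pair *queue_for_cpu(const struct queue_pair *qp,
					      int ncpus, int cpu)
{
	if (!qp || cpu < 0 || cpu >= ncpus)
		return NULL;
	return &qp[cpu];
}
```

With this mapping, a command submitted on core 2 goes to IQ 2 and its
completion arrives on OQ 2 via vector 2, so the submission and
completion paths stay on the same core.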

The first five patches in this series already provide infrastructure
that allows the data structures needed for I/O to be tied to the node
running the application. This is done by passing the proper NUMA node
to scsi_host_alloc_node(). The only missing part is a user interface
for specifying that node. If anyone could come up with a proposal for
adding such a user interface without having to reimplement it in every
LLD, that would be great.
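
As a starting point for discussion, the node selection itself could
live in common code along these lines. This is a userspace model under
stated assumptions, not the interface from the patches: pick_node(),
host_alloc_node(), struct host and NO_NODE are hypothetical names, the
fallback order (user choice, then device node, then no preference) is
only one possible policy, and the real signature of
scsi_host_alloc_node() is not reproduced here. In the kernel, a
node-aware allocator such as kzalloc_node() would take the place of
calloc().

```c
#include <stdlib.h>

#define NO_NODE (-1)	/* stands in for the kernel's NUMA_NO_NODE */

/* Hypothetical host structure; only the node field matters here. */
struct host {
	int node;	/* NUMA node the per-I/O structures are tied to */
};

/*
 * Assumed policy: a valid user-specified node wins, otherwise fall back
 * to the node hosting the device, otherwise express no preference.
 */
static int pick_node(int user_node, int dev_node, int nr_nodes)
{
	if (user_node >= 0 && user_node < nr_nodes)
		return user_node;
	if (dev_node >= 0 && dev_node < nr_nodes)
		return dev_node;
	return NO_NODE;
}

/* Model of a node-aware host allocation; it only records the chosen node. */
static struct host *host_alloc_node(int node)
{
	struct host *h = calloc(1, sizeof(*h));

	if (h)
		h->node = node;
	return h;
}
```

An LLD would then only have to forward whatever node the common code
picked, e.g. host_alloc_node(pick_node(user_node, dev_node, nr_nodes)).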

Bart.
