RE: [PATCH v2 12/12] IB/srp: Add multichannel support

"Elliott, Robert (Server Storage)" <Elliott@xxxxxx> · Mon, 3 Nov 2014 01:46:33 +0000

> -----Original Message-----
> From: Sagi Grimberg [mailto:sagig@xxxxxxxxxxxxxxxxxx]
> Sent: Sunday, November 02, 2014 7:03 AM
> To: Bart Van Assche; Christoph Hellwig
> Cc: Jens Axboe; Sagi Grimberg; Sebastian Parschauer; Elliott, Robert
> (Server Storage); Ming Lei; linux-scsi@xxxxxxxxxxxxxxx; linux-rdma
> Subject: Re: [PATCH v2 12/12] IB/srp: Add multichannel support
> 
...
> IMHO, this is not iSER specific issue, it is easily indicated from the
> code that a specific workload SRP will poll recv completion queue
> forever in an interrupt context.
> 
> I encountered this issue on a virtual guest in a high workload (80+
> sessions with heavy traffic on all) because qemu smp_affinity setting
> was broken (might still be, didn't check that for a while). This caused
> all completion vectors to fire interrupts to core 0 causing a high
> events contention on a single event queue (causing lockup situations
> and starvation of other CQs). Using more completion queues will enhance
> this situation.
> 
> I think running multichannel code when all MSIX vectors affinity are
> directed to a single CPU can invoke what I'm talking about.

That's not an SRP specific problem either.  If you ask just one CPU to
service interrupts and block layer completions for submissions from lots
of other CPUs, it's bound to become overloaded.

Setting rq_affinity=2 helps quite a bit for the block layer completion
work, since it forces each completion to run on the CPU that submitted
the request.  This patch proposed making that the default for blk-mq:
	https://lkml.org/lkml/2014/9/9/931
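
As a rough illustration of the rq_affinity setting mentioned above, here is
a minimal Python sketch that writes the value to the per-device sysfs knob
(/sys/block/<dev>/queue/rq_affinity).  The helper name set_rq_affinity and
its sysfs_root parameter are made up for this example (the parameter only
exists so the function can be pointed at a scratch directory); writing to
the real file requires root.

```python
from pathlib import Path


def set_rq_affinity(device: str, value: int = 2, sysfs_root: str = "/sys") -> str:
    """Write rq_affinity for one block device and return the value read back.

    Hypothetical helper for illustration only.
    0 = no completion steering, 1 = complete in the submitter's CPU group,
    2 = complete on the exact CPU that submitted the request.
    """
    if value not in (0, 1, 2):
        raise ValueError("rq_affinity must be 0, 1, or 2")
    knob = Path(sysfs_root) / "block" / device / "queue" / "rq_affinity"
    knob.write_text(f"{value}\n")
    # Read back so the caller can confirm what is now in effect.
    return knob.read_text().strip()


if __name__ == "__main__":
    import sys

    # Usage (as root): python3 set_rq_affinity.py sdb sdc
    for dev in sys.argv[1:]:
        print(dev, set_rq_affinity(dev))
```

The setting is per request queue, so it has to be applied to every device
that carries the workload, and it does not persist across reboots.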

For SRP interrupt processing, irqbalance recently changed its default
to ignore the affinity_hint; you now need to pass an option to honor
the hint, or provide a policy script to do so for selected IRQs.  For
multi-million IOPS workloads, irqbalance takes far too long to reroute
interrupts based on activity; you're likely to overload a CPU with 100%
hardirq processing, causing self-detected stalls for the submitting
processes on that CPU, among other problems.  Sending interrupts back
to the submitting CPU provides self-throttling.
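
For anyone who wants to honor the driver's hints by hand rather than through
irqbalance, the idea can be sketched in Python as below: copy each IRQ's
/proc/irq/<N>/affinity_hint mask into /proc/irq/<N>/smp_affinity.  This is
only a sketch under assumptions, not what irqbalance itself does; the
function name apply_affinity_hints and the proc_root parameter are invented
for the example, it needs root, and a running irqbalance may later overwrite
whatever it sets.

```python
from pathlib import Path


def apply_affinity_hints(proc_root: str = "/proc") -> dict:
    """Copy each IRQ's affinity_hint into smp_affinity.

    Hypothetical helper for illustration only.  Returns a dict mapping
    IRQ number (as a string) to the mask that was applied.
    """
    applied = {}
    for irq_dir in sorted((Path(proc_root) / "irq").iterdir()):
        hint_file = irq_dir / "affinity_hint"
        # Skip plain files such as default_smp_affinity and IRQs without a hint file.
        if not irq_dir.is_dir() or not hint_file.exists():
            continue
        hint = hint_file.read_text().strip()
        # Masks are comma-separated hex words; all zeros means "no hint set".
        if not hint or int(hint.replace(",", ""), 16) == 0:
            continue
        try:
            (irq_dir / "smp_affinity").write_text(hint + "\n")
        except OSError:
            # Some IRQs cannot be moved; leave them where they are.
            continue
        applied[irq_dir.name] = hint
    return applied


if __name__ == "__main__":
    for irq, mask in apply_affinity_hints().items():
        print(f"irq {irq}: smp_affinity <- {mask}")
```

This only helps if the driver sets a hint for each of its completion
vectors; IRQs with an all-zero hint are left untouched.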
