RE: [PATCH v2 12/12] IB/srp: Add multichannel support

> -----Original Message-----
> From: Sagi Grimberg [mailto:sagig@xxxxxxxxxxxxxxxxxx]
> Sent: Tuesday, November 04, 2014 6:15 AM
> To: Bart Van Assche; Elliott, Robert (Server Storage); Christoph Hellwig
> Cc: Jens Axboe; Sagi Grimberg; Sebastian Parschauer; Ming Lei; linux-
> scsi@xxxxxxxxxxxxxxx; linux-rdma
> Subject: Re: [PATCH v2 12/12] IB/srp: Add multichannel support
> 
...
> I think that Rob and I are not talking about the same issue. In
> case only a single core is servicing interrupts it is indeed expected
> that it will spend 100% in hard-irq, that's acceptable since it is
> pounded with completions all the time.
> 
> However, I'm referring to a condition where SRP will spend infinite
> time servicing a single interrupt (while loop on ib_poll_cq that never
> drains) which will lead to a hard lockup.
> 
> This *can* happen, and I do believe that with an optimized IO path
> it is even more likely to.

If the IB completions/interrupts are only for IOs submitted on this
CPU, then the CQ will eventually drain, because this CPU is not 
submitting anything new while stuck in the loop.

This can become bursty, though: submit a lot of IOs, then be busy
completing all of them and not submitting more, so the queue depth
bounces from 0 to high and back to 0.  I've seen that with both the
hpsa and mpt3sas drivers.  With the libaio engine, the fio options
iodepth_batch, iodepth_batch_complete, and iodepth_low can amplify
or reduce that effect, depending on how they are set.

I haven't found a good way for the LLD ISRs and the block
layer completion code to decide to yield the CPU based on how
much time they are taking - that would almost qualify as
a realtime kernel feature.  If you compile with
CONFIG_IRQ_TIME_ACCOUNTING, the kernel does keep track
of that information; perhaps that could be exported so
modules can use it?
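
To make the "yield based on elapsed time" idea concrete, here is a
rough userspace sketch, not driver code.  Everything in it is made
up for illustration: struct fake_cq, fake_poll_one(), the clock_fn
callback, and drain_with_time_budget() are hypothetical names, and
the clock is injected so the loop can be driven by a fake time
source.  It does not use CONFIG_IRQ_TIME_ACCOUNTING data; it only
shows the shape of a drain loop that gives up after a time budget.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a completion queue: a count of pending entries. */
struct fake_cq {
    unsigned int pending;
};

/* Hypothetical clock callback returning nanoseconds (assumption: injected
 * by the caller so the sketch can run against a fake clock). */
typedef uint64_t (*clock_fn)(void *ctx);

/* Consume one completion if available; returns 1 if consumed, else 0. */
static int fake_poll_one(struct fake_cq *cq)
{
    if (cq->pending == 0)
        return 0;
    cq->pending--;
    return 1;
}

/*
 * Drain completions until the queue is empty or budget_ns has elapsed.
 * Returns the number of completions handled.  If cq->pending is still
 * nonzero afterwards, the caller is expected to arrange to be run again.
 */
static unsigned int drain_with_time_budget(struct fake_cq *cq, clock_fn now,
                                           void *clock_ctx, uint64_t budget_ns)
{
    uint64_t start;
    unsigned int handled = 0;

    assert(budget_ns > 0);
    start = now(clock_ctx);

    while (fake_poll_one(cq)) {
        handled++;
        /* Check the clock after each completion and stop once over budget. */
        if (now(clock_ctx) - start >= budget_ns)
            break;
    }
    return handled;
}

/* Fake clock for demonstration only: advances 100 ns on every call. */
static uint64_t fake_clock(void *ctx)
{
    uint64_t *t = ctx;

    *t += 100;
    return *t;
}
```

With the fake clock above, 1000 pending entries, and a 1000 ns
budget, the loop stops after 10 completions and leaves 990 pending
for a later pass.  In a real ISR the cost of reading the clock on
every completion would matter, so checking it every N completions
is probably the more realistic variant.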

---
Rob Elliott, HP Server Storage
