Re: [PATCH V2] blk-mq: avoid to starve tag allocation after allocation process migrates

On Thu, May 24, 2018 at 1:48 AM, Omar Sandoval <osandov@xxxxxxxxxxx> wrote:
> On Wed, May 23, 2018 at 05:32:31PM +0800, Ming Lei wrote:
>> On Tue, May 22, 2018 at 09:59:17PM -0600, Jens Axboe wrote:
>> > On 5/19/18 1:44 AM, Ming Lei wrote:
>> > > When the allocation process is scheduled back and the mapped hw queue
>> > > has changed, do one extra wake-up on the original queue to compensate
>> > > for the missed wake-up, so other allocations on the original queue
>> > > won't be starved.
>> > >
>> > > This patch fixes a request allocation hang, which can be triggered
>> > > easily with a very low nr_requests.
>> >
>> > Trying to think of better ways we can fix this, but I don't see
>> > any right now. Getting rid of the wake_up_nr() kills us on tons
>> > of tasks waiting.
>>
>> I am not sure I understand your point, but this issue isn't actually
>> related to wake_up_nr(), and it can be reproduced after reverting
>> 4e5dff41be7b5201c1c47c (blk-mq: improve heavily contended tag case).
>>
>> All tasks in the current sbq_wait_state may be scheduled to other CPUs
>> while other tasks are still waiting for allocation from this
>> sbitmap_queue. The root cause is cross-queue allocation; as you said,
>> there are too many queues, :-)
>
> I don't follow. Your description of the problem was that we have two
> waiters and only wake up one, which doesn't in turn allocate and free a
> tag and wake up the second waiter. Changing it back to wake_up_nr()
> eliminates that problem. And if waking up everything doesn't fix it, how
> does your fix of waking up a few extra tasks fix it?

What matters is that this patch wakes up the previous sbq. Let's look at
it from another view:

1) still 2 hw queues, nr_requests is 2, and wake_batch is 1

2) there are 3 waiters on hw queue 0

3) the two in-flight requests in hw queue 0 complete, and only two of the
3 waiters are woken up because of wake_batch; but both of those two
waiters may be scheduled to another CPU and so switch to hw queue 1

4) then the 3rd waiter will wait forever, since no in-flight request
remains in hw queue 0 to complete and wake it up.

5) this patch fixes it with a fake wake-up on the previous queue whenever
a waiter finds it has been scheduled to another hw queue (sketched below)

The issue is a bit easier to understand if we forget about sbq_wait_state
and just focus on the sbq, :-)
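
To make 5) concrete, here is a rough sketch of what the wait loop in
blk_mq_get_tag() ends up doing with the patch applied. It is only an
illustration of the idea, not the exact diff: the local variable bt_prev
and the call to sbitmap_queue_wake_up() follow the patch loosely, and
the surrounding details are simplified.

/*
 * Simplified sketch of the tag-allocation wait loop; not the exact
 * patch.  'bt' is the sbitmap_queue of the currently mapped hw queue.
 */
do {
	struct sbitmap_queue *bt_prev;

	prepare_to_wait_exclusive(&ws->wait, &wait, TASK_UNINTERRUPTIBLE);

	tag = __blk_mq_get_tag(data, bt);
	if (tag != -1)
		break;

	bt_prev = bt;
	io_schedule();

	/* We may have migrated; re-resolve the hw queue and its bitmap. */
	data->ctx = blk_mq_get_ctx(data->q);
	data->hctx = blk_mq_map_queue(data->q, data->ctx->cpu);
	bt = &blk_mq_tags_from_data(data)->bitmap_tags;

	finish_wait(&ws->wait, &wait);

	/*
	 * If the hw queue changed while we slept, the wake-up issued on
	 * the old queue is effectively lost to us.  Do one fake wake-up
	 * on the previous sbitmap_queue so waiters still parked there
	 * (the 3rd waiter in the example above) are not starved.
	 */
	if (bt != bt_prev)
		sbitmap_queue_wake_up(bt_prev);

	ws = bt_wait_ptr(bt, data->hctx);
} while (1);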

Thanks,
Ming Lei


