Re: [PATCH] SCSI: don't get target/host busy_count in scsi_mq_get_budget()

On 11/07/2017 03:34 PM, Bart Van Assche wrote:
> On Tue, 2017-11-07 at 15:06 -0700, Jens Axboe wrote:
>> Just to keep everyone in the loop, this bug is not new to
>> for-4.15/block, nor is it new to the current 4.14-rc or 4.13. So it's
>> probably different from what Bart is hitting, but it's a bug
>> nonetheless...
> 
> Hello Jens,
> 
> There are several reasons why I think that patch "blk-mq: don't handle
> TAG_SHARED in restart" really should be reverted:
> * That patch is based on the assumption that only the SCSI driver uses shared
>   tags. That assumption is not correct. null_blk and nvme also use shared tags.
> * As my tests have shown, the algorithm for restarting queues based on the
>   SCSI starved list is flawed. So using that mechanism instead of the blk-mq
>   shared queue restarting algorithm is wrong.
> * We are close to the merge window. It is too late for trying to fix the
>   "blk-mq: don't handle TAG_SHARED in restart" patch.
> 
> My proposal is to make sure that what will be sent to Linus during the v4.15
> merge window works reliably. That means using the v4.13/v4.14 algorithm for
> queue restarting which is an algorithm that is trusted by the community. If
> Roman Penyaev's patch could get applied that would be even better.

I'm fine with reverting a single patch, that's still a far cry from the
giant list. I'd rather get a fix in though, if at all possible. The code
it removed wasn't exactly the fastest or prettiest solution. But the
most important part is that we have something that works. If you have a
test case that is runnable AND reproduces the problem, I'd love to have
it. I've seen comments to that effect spread over several messages; it
would be nice to have that summarized and readily available for all who
want to work on it.

The issue above is NOT a new bug, I ran into it by accident trying to
reproduce your case. Debugging that one right now, hopefully we'll have
some closure on that tomorrow and we can move forward on the shared tag
restart/loop. It smells like a TAG_WAITING race, or a restart race.

-- 
Jens Axboe



