On 10/19/22 10:25 AM, Greg Kroah-Hartman wrote: > On Wed, Oct 19, 2022 at 08:06:26AM -0700, Hugh Dickins wrote: >> On Wed, 19 Oct 2022, Greg Kroah-Hartman wrote: >> >>> From: Yu Kuai <yukuai3@xxxxxxxxxx> >>> >>> [ Upstream commit 040b83fcecfb86f3225d3a5de7fd9b3fbccf83b4 ] >>> >>> There are two problems can lead to lost wakeup: >>> >>> 1) invalid wakeup on the wrong waitqueue: >>> >>> For example, 2 * wake_batch tags are put, while only wake_batch threads >>> are woken: >>> >>> __sbq_wake_up >>> atomic_cmpxchg -> reset wait_cnt >>> __sbq_wake_up -> decrease wait_cnt >>> ... >>> __sbq_wake_up -> wait_cnt is decreased to 0 again >>> atomic_cmpxchg >>> sbq_index_atomic_inc -> increase wake_index >>> wake_up_nr -> wake up and waitqueue might be empty >>> sbq_index_atomic_inc -> increase again, one waitqueue is skipped >>> wake_up_nr -> invalid wake up because old wakequeue might be empty >>> >>> To fix the problem, increasing 'wake_index' before resetting 'wait_cnt'. >>> >>> 2) 'wait_cnt' can be decreased while waitqueue is empty >>> >>> As pointed out by Jan Kara, following race is possible: >>> >>> CPU1 CPU2 >>> __sbq_wake_up __sbq_wake_up >>> sbq_wake_ptr() sbq_wake_ptr() -> the same >>> wait_cnt = atomic_dec_return() >>> /* decreased to 0 */ >>> sbq_index_atomic_inc() >>> /* move to next waitqueue */ >>> atomic_set() >>> /* reset wait_cnt */ >>> wake_up_nr() >>> /* wake up on the old waitqueue */ >>> wait_cnt = atomic_dec_return() >>> /* >>> * decrease wait_cnt in the old >>> * waitqueue, while it can be >>> * empty. >>> */ >>> >>> Fix the problem by waking up before updating 'wake_index' and >>> 'wait_cnt'. >>> >>> With this patch, noted that 'wait_cnt' is still decreased in the old >>> empty waitqueue, however, the wakeup is redirected to a active waitqueue, >>> and the extra decrement on the old empty waitqueue is not handled. >>> >>> Fixes: 88459642cba4 ("blk-mq: abstract tag allocation out into sbitmap library") >>> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx> >>> Reviewed-by: Jan Kara <jack@xxxxxxx> >>> Link: https://lore.kernel.org/r/20220803121504.212071-1-yukuai1@xxxxxxxxxxxxxxx >>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> >>> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> >> >> I have no authority on linux-block, but I'll say NAK to this one >> (and 517/862), and let Jens and Jan overrule me if they disagree. >> >> This was the first of several 6.1-rc1 commits which had given me lost >> wakeups never suffered before; was not tagged Cc stable; and (unless I've >> missed it on lore) never had AUTOSEL posted to linux-block or linux-kernel. > > Ok, thanks for the review. I'll drop both of the sbitmap.c changes and > if people report issues and want them back, I'll be glad to revisit them > then. Sorry for being late, did see Hugh respond to the original auto-select as well, and was surprised to see it moving forward after that. Let's please drop them for now. -- Jens Axboe