On Sun, Sep 04, 2022 at 06:39:14AM -0600, Jens Axboe wrote:
> On 9/1/22 10:43 AM, Jens Axboe wrote:
> > On Thu, 25 Aug 2022 07:53:12 -0700, Keith Busch wrote:
> >> From: Keith Busch <kbusch@xxxxxxxxxx>
> >>
> >> Batched completions can clear multiple bits, but we're only decrementing
> >> the wait_cnt by one each time. This can cause waiters to never be woken,
> >> stalling IO. Use the batched count instead.
> >>
> >>
> >> [...]
> >
> > Applied, thanks!
> >
> > [1/1] sbitmap: fix batched wait_cnt accounting
> >       commit: 16ede66973c84f890c03584f79158dd5b2d725f5
>
> This is causing CPU stalls for me running make -j256 with the source
> hosted on an ATA device with QD=32. It's not running with a scheduler.
> It just goes spammy on most/all CPUs so hard to get a real trace out of
> it, but it looks like we're just looping forever off
> sbitmap_queue_wake_up().
>
> I'm going to revert this one for now until we can investigate what is
> going on here.

I was able to reproduce this without much trouble. I think it needs to
restore the wait_cnt if we're racing with wait_active. I think the problem
even exists without this patch ([1]), but you'd be unlikely to hit it
decrementing wait_cnt just one at a time when the wait_batch is > 1.

The diff on top of this patch should fix it:

---
-	if (!waitqueue_active(&ws->wait))
+	if (!waitqueue_active(&ws->wait)) {
+		atomic_add(nr, &ws->wait_cnt);
 		return true;
+	}
--

[1] https://lore.kernel.org/linux-block/Yxe7V3yfBcADoYLE@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#t
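
To illustrate the accounting the diff is after, here is a minimal user-space
sketch in C11. It is a toy model only, not the kernel code: the struct, the
has_waiters flag (standing in for waitqueue_active()) and the restore switch
are all hypothetical names made up for this example, and the real wake-up
path does more than this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for one wait state; not the kernel's struct. */
struct toy_wait_state {
	atomic_int wait_cnt;	/* completions left before the next wake-up */
	bool has_waiters;	/* stands in for waitqueue_active(&ws->wait) */
};

/*
 * Toy version of the wake-up step: consume nr completions from wait_cnt.
 * Returns true when the caller should move on and retry with another
 * wait state, false otherwise.
 */
static bool toy_wake_up(struct toy_wait_state *ws, int nr, bool restore)
{
	assert(nr > 0);

	int cnt = atomic_fetch_sub(&ws->wait_cnt, nr) - nr;
	if (cnt > 0)
		return false;	/* batch not used up yet, nothing to wake */

	if (!ws->has_waiters) {
		/* Nobody to wake: optionally hand the count back. */
		if (restore)
			atomic_fetch_add(&ws->wait_cnt, nr);
		return true;
	}

	/* Waking waiters and re-arming wait_cnt is omitted in this sketch. */
	return false;
}
```

With restore set to false, two calls with nr = 8 against an empty queue that
starts at wait_cnt = 8 leave the counter at -8. With restore set to true, the
counter is back at 8 after each call, which is the effect of the added
atomic_add() in the diff.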