Re: (subset) [PATCH 00/11] remove aux CQE caches

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/17/24 21:32, Jens Axboe wrote:
On 3/17/24 3:29 PM, Pavel Begunkov wrote:
On 3/17/24 21:24, Jens Axboe wrote:
On 3/17/24 2:55 PM, Pavel Begunkov wrote:
On 3/16/24 13:56, Ming Lei wrote:
On Sat, Mar 16, 2024 at 01:27:17PM +0000, Pavel Begunkov wrote:
On 3/16/24 11:52, Ming Lei wrote:
On Fri, Mar 15, 2024 at 04:53:21PM -0600, Jens Axboe wrote:

...

The following two error can be triggered with this patchset
when running some ublk stress test(io vs. deletion). And not see
such failures after reverting the 11 patches.

I suppose it's with the fix from yesterday. How can I
reproduce it, blktests?

Yeah, it needs yesterday's fix.

You may need to run this test multiple times for triggering the problem:

Thanks for all the testing. I've tried it, all ublk/generic tests hang
in userspace waiting for CQEs but no complaints from the kernel.
However, it seems the branch is buggy even without my patches, I
consistently (5-15 minutes of running in a slow VM) hit page underflow
by running liburing tests. Not sure what is that yet, but might also
be the reason.

Hmm odd, there's nothing in there but your series and then the
io_uring-6.9 bits pulled in. Maybe it hit an unfortunate point in the
merge window -git cycle? Does it happen with io_uring-6.9 as well? I
haven't seen anything odd.

Need to test io_uring-6.9. I actually checked the branch twice, both
with the issue, and by full recompilation and config prompts I assumed
you pulled something in between (maybe not).

And yeah, I can't confirm it's specifically an io_uring bug, the
stack trace is usually some unmap or task exit, sometimes it only
shows when you try to shutdown the VM after tests.

Funky. I just ran a bunch of loops of liburing tests and Ming's ublksrv
test case as well on io_uring-6.9 and it all worked fine. Trying
liburing tests on for-6.10/io_uring as well now, but didn't see anything
the other times I ran it. In any case, once you repost I'll rebase and
then let's see if it hits again.

Did you run with KASAN enabled

Yes, it's a debug kernel, full on KASANs, lockdeps and so

--
Pavel Begunkov




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux