On 2024/6/6 17:52, Ming Lei wrote:
On Thu, Jun 06, 2024 at 04:05:33PM +0800, Li Nan wrote:
On 2024/6/6 12:48, Changhui Zhong wrote:
[...]
Hi Changhui,
The hang is actually expected because recovery fails.
Please pull the latest ublksrv and check if the issue can still be
reproduced:
https://github.com/ublk-org/ublksrv
BTW, one ublksrv segfault and two test cleanup issues are fixed.
Thanks,
Ming
Hi Ming and Nan,
After applying the new patch and pulling the latest ublksrv,
I ran the test for 4 hours and did not observe any task hang.
The test results look good!
Thanks,
Changhui
Thanks for your test!
However, I got a NULL pointer dereference bug with ublksrv. It is not
introduced by this patch. It seems I/O was issued after the disk was
deleted, and it can be reproduced by:
while true; do make test T=generic/004; done
BTW, your patch isn't related to generic/004, which doesn't touch the
recovery code path.
We didn't see that when running this test on Linus' tree, and Changhui
usually runs the generic tests for hours.
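The `while true` loop quoted above never stops by itself. As a minimal sketch, assuming the test command exits non-zero when something goes wrong (a kernel oops may not always surface that way), a bounded variant could stop at the first failing run and report how many runs it took. The helper name `run_until_fail` and the limit of 1000 runs are hypothetical and are not part of ublksrv:

```shell
#!/bin/sh
# run_until_fail MAX CMD [ARGS...]   (hypothetical helper, not part of ublksrv)
# Re-run CMD until it exits non-zero or MAX runs have completed.
# Prints the number of runs performed; returns 1 if a run failed, else 0.
run_until_fail() {
    max=$1
    shift
    i=0
    while [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        if ! "$@"; then
            # Stop at the first failing run and report which one it was.
            echo "$i"
            return 1
        fi
    done
    # All MAX runs passed.
    echo "$i"
    return 0
}

# Example, assuming the current directory is the ublksrv source tree:
#   run_until_fail 1000 make test T=generic/004
```

The run count goes to stdout together with the command's own output, so the last line printed is the count.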
[ 1524.286485] running generic/004
[ 1529.110875] blk_print_req_error: 109 callbacks suppressed
...
[ 1541.171010] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 1541.171734] #PF: supervisor write access in kernel mode
[ 1541.172271] #PF: error_code(0x0002) - not-present page
[ 1541.172798] PGD 0 P4D 0
[ 1541.173065] Oops: Oops: 0002 [#1] PREEMPT SMP
[ 1541.173515] CPU: 0 PID: 43707 Comm: ublk Not tainted 6.9.0-next-20240523-00004-g9bc7e95c7323 #454
[ 1541.174417] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
[ 1541.175311] RIP: 0010:io_fallback_tw+0x252/0x300
This one looks like an io_uring issue.
Could you check which line of source code 'io_fallback_tw+0x252' points to?
gdb> l *(io_fallback_tw+0x252)
(gdb) list * io_fallback_tw+0x252
0xffffffff81d79dc2 is in io_fallback_tw (./arch/x86/include/asm/atomic64_64.h:25).
20          __WRITE_ONCE(v->counter, i);
21      }
22
23      static __always_inline void arch_atomic64_add(s64 i, atomic64_t *v)
24      {
25          asm volatile(LOCK_PREFIX "addq %1,%0"
26                       : "=m" (v->counter)
27                       : "er" (i), "m" (v->counter) : "memory");
28      }
The corresponding code is:
io_fallback_tw
percpu_ref_get(&last_ctx->refs);
I have the vmcore of this issue. If you need anything else, please let me
know.
--
Thanks,
Nan