On Mon, Feb 10, 2025 at 3:08 PM David Howells <dhowells@xxxxxxxxxx> wrote: > Can you apply the attached patch to your kernel, and then run with: > > echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_read/enable > echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_rreq/enable > echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_sreq/enable > echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_failure/enable > echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_donate/enable > echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_progress/enable > > enabled. If you can capture the trace output (and compress it!), that should > hopefully help debug this. I needed some more tricks to be able to capture the trace of a crashing kernel, but here it is (a minimal trace because I now figured out how to reproduce the bug with a single command on a disabled machine). Additionally, here you have a file with a dump of *subreq and *req (of R=8). I have added additional fields "start0" and "len0" which contain the values initially set on these. Kernel output just before the crash: 3,1644,103679537,-;netfs: Can't donate prior to front 4,1645,103679550,-;R=00000007[6] s=187000-1bffff 0/39000/39000 4,1646,103679562,-;folio: 180000-1bffff 4,1647,103679568,-;donated: prev=0 next=0 4,1648,103679574,-;s=187000 av=39000 part=39000 3,1649,103708094,-;netfs: Can't donate prior to front 4,1650,103708105,-;R=00000008[7] s=1c6000-1fffff 0/3a000/3a000 4,1651,103708117,-;folio: 1c0000-1fffff 4,1652,103708123,-;donated: prev=0 next=0 4,1653,103708129,-;s=1c6000 av=3a000 part=3a000 The trace file contains just the R=8 request, but I guess this one is enough for you. Max
kworker/u193:5-3716 [003] ..... 103.651806: netfs_sreq: R=00000008[1] WRIT PREP f=00 s=c0000 0/0 e=0 kworker/u193:5-3716 [003] ..... 103.651808: netfs_sreq: R=00000008[1] WRIT SUBMT f=100 s=c0000 0/40000 e=0 kworker/3:2-4269 [003] ..... 103.652314: netfs_sreq: R=00000008[1] WRIT TERM f=100 s=c0000 40000/40000 e=0 kworker/u193:5-3716 [003] ..... 103.652343: netfs_rreq: R=00000008 2C COLLECT f=2020 kworker/u193:5-3716 [003] ..... 103.652349: netfs_sreq: R=00000008[1] WRIT FREE f=00 s=c0000 40000/40000 e=0 kworker/u193:5-3716 [003] ..... 103.652357: netfs_rreq: R=00000008 2C WR-DONE f=2020 kworker/u193:5-3716 [003] ..... 103.652360: netfs_rreq: R=00000008 2C WAKE-IP f=2020 kworker/u193:5-3716 [003] ..... 103.652362: netfs_rreq: R=00000008 2C FREE f=2000 cp-4373 [038] ..... 103.724132: netfs_read: R=00000008 READAHEAD c=00000002 ni=76464c3 s=1c0000 l=40000 sz=776749 cp-4373 [038] b.... 103.724170: netfs_sreq: R=00000008[1] ---- ADD f=00 s=1c0000 0/40000 e=0 cp-4373 [038] b.... 103.724470: netfs_sreq: R=00000008[2] ---- ADD f=00 s=1d0000 0/30000 e=0 cp-4373 [038] b.... 103.724753: netfs_sreq: R=00000008[3] ---- ADD f=00 s=1d4000 0/2c000 e=0 cp-4373 [038] ..... 103.724767: netfs_sreq: R=00000008[3] READ SUBMT f=00 s=1d4000 0/3000 e=0 cp-4373 [038] b.... 103.724845: netfs_sreq: R=00000008[4] ---- ADD f=00 s=1d7000 0/29000 e=0 cp-4373 [038] b.... 103.725119: netfs_sreq: R=00000008[5] ---- ADD f=00 s=1e7000 0/19000 e=0 cp-4373 [038] b.... 103.725389: netfs_sreq: R=00000008[6] ---- ADD f=00 s=1f1000 0/f000 e=0 cp-4373 [038] ..... 103.725403: netfs_sreq: R=00000008[6] READ SUBMT f=00 s=1f1000 0/3000 e=0 cp-4373 [038] b.... 103.725478: netfs_sreq: R=00000008[7] ---- ADD f=00 s=1f4000 0/c000 e=0 cp-4373 [038] b.... 103.725751: netfs_sreq: R=00000008[8] ---- ADD f=00 s=1fa000 0/6000 e=0 cp-4373 [038] ..... 103.725767: netfs_sreq: R=00000008[8] READ SUBMT f=00 s=1fa000 0/6000 e=0 <idle>-0 [038] ..s.. 103.729472: netfs_progress: R=00000008[06] s=1f1000 ct=0/3000 pa=3000/3000 sl=0 <idle>-0 [038] b.s.. 103.729482: netfs_donate: R=00000008[06] -> [07] to-next am=3000 <idle>-0 [038] b.s.. 103.729483: netfs_sreq: R=00000008[6] READ DON-N f=00 s=1f1000 3000/3000 e=0 <idle>-0 [038] ..s.. 103.729486: netfs_sreq: R=00000008[6] READ TERM f=10 s=1f1000 0/0 e=0 <idle>-0 [038] ..s.. 103.729487: netfs_sreq: R=00000008[6] READ FREE f=10 s=1f1000 0/0 e=0 <idle>-0 [038] ..s.. 103.735784: netfs_progress: R=00000008[03] s=1d4000 ct=0/3000 pa=3000/3000 sl=0 <idle>-0 [038] b.s.. 103.735792: netfs_donate: R=00000008[03] -> [04] to-next am=3000 <idle>-0 [038] b.s.. 103.735793: netfs_sreq: R=00000008[3] READ DON-N f=00 s=1d4000 3000/3000 e=0 <idle>-0 [038] ..s.. 103.735796: netfs_sreq: R=00000008[3] READ TERM f=10 s=1d4000 0/0 e=0 <idle>-0 [038] ..s.. 103.735798: netfs_sreq: R=00000008[3] READ FREE f=10 s=1d4000 0/0 e=0 <idle>-0 [038] ..s.. 103.736603: netfs_progress: R=00000008[08] s=1fa000 ct=0/6000 pa=6000/6000 sl=0 <idle>-0 [038] b.s.. 103.736612: netfs_donate: R=00000008[08] -> [07] tail-to-prev am=6000 <idle>-0 [038] b.s.. 103.736613: netfs_sreq: R=00000008[8] READ DON-P f=00 s=200000 0/0 e=0 <idle>-0 [038] ..s.. 103.736616: netfs_sreq: R=00000008[8] READ TERM f=10 s=200000 0/0 e=0 <idle>-0 [038] ..s.. 103.736618: netfs_sreq: R=00000008[8] READ FREE f=10 s=200000 0/0 e=0 kworker/u193:5-3716 [003] ..... 103.780260: netfs_progress: R=00000008[01] s=1c0000 ct=0/10000 pa=10000/10000 sl=0 kworker/u193:5-3716 [003] b.... 103.780269: netfs_donate: R=00000008[01] -> [02] to-next am=10000 kworker/u193:5-3716 [003] b.... 103.780271: netfs_sreq: R=00000008[1] DOWN DON-N f=01 s=1c0000 10000/10000 e=0 kworker/u193:5-3716 [003] ..... 103.780272: netfs_sreq: R=00000008[1] DOWN TERM f=11 s=1c0000 0/0 e=0 kworker/u193:5-3716 [003] ..... 103.780273: netfs_sreq: R=00000008[1] DOWN FREE f=11 s=1c0000 0/0 e=0 kworker/u193:5-3716 [003] b.... 103.781527: netfs_sreq: R=00000008[2] DOWN +DON f=01 s=1c0000 14000/14000 e=0 kworker/u193:5-3716 [003] ..... 103.781536: netfs_progress: R=00000008[02] s=1c0000 ct=0/14000 pa=14000/14000 sl=0 kworker/u193:5-3716 [003] b.... 103.781538: netfs_donate: R=00000008[02] -> [04] to-next am=14000 kworker/u193:5-3716 [003] b.... 103.781539: netfs_sreq: R=00000008[2] DOWN DON-N f=01 s=1c0000 14000/14000 e=0 kworker/u193:5-3716 [003] ..... 103.781540: netfs_sreq: R=00000008[2] DOWN TERM f=11 s=1c0000 0/0 e=0 kworker/u193:5-3716 [003] ..... 103.781541: netfs_sreq: R=00000008[2] DOWN FREE f=11 s=1c0000 0/0 e=0 kworker/u193:5-3716 [003] b.... 103.783277: netfs_sreq: R=00000008[4] DOWN +DON f=01 s=1c3000 24000/24000 e=0 kworker/u193:5-3716 [003] ..... 103.783286: netfs_progress: R=00000008[04] s=1c3000 ct=0/24000 pa=24000/24000 sl=0 kworker/u193:5-3716 [003] b.... 103.783288: netfs_donate: R=00000008[04] -> [05] to-next am=24000 kworker/u193:5-3716 [003] b.... 103.783289: netfs_sreq: R=00000008[4] DOWN DON-N f=01 s=1c3000 24000/24000 e=0 kworker/u193:5-3716 [003] ..... 103.783290: netfs_sreq: R=00000008[4] DOWN TERM f=11 s=1c3000 0/0 e=0 kworker/u193:5-3716 [003] ..... 103.783291: netfs_sreq: R=00000008[4] DOWN FREE f=11 s=1c3000 0/0 e=0 kworker/u193:5-3716 [003] b.... 103.784366: netfs_sreq: R=00000008[5] DOWN +DON f=01 s=1c3000 2e000/2e000 e=0 kworker/u193:5-3716 [003] ..... 103.784372: netfs_progress: R=00000008[05] s=1c3000 ct=0/2e000 pa=2e000/2e000 sl=0 kworker/u193:5-3716 [003] b.... 103.784374: netfs_donate: R=00000008[05] -> [07] to-next am=2e000 kworker/u193:5-3716 [003] b.... 103.784375: netfs_sreq: R=00000008[5] DOWN DON-N f=01 s=1c3000 2e000/2e000 e=0 kworker/u193:5-3716 [003] ..... 103.784376: netfs_sreq: R=00000008[5] DOWN TERM f=11 s=1c3000 0/0 e=0 kworker/u193:5-3716 [003] ..... 103.784377: netfs_sreq: R=00000008[5] DOWN FREE f=11 s=1c3000 0/0 e=0 kworker/u193:5-3716 [003] b.... 103.785425: netfs_sreq: R=00000008[7] DOWN +DON f=01 s=1c6000 3a000/3a000 e=0 kworker/u193:5-3716 [003] ..... 103.785432: netfs_progress: R=00000008[07] s=1c6000 ct=0/3a000 pa=3a000/3a000 sl=0
(gdb) p/x *rreq $3 = { { work = { data = { counter = 0xfffffffe00000 }, entry = { next = 0xffff8898b7df6908, prev = 0xffff8898b7df6908 }, func = 0xffffffff81b44e90 }, rcu = { next = 0xfffffffe00000, func = 0xffff8898b7df6908 } }, inode = 0xffff889886939518, mapping = 0xffff889886939670, iocb = 0x0, cache_resources = { ops = 0xffffffff845fd480, cache_priv = 0xffff8898b876ed20, cache_priv2 = 0xffff8898a7b29700, debug_id = 0x2, inval_counter = 0x0 }, ractl = 0xffff8898b88b7980, proc_link = { next = 0xffffffff857668a0, prev = 0xffff8898b7df3c60 }, subrequests = { next = 0xffff8898b87fd568, prev = 0xffff8898b87fd568 }, io_streams = {{ construct = 0x0, sreq_max_len = 0x10000, sreq_max_segs = 0x0, submit_off = 0x0, submit_len = 0x0, submit_extendable_to = 0x0, prepare_write = 0x0, issue_write = 0x0, subrequests = { next = 0xffff8898b7df69b0, prev = 0xffff8898b7df69b0 }, front = 0x0, collected_to = 0x0, transferred = 0x0, source = 0x0, error = 0x0, stream_nr = 0x0, avail = 0x0, active = 0x0, need_retry = 0x0, failed = 0x0 }, { construct = 0x0, sreq_max_len = 0x0, sreq_max_segs = 0x0, submit_off = 0x0, submit_len = 0x0, submit_extendable_to = 0x0, prepare_write = 0x0, issue_write = 0x0, subrequests = { next = 0xffff8898b7df6a18, prev = 0xffff8898b7df6a18 }, front = 0x0, collected_to = 0x0, transferred = 0x0, source = 0x0, error = 0x0, stream_nr = 0x0, avail = 0x0, active = 0x0, need_retry = 0x0, failed = 0x0 }}, group = 0x0, buffer = 0xffff8898b8000000, buffer_tail = 0xffff8898b8000000, iter = { iter_type = 0x4, nofault = 0x0, data_source = 0x0, iov_offset = 0x0, { __ubuf_iovec = { iov_base = 0xffff8898b8000000, iov_len = 0x0 }, { { __iov = 0xffff8898b8000000, kvec = 0xffff8898b8000000, bvec = 0xffff8898b8000000, folioq = 0xffff8898b8000000, xarray = 0xffff8898b8000000, ubuf = 0xffff8898b8000000 }, count = 0x0 } }, { nr_segs = 0x1, folioq_slot = 0x1, xarray_start = 0x1 } }, io_iter = { iter_type = 0x0, nofault = 0x0, data_source = 0x0, iov_offset = 0x0, { __ubuf_iovec = { iov_base = 0x0, iov_len = 0x0 }, { { __iov = 0x0, kvec = 0x0, bvec = 0x0, folioq = 0x0, xarray = 0x0, ubuf = 0x0 }, count = 0x0 } }, { nr_segs = 0x0, folioq_slot = 0x0, xarray_start = 0x0 } }, netfs_priv = 0xffff8898b041fa00, netfs_priv2 = 0x0, direct_bv = 0x0, direct_bv_count = 0x0, debug_id = 0x8, rsize = 0x0, wsize = 0x7fffffff, subreq_counter = { counter = 0x8 }, nr_group_rel = 0x0, lock = { { rlock = { raw_lock = { { val = { counter = 0x0 }, { locked = 0x0, pending = 0x0 }, { locked_pending = 0x0, tail = 0x0 } } } } } }, nr_outstanding = { counter = 0x1 }, submitted = 0x200000, len0 = 0x40000, len = 0x40000, transferred = 0x0, error = 0x0, origin = 0x0, direct_bv_unpin = 0x0, buffer_head_slot = 0x0, buffer_tail_slot = 0x0, i_size = 0x776749, start0 = 0x1c0000, start = 0x1c0000, issued_to = { counter = 0x0 }, collected_to = 0x0, cleaned_to = 0x0, no_unlock_folio = 0x0, prev_donated = 0x0, ref = { refs = { counter = 0x1 } }, flags = 0x80000020, netfs_ops = 0xffffffff8459dba0, cleanup = 0x0 } (gdb) p/x *subreq $4 = { rreq = 0xffff8898b7df6900, work = { data = { counter = 0xfffffffe00000 }, entry = { next = 0xffff8898b87fd550, prev = 0xffff8898b87fd550 }, func = 0x0 }, rreq_link = { next = 0xffff8898b7df6970, prev = 0xffff8898b7df6970 }, io_iter = { iter_type = 0x4, nofault = 0x0, data_source = 0x0, iov_offset = 0x34000, { __ubuf_iovec = { iov_base = 0xffff8898b8000000, iov_len = 0x6000 }, { { __iov = 0xffff8898b8000000, kvec = 0xffff8898b8000000, bvec = 0xffff8898b8000000, folioq = 0xffff8898b8000000, xarray = 0xffff8898b8000000, ubuf = 0xffff8898b8000000 }, count = 0x6000 } }, { nr_segs = 0x0, folioq_slot = 0x0, xarray_start = 0x0 } }, start0 = 0x1f4000, start = 0x1c6000, len0 = 0xc000, len = 0x3a000, transferred = 0x3a000, consumed = 0x0, prev_donated = 0x0, next_donated = 0x0, ref = { refs = { counter = 0x2 } }, error = 0x0, debug_index = 0x7, nr_segs = 0x0, retry_count = 0x0, source = 0x2, stream_nr = 0x0, curr_folioq_slot = 0x0, curr_folio_order = 0x6, curr_folioq = 0xffff8898b8000000, flags = 0x1 }