On Sun, Aug 4, 2024 at 12:48 AM Yafang Shao <laoar.shao@xxxxxxxxx> wrote: > > On Sat, Aug 3, 2024 at 3:05 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote: > > > > On Wed, Jul 31, 2024 at 7:47 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote: > > > > > > On Thu, Aug 1, 2024 at 2:46 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote: > > > > > > > > On Wed, Jul 31, 2024 at 10:52 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote: > > > > > > > > > > On Tue, Jul 30, 2024 at 7:14 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote: > > > > > > > > > > > > On Wed, Jul 31, 2024 at 2:16 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote: > > > > > > > > > > > > > > On Mon, Jul 29, 2024 at 11:00 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote: > > > > > > > > > > > > > > > > On Tue, Jul 30, 2024 at 8:28 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > There are situations where fuse servers can become unresponsive or take > > > > > > > > > too long to reply to a request. Currently there is no upper bound on > > > > > > > > > how long a request may take, which may be frustrating to users who get > > > > > > > > > stuck waiting for a request to complete. > > > > > > > > > > > > > > > > > > This patchset adds a timeout option for requests and two dynamically > > > > > > > > > configurable fuse sysctls "default_request_timeout" and "max_request_timeout" > > > > > > > > > for controlling/enforcing timeout behavior system-wide. > > > > > > > > > > > > > > > > > > Existing fuse servers will not be affected unless they explicitly opt into the > > > > > > > > > timeout. > > > > > > > > > > > > > > > > > > v1: https://lore.kernel.org/linux-fsdevel/20240717213458.1613347-1-joannelkoong@xxxxxxxxx/ > > > > > > > > > Changes from v1: > > > > > > > > > - Add timeout for background requests > > > > > > > > > - Handle resend race condition > > > > > > > > > - Add sysctls > > > > > > > > > > > > > > > > > > Joanne Koong (2): > > > > > > > > > fuse: add optional kernel-enforced timeout for requests > > > > > > > > > fuse: add default_request_timeout and max_request_timeout sysctls > > > > > > > > > > > > > > > > > > Documentation/admin-guide/sysctl/fs.rst | 17 +++ > > > > > > > > > fs/fuse/Makefile | 2 +- > > > > > > > > > fs/fuse/dev.c | 187 +++++++++++++++++++++++- > > > > > > > > > fs/fuse/fuse_i.h | 30 ++++ > > > > > > > > > fs/fuse/inode.c | 24 +++ > > > > > > > > > fs/fuse/sysctl.c | 42 ++++++ > > > > > > > > > 6 files changed, 293 insertions(+), 9 deletions(-) > > > > > > > > > create mode 100644 fs/fuse/sysctl.c > > > > > > > > > > > > > > > > > > -- > > > > > > > > > 2.43.0 > > > > > > > > > > > > > > > > > > > > > > > > > Hello Joanne, > > > > > > > > > > > > > > > > Thanks for your update. > > > > > > > > > > > > > > > > I have tested your patches using my test case, which is similar to the > > > > > > > > hello-fuse [0] example, with an additional change as follows: > > > > > > > > > > > > > > > > @@ -125,6 +125,8 @@ static int hello_read(const char *path, char *buf, > > > > > > > > size_t size, off_t offset, > > > > > > > > } else > > > > > > > > size = 0; > > > > > > > > > > > > > > > > + // TO trigger timeout > > > > > > > > + sleep(60); > > > > > > > > return size; > > > > > > > > } > > > > > > > > > > > > > > > > [0] https://github.com/libfuse/libfuse/blob/master/example/hello.c > > > > > > > > > > > > > > > > However, it triggered a crash with the following setup: > > > > > > > > > > > > > > > > 1. Set FUSE timeout: > > > > > > > > sysctl -w fs.fuse.default_request_timeout=10 > > > > > > > > sysctl -w fs.fuse.max_request_timeout = 20 > > > > > > > > > > > > > > > > 2. Start FUSE daemon: > > > > > > > > ./hello /tmp/fuse > > > > > > > > > > > > > > > > 3. Read from FUSE: > > > > > > > > cat /tmp/fuse/hello > > > > > > > > > > > > > > > > 4. Kill the process within 10 seconds (to avoid the timeout being triggered). > > > > > > > > Then the crash will be triggered. > > > > > > > > > > > > > > Hi Yafang, > > > > > > > > > > > > > > Thanks for trying this out on your use case! > > > > > > > > > > > > > > How consistently are you able to repro this? > > > > > > > > > > > > It triggers the crash every time. > > > > > > > > > > > > > I tried reproing using > > > > > > > your instructions above but I'm not able to get the crash. > > > > > > > > > > > > Please note that it is the `cat /tmp/fuse/hello` process that was > > > > > > killed, not the fuse daemon. > > > > > > The crash seems to occur when the fuse daemon wakes up after > > > > > > sleep(60). Please ensure that the fuse daemon can be woken up. > > > > > > > > > > > > > > > > I'm still not able to trigger the crash by killing the `cat > > > > > /tmp/fuse/hello` process. This is how I'm repro-ing > > > > > > > > > > 1) Add sleep to test code in > > > > > https://github.com/libfuse/libfuse/blob/master/example/hello.c > > > > > @@ -125,6 +126,9 @@ static int hello_read(const char *path, char *buf, > > > > > size_t size, off_t offset, > > > > > } else > > > > > size = 0; > > > > > > > > > > + sleep(60); > > > > > + printf("hello_read woke up from sleep\n"); > > > > > + > > > > > return size; > > > > > } > > > > > > > > > > 2) Set fuse timeout to 10 seconds > > > > > sysctl -w fs.fuse.default_request_timeout=10 > > > > > > > > > > 3) Start fuse daemon > > > > > ./example/hello ./tmp/fuse > > > > > > > > > > 4) Read from fuse > > > > > cat /tmp/fuse/hello > > > > > > > > > > 5) Get pid of cat process > > > > > top -b | grep cat > > > > > > > > > > 6) Kill cat process (within 10 seconds) > > > > > sudo kill -9 <cat-pid> > > > > > > > > > > 7) Wait 60 seconds for fuse's read request to complete > > > > > > > > > > From what it sounds like, this is exactly what you are doing as well? > > > > > > > > > > I added some kernel-side logs and I'm seeing that the read request is > > > > > timing out after ~10 seconds and handled by the timeout handler > > > > > successfully. > > > > > > > > > > On the fuse daemon side, these are the logs I'm seeing from the above repro: > > > > > ./example/hello /tmp/fuse -f -d > > > > > > > > > > FUSE library version: 3.17.0 > > > > > nullpath_ok: 0 > > > > > unique: 2, opcode: INIT (26), nodeid: 0, insize: 104, pid: 0 > > > > > INIT: 7.40 > > > > > flags=0x73fffffb > > > > > max_readahead=0x00020000 > > > > > INIT: 7.40 > > > > > flags=0x4040f039 > > > > > max_readahead=0x00020000 > > > > > max_write=0x00100000 > > > > > max_background=0 > > > > > congestion_threshold=0 > > > > > time_gran=1 > > > > > unique: 2, success, outsize: 80 > > > > > unique: 4, opcode: LOOKUP (1), nodeid: 1, insize: 46, pid: 673 > > > > > LOOKUP /hello > > > > > getattr[NULL] /hello > > > > > NODEID: 2 > > > > > unique: 4, success, outsize: 144 > > > > > unique: 6, opcode: OPEN (14), nodeid: 2, insize: 48, pid: 673 > > > > > open flags: 0x8000 /hello > > > > > open[0] flags: 0x8000 /hello > > > > > unique: 6, success, outsize: 32 > > > > > unique: 8, opcode: READ (15), nodeid: 2, insize: 80, pid: 673 > > > > > read[0] 4096 bytes from 0 flags: 0x8000 > > > > > unique: 10, opcode: FLUSH (25), nodeid: 2, insize: 64, pid: 673 > > > > > unique: 10, error: -38 (Function not implemented), outsize: 16 > > > > > unique: 11, opcode: INTERRUPT (36), nodeid: 0, insize: 48, pid: 0 > > > > > FUSE_INTERRUPT: reply to kernel to disable interrupt > > > > > unique: 11, error: -38 (Function not implemented), outsize: 16 > > > > > > > > > > unique: 12, opcode: RELEASE (18), nodeid: 2, insize: 64, pid: 0 > > > > > unique: 12, success, outsize: 16 > > > > > > > > > > hello_read woke up from sleep > > > > > read[0] 13 bytes from 0 > > > > > unique: 8, success, outsize: 29 > > > > > > > > > > > > > > > Are these the debug logs you are seeing from the daemon side as well? > > > > > > > > > > Thanks, > > > > > Joanne > > > > > > > > > > > > > > From the crash logs you provided below, it looks like what's happening > > > > > > > is that if the process gets killed, the timer isn't getting deleted. > > > > > > > > When I looked at this log previously, I thought you were repro-ing by > > > > killing the fuse daemon process, not the cat process. When we kill the > > > > cat process, the timer shouldn't be getting deleted. (if the daemon > > > > itself is killed, the timers get deleted) > > > > > > > > > > > I'll look more into what happens in fuse when a process is killed and > > > > > > > get back to you on this. > > > > > > > > This is the flow of what is happening on the kernel side (verified by > > > > local printks) - > > > > > > > > `cat /tmp/fuse/hello`: > > > > Issues a FUSE_READ background request (via fuse_send_readpages(), > > > > fm->fc->async_read). This request will have a timeout of 10 seconds on > > > > it > > > > > > > > The cat process is killed: > > > > This does not clean up the request. The request is still on the fpq > > > > processing list. > > > > > > > > Timeout on request expires: > > > > The timeout handler runs and properly cleans up / frees the request. > > > > > > > > Fuse daemon wakes from sleep and replies to the request: > > > > In dev_do_write(), the kernel won't be able to find this request > > > > (since it timed out and was removed from the fpq processing list) and > > > > return with -ENOENT > > > > > > Thank you for your explanation. > > > I will verify if there are any issues with my test environment. > > > > > Hi Yafang, > > > > Would you mind adding these printks to your kernel when you run the > > repro and pasting what they show? > > > > --- a/fs/fuse/dev.c > > +++ b/fs/fuse/dev.c > > @@ -287,6 +287,9 @@ static void do_fuse_request_end(struct fuse_req > > *req, bool from_timer_callback) > > struct fuse_conn *fc = fm->fc; > > struct fuse_iqueue *fiq = &fc->iq; > > > > + printk("do_fuse_request_end: req=%p, from_timer=%d, > > req->timer.func=%d\n", > > + req, from_timer_callback, req->timer.function != NULL); > > + > > if (from_timer_callback) > > req->out.h.error = -ETIME; > > > > @@ -415,6 +418,8 @@ static void fuse_request_timeout(struct timer_list *timer) > > { > > struct fuse_req *req = container_of(timer, struct fuse_req, timer); > > > > + printk("fuse_request_timeout: req=%p\n", req); > > + > > /* > > * Request reply is being finished by the kernel right now. > > * No need to time out the request. > > @@ -612,6 +617,7 @@ ssize_t fuse_simple_request(struct fuse_mount *fm, > > struct fuse_args *args) > > > > if (!args->noreply) > > __set_bit(FR_ISREPLY, &req->flags); > > + printk("fuse_simple_request: req=%p, op=%u\n", req, args->opcode); > > __fuse_request_send(req); > > ret = req->out.h.error; > > if (!ret && args->out_argvar) { > > @@ -673,6 +679,7 @@ int fuse_simple_background(struct fuse_mount *fm, > > struct fuse_args *args, > > > > fuse_args_to_req(req, args); > > > > + printk("fuse_background_request: req=%p, op=%u\n", req, args->opcode); > > if (!fuse_request_queue_background(req)) { > > fuse_put_request(req); > > > > > > When I run it on my side, I see > > > > [ 68.117740] fuse_background_request: req=00000000874e2f14, op=26 > > [ 68.131440] do_fuse_request_end: req=00000000874e2f14, > > from_timer=0, req->timer.func=1 > > [ 71.558538] fuse_simple_request: req=00000000cf643ace, op=1 > > [ 71.559651] do_fuse_request_end: req=00000000cf643ace, > > from_timer=0, req->timer.func=1 > > [ 71.561044] fuse_simple_request: req=00000000f2c001f0, op=14 > > [ 71.562524] do_fuse_request_end: req=00000000f2c001f0, > > from_timer=0, req->timer.func=1 > > [ 71.563820] fuse_background_request: req=00000000584f2cc3, op=15 > > [ 78.580035] fuse_simple_request: req=00000000ecbee970, op=25 > > [ 78.582614] do_fuse_request_end: req=00000000ecbee970, > > from_timer=0, req->timer.func=1 > > [ 81.624722] fuse_request_timeout: req=00000000584f2cc3 > > [ 81.625443] do_fuse_request_end: req=00000000584f2cc3, > > from_timer=1, req->timer.func=1 > > [ 81.626377] fuse_background_request: req=00000000b2d792ed, op=18 > > [ 81.627623] do_fuse_request_end: req=00000000b2d792ed, > > from_timer=0, req->timer.func=1 > > > > I'm seeing only one timer get called, on the read request (opcode=15), > > and I'm not seeing do_fuse_request_end having been called on that > > request before the timer is invoked. > > I'm curious to compare this against the logs on your end. > > The log on my side is as follows, Thank you Yafang. These logs are very helpful. > > [ 283.329421] fuse_background_request: req=000000002b4f82d4, op=26 > [ 283.330043] do_fuse_request_end: req=000000002b4f82d4, > from_timer=0, req->timer.func=0 > [ 287.889844] fuse_simple_request: req=00000000865e85bf, op=3 > [ 287.889914] do_fuse_request_end: req=00000000865e85bf, > from_timer=0, req->timer.func=0 > [ 287.889933] fuse_simple_request: req=00000000865e85bf, op=22 > [ 287.889994] do_fuse_request_end: req=00000000865e85bf, > from_timer=0, req->timer.func=0 > [ 287.890096] fuse_simple_request: req=00000000865e85bf, op=27 > [ 287.890130] do_fuse_request_end: req=00000000865e85bf, > from_timer=0, req->timer.func=0 > [ 287.890142] fuse_simple_request: req=00000000865e85bf, op=28 > [ 287.890167] do_fuse_request_end: req=00000000865e85bf, > from_timer=0, req->timer.func=0 > [ 287.890178] fuse_simple_request: req=00000000865e85bf, op=1 > [ 287.890191] do_fuse_request_end: req=00000000865e85bf, > from_timer=0, req->timer.func=0 > [ 287.890209] fuse_simple_request: req=00000000865e85bf, op=28 > [ 287.890216] do_fuse_request_end: req=00000000865e85bf, > from_timer=0, req->timer.func=0 > [ 287.890222] fuse_background_request: req=00000000865e85bf, op=29 > [ 287.890230] do_fuse_request_end: req=00000000865e85bf, > from_timer=0, req->timer.func=0 > [ 312.311752] fuse_background_request: req=00000000a8da8b44, op=26 > [ 312.312249] do_fuse_request_end: req=00000000a8da8b44, > from_timer=0, req->timer.func=1 > [ 317.368786] fuse_simple_request: req=00000000bc4817dd, op=1 > [ 317.368871] do_fuse_request_end: req=00000000bc4817dd, > from_timer=0, req->timer.func=1 > [ 317.368910] fuse_simple_request: req=00000000bc4817dd, op=14 > [ 317.368942] do_fuse_request_end: req=00000000bc4817dd, > from_timer=0, req->timer.func=1 > [ 317.368967] fuse_simple_request: req=00000000bc4817dd, op=15 > [ 327.855189] fuse_request_timeout: req=00000000bc4817dd > [ 327.855195] do_fuse_request_end: req=00000000bc4817dd, > from_timer=1, req->timer.func=1 > [ 327.855218] fuse_simple_request: req=00000000c34cc363, op=15 > [ 327.855328] fuse_simple_request: req=00000000c34cc363, op=25 > [ 327.855401] do_fuse_request_end: req=00000000c34cc363, > from_timer=0, req->timer.func=1 > [ 327.855496] fuse_background_request: req=00000000c34cc363, op=18 > [ 327.855508] do_fuse_request_end: req=00000000c34cc363, > from_timer=0, req->timer.func=1 > [ 338.095136] Oops: general protection fault, probably for > non-canonical address 0xdead00000000012a: 0000 [#1] PREEMPT SMP NOPTI > [ 338.096415] CPU: 58 PID: 0 Comm: swapper/58 Kdump: loaded Not > tainted 6.10.0+ #8 > [ 338.098219] RIP: 0010:__run_timers+0x27e/0x360 > [ 338.098686] Code: 07 48 c7 43 08 00 00 00 00 48 85 c0 74 78 4d 8b > 2f 4c 89 6b 08 0f 1f 44 00 00 49 8b 45 00 49 8b 55 08 48 89 02 48 85 > c0 74 04 <48> 89 50 08 4d 8b 65 18 49 c7 45 08 00 00 00 00 48 b8 22 01 > 00 00 > [ 338.100381] RSP: 0018:ffffb4ef808bced8 EFLAGS: 00010086 > [ 338.100907] RAX: dead000000000122 RBX: ffff9827ffca13c0 RCX: 0000000000000001 > [ 338.101623] RDX: ffffb4ef808bcef8 RSI: 0000000000000000 RDI: ffff9827ffca13e8 > [ 338.102333] RBP: ffffb4ef808bcf70 R08: 000000000000008b R09: ffff9827ffca1430 > [ 338.103020] R10: ffffffff93e060c0 R11: 0000000000000089 R12: 0000000000000001 > [ 338.103726] R13: ffff97e9dc06a0a0 R14: 0000000100009200 R15: ffffb4ef808bcef8 > [ 338.104439] FS: 0000000000000000(0000) GS:ffff9827ffc80000(0000) > knlGS:0000000000000000 > [ 338.105229] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 338.105795] CR2: 000000c002f99340 CR3: 0000000148254001 CR4: 0000000000370ef0 > [ 338.106502] Call Trace: > [ 338.106836] <IRQ> > [ 338.107175] ? show_regs+0x69/0x80 > [ 338.107603] ? die_addr+0x38/0x90 > [ 338.108005] ? exc_general_protection+0x236/0x490 > [ 338.108557] ? asm_exc_general_protection+0x27/0x30 > [ 338.109095] ? __run_timers+0x27e/0x360 > [ 338.109563] ? __run_timers+0x1b4/0x360 > [ 338.110009] ? kvm_sched_clock_read+0x11/0x20 > [ 338.110528] ? sched_clock_noinstr+0x9/0x10 > [ 338.111002] ? sched_clock+0x10/0x30 > [ 338.111447] ? sched_clock_cpu+0x10/0x190 > [ 338.111914] run_timer_softirq+0x3a/0x60 > [ 338.112406] handle_softirqs+0x118/0x350 > [ 338.112859] irq_exit_rcu+0x60/0x80 > [ 338.113295] sysvec_apic_timer_interrupt+0x7f/0x90 > [ 338.113823] </IRQ> > [ 338.114147] <TASK> > [ 338.114447] asm_sysvec_apic_timer_interrupt+0x1b/0x20 > [ 338.115002] RIP: 0010:default_idle+0xb/0x20 > [ 338.115498] Code: 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 6e ff ff ff 90 > 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 eb 07 0f 00 2d b3 51 33 > 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 > 00 90 > [ 338.117337] RSP: 0018:ffffb4ef8028fe18 EFLAGS: 00000246 > [ 338.117894] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 0001b48ebb3a1032 > [ 338.118673] RDX: 0000000000000001 RSI: ffffffff9412e060 RDI: ffff9827ffcbc8e0 > [ 338.119415] RBP: ffffb4ef8028fe20 R08: 0000004eb7fb01b4 R09: 0000000000000001 > [ 338.120151] R10: ffffffff93e56080 R11: 0000000000000001 R12: 0000000000000001 > [ 338.120872] R13: ffffffff9412e060 R14: ffffffff9412e0e0 R15: 0000000000000001 > [ 338.121615] ? ct_kernel_exit.constprop.0+0x79/0x90 > [ 338.122171] ? arch_cpu_idle+0x9/0x10 > [ 338.122602] default_enter_idle+0x22/0x2f > [ 338.123064] cpuidle_enter_state+0x88/0x430 > [ 338.123556] cpuidle_enter+0x34/0x50 > [ 338.123978] call_cpuidle+0x22/0x50 > [ 338.124449] cpuidle_idle_call+0xd2/0x120 > [ 338.124909] do_idle+0x77/0xd0 > [ 338.125313] cpu_startup_entry+0x2c/0x30 > [ 338.125763] start_secondary+0x117/0x140 > [ 338.126240] common_startup_64+0x13e/0x141 > [ 338.126711] </TASK> > > In addition to the hello-fuse, there is another FUSE daemon, lxcfs, > running on my test server. After disabling lxcfs, the system no longer > panics, but there are still error logs: > > [ 285.804534] fuse_background_request: req=0000000063502a93, op=26 > [ 285.805041] do_fuse_request_end: req=0000000063502a93, > from_timer=0, req->timer.func=1 > [ 290.967412] fuse_simple_request: req=000000003f362e4b, op=1 > [ 290.967480] do_fuse_request_end: req=000000003f362e4b, > from_timer=0, req->timer.func=1 > [ 290.967517] fuse_simple_request: req=000000003f362e4b, op=14 > [ 290.967585] do_fuse_request_end: req=000000003f362e4b, > from_timer=0, req->timer.func=1 > [ 290.967655] fuse_simple_request: req=000000003f362e4b, op=15 > [ 300.996023] fuse_request_timeout: req=000000003f362e4b > [ 300.996030] do_fuse_request_end: req=000000003f362e4b, > from_timer=1, req->timer.func=1 > [ 300.996066] fuse_simple_request: req=00000000b4182f02, op=15 > [ 300.996180] fuse_simple_request: req=000000003f362e4b, op=25 > [ 300.996185] ================================================================== > [ 300.996980] BUG: KFENCE: use-after-free write in enqueue_timer+0x24/0xb0 > > [ 300.997788] Use-after-free write at 0x0000000022312cb7 (in kfence-#156): > [ 300.998476] enqueue_timer+0x24/0xb0 > [ 300.998479] __mod_timer+0x23b/0x360 > [ 300.998481] add_timer+0x20/0x30 > [ 300.998483] fuse_simple_request+0x1bc/0x2f0 [fuse] > [ 300.998506] fuse_flush+0x1ac/0x1f0 [fuse] > [ 300.998511] filp_flush+0x39/0x90 > [ 300.998517] filp_close+0x15/0x30 > [ 300.998519] put_files_struct+0x77/0xe0 > [ 300.998522] exit_files+0x47/0x60 > [ 300.998524] do_exit+0x262/0x480 > [ 300.998528] do_group_exit+0x34/0x90 > [ 300.998531] get_signal+0x92f/0x980 > [ 300.998534] arch_do_signal_or_restart+0x2a/0x100 > [ 300.998537] syscall_exit_to_user_mode+0xe3/0x1a0 > [ 300.998541] do_syscall_64+0x71/0x170 > [ 300.998545] entry_SYSCALL_64_after_hwframe+0x76/0x7e > > [ 300.998759] kfence-#156: 0x00000000b4182f02-0x0000000084fc5c46, > size=200, cache=ip4-frags > > [ 300.998761] allocated by task 15064 on cpu 26 at 300.996061s: > [ 300.998766] fuse_request_alloc+0x21/0xb0 [fuse] > [ 300.998771] fuse_get_req+0xde/0x270 [fuse] > [ 300.998775] fuse_simple_request+0x33/0x2f0 [fuse] > [ 300.998779] fuse_do_readpage+0x15e/0x200 [fuse] > [ 300.998783] fuse_read_folio+0x29/0x60 [fuse] > [ 300.998787] filemap_read_folio+0x3b/0xe0 > [ 300.998791] filemap_update_page+0x236/0x2d0 > [ 300.998792] filemap_get_pages+0x225/0x390 > [ 300.998794] filemap_read+0xed/0x3a0 > [ 300.998796] generic_file_read_iter+0xb8/0x100 > [ 300.998798] fuse_file_read_iter+0xd8/0x150 [fuse] > [ 300.998804] vfs_read+0x25e/0x340 > [ 300.998806] ksys_read+0x67/0xf0 > [ 300.998808] __x64_sys_read+0x19/0x20 > [ 300.998810] x64_sys_call+0x1709/0x20b0 > [ 300.998813] do_syscall_64+0x65/0x170 > [ 300.998815] entry_SYSCALL_64_after_hwframe+0x76/0x7e > > [ 300.998817] freed by task 15064 on cpu 26 at 300.996084s: > [ 300.998822] fuse_put_request+0x89/0xf0 [fuse] > [ 300.998826] fuse_simple_request+0xe1/0x2f0 [fuse] > [ 300.998830] fuse_do_readpage+0x15e/0x200 [fuse] > [ 300.998835] fuse_read_folio+0x29/0x60 [fuse] > [ 300.998839] filemap_read_folio+0x3b/0xe0 > [ 300.998840] filemap_update_page+0x236/0x2d0 > [ 300.998842] filemap_get_pages+0x225/0x390 > [ 300.998844] filemap_read+0xed/0x3a0 > [ 300.998846] generic_file_read_iter+0xb8/0x100 > [ 300.998848] fuse_file_read_iter+0xd8/0x150 [fuse] > [ 300.998852] vfs_read+0x25e/0x340 > [ 300.998854] ksys_read+0x67/0xf0 > [ 300.998856] __x64_sys_read+0x19/0x20 > [ 300.998857] x64_sys_call+0x1709/0x20b0 > [ 300.998859] do_syscall_64+0x65/0x170 > [ 300.998860] entry_SYSCALL_64_after_hwframe+0x76/0x7e This is very interesting. These logs (and the ones above with the lxcfs server running concurrently) are showing that the read request was freed but not through the do_fuse_request_end path. It's weird that fuse_simple_request reached fuse_put_request without do_fuse_request_end having been called (which is the only place where FR_FINISHED gets set and wakes up the wait events in request_wait_answer). I'll take a deeper look tomorrow and try to make more sense of it. > > [ 300.999115] CPU: 26 PID: 15064 Comm: cat Kdump: loaded Not tainted 6.10.0+ #8 > [ 301.000803] ================================================================== > [ 301.001695] do_fuse_request_end: req=000000003f362e4b, > from_timer=0, req->timer.func=1 > [ 301.001723] fuse_background_request: req=000000003f362e4b, op=18 > [ 301.001767] do_fuse_request_end: req=000000003f362e4b, > from_timer=0, req->timer.func=1 > [ 311.235964] fuse_request_timeout: req=00000000b4182f02 > [ 311.235969] ------------[ cut here ]------------ > [ 311.235970] list_del corruption, ffff9a8072d3a000->next is > LIST_POISON1 (dead000000000100) > [ 311.235982] WARNING: CPU: 26 PID: 0 at lib/list_debug.c:56 > __list_del_entry_valid_or_report+0x8a/0xf0 > [ 311.236036] CPU: 26 PID: 0 Comm: swapper/26 Kdump: loaded Tainted: > G B 6.10.0+ #8 > [ 311.236040] RIP: 0010:__list_del_entry_valid_or_report+0x8a/0xf0 > [ 311.236043] Code: 31 c0 5d c3 cc cc cc cc 48 c7 c7 60 7a 5e b0 e8 > cc ea a4 ff 0f 0b 31 c0 5d c3 cc cc cc cc 48 c7 c7 88 7a 5e b0 e8 b6 > ea a4 ff <0f> 0b 31 c0 5d c3 cc cc cc cc 48 89 ca 48 c7 c7 c0 7a 5e b0 > e8 9d > [ 311.236045] RSP: 0018:ffffb6364056ce60 EFLAGS: 00010282 > [ 311.236047] RAX: 0000000000000000 RBX: ffff9a8072d3a0a0 RCX: 0000000000000027 > [ 311.236048] RDX: ffff9a807f4a0848 RSI: 0000000000000001 RDI: ffff9a807f4a0840 > [ 311.236049] RBP: ffffb6364056ce60 R08: 0000000000000000 R09: ffffb6364056cce0 > [ 311.236050] R10: ffffb6364056ccd8 R11: ffffffffb1017ee8 R12: ffff9a8072d3a000 > [ 311.236051] R13: ffff9a420d5af000 R14: 0000000100002800 R15: ffff9a420d5af054 > [ 311.236054] FS: 0000000000000000(0000) GS:ffff9a807f480000(0000) > knlGS:0000000000000000 > [ 311.236056] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 311.236057] CR2: 000000c000dc5000 CR3: 000000010cc38003 CR4: 0000000000370ef0 > [ 311.236058] Call Trace: > [ 311.236059] <IRQ> > [ 311.236061] ? show_regs+0x69/0x80 > [ 311.236065] ? __warn+0x88/0x130 > [ 311.236068] ? __list_del_entry_valid_or_report+0x8a/0xf0 > [ 311.236070] ? report_bug+0x18f/0x1a0 > [ 311.236074] ? handle_bug+0x40/0x70 > [ 311.236077] ? exc_invalid_op+0x19/0x70 > [ 311.236079] ? asm_exc_invalid_op+0x1b/0x20 > [ 311.236083] ? __list_del_entry_valid_or_report+0x8a/0xf0 > [ 311.236086] fuse_request_timeout+0x15c/0x1a0 [fuse] > [ 311.236094] ? __pfx_fuse_request_timeout+0x10/0x10 [fuse] > [ 311.236099] call_timer_fn+0x2c/0x130 > [ 311.236102] ? __pfx_fuse_request_timeout+0x10/0x10 [fuse] > [ 311.236106] __run_timers+0x2c2/0x360 > [ 311.236108] ? kvm_sched_clock_read+0x11/0x20 > [ 311.236110] ? sched_clock_noinstr+0x9/0x10 > [ 311.236111] ? sched_clock+0x10/0x30 > [ 311.236114] ? sched_clock_cpu+0x10/0x190 > [ 311.236116] run_timer_softirq+0x3a/0x60 > [ 311.236118] handle_softirqs+0x118/0x350 > [ 311.236121] irq_exit_rcu+0x60/0x80 > [ 311.236123] sysvec_apic_timer_interrupt+0x7f/0x90 > [ 311.236124] </IRQ> > [ 311.236125] <TASK> > [ 311.236126] asm_sysvec_apic_timer_interrupt+0x1b/0x20 > [ 311.236128] RIP: 0010:default_idle+0xb/0x20 > [ 311.236130] Code: 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 6e ff ff ff 90 > 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 eb 07 0f 00 2d b3 51 33 > 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 > 00 90 > [ 311.236131] RSP: 0018:ffffb6364018fe18 EFLAGS: 00000246 > [ 311.236133] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 0001c0582ca6ada0 > [ 311.236134] RDX: 0000000000000001 RSI: ffffffffb112e060 RDI: ffff9a807f4bc8e0 > [ 311.236135] RBP: ffffb6364018fe20 R08: 00000048770ca8ec R09: 0000000000000001 > [ 311.236135] R10: ffffffffb0e56080 R11: 0000000000000001 R12: 0000000000000001 > [ 311.236136] R13: ffffffffb112e060 R14: ffffffffb112e0e0 R15: 0000000000000001 > [ 311.236138] ? ct_kernel_exit.constprop.0+0x79/0x90 > [ 311.236140] ? arch_cpu_idle+0x9/0x10 > [ 311.236142] default_enter_idle+0x22/0x2f > [ 311.236144] cpuidle_enter_state+0x88/0x430 > [ 311.236146] cpuidle_enter+0x34/0x50 > [ 311.236150] call_cpuidle+0x22/0x50 > [ 311.236151] cpuidle_idle_call+0xd2/0x120 > [ 311.236154] do_idle+0x77/0xd0 > [ 311.236156] cpu_startup_entry+0x2c/0x30 > [ 311.236158] start_secondary+0x117/0x140 > [ 311.236160] common_startup_64+0x13e/0x141 > [ 311.236163] </TASK> > [ 311.236163] ---[ end trace 0000000000000000 ]--- > [ 311.236165] do_fuse_request_end: req=00000000b4182f02, > from_timer=1, req->timer.func=1 > [ 311.236166] ------------[ cut here ]------------ > [ 311.236167] refcount_t: underflow; use-after-free. > [ 311.236174] WARNING: CPU: 26 PID: 0 at lib/refcount.c:28 > refcount_warn_saturate+0xc2/0x110 > [ 311.236207] CPU: 26 PID: 0 Comm: swapper/26 Kdump: loaded Tainted: > G B W 6.10.0+ #8 > [ 311.236209] RIP: 0010:refcount_warn_saturate+0xc2/0x110 > [ 311.236211] Code: 01 e8 d2 72 a6 ff 0f 0b 5d c3 cc cc cc cc 80 3d > 33 d4 b1 01 00 75 81 48 c7 c7 30 69 5e b0 c6 05 23 d4 b1 01 01 e8 ae > 72 a6 ff <0f> 0b 5d c3 cc cc cc cc 80 3d 0d d4 b1 01 00 0f 85 59 ff ff > ff 48 > [ 311.236212] RSP: 0018:ffffb6364056cdf8 EFLAGS: 00010286 > [ 311.236213] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000027 > [ 311.236214] RDX: ffff9a807f4a0848 RSI: 0000000000000001 RDI: ffff9a807f4a0840 > [ 311.236215] RBP: ffffb6364056cdf8 R08: 0000000000000000 R09: ffffb6364056cc78 > [ 311.236216] R10: ffffb6364056cc70 R11: ffffffffb1017ee8 R12: ffff9a8072d3a000 > [ 311.236217] R13: ffff9a420d5af000 R14: ffff9a42426a6ec0 R15: ffff9a8072d3a010 > [ 311.236219] FS: 0000000000000000(0000) GS:ffff9a807f480000(0000) > knlGS:0000000000000000 > [ 311.236221] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 311.236222] CR2: 000000c000dc5000 CR3: 000000010cc38003 CR4: 0000000000370ef0 > [ 311.236223] Call Trace: > [ 311.236223] <IRQ> > [ 311.236224] ? show_regs+0x69/0x80 > [ 311.236233] ? __warn+0x88/0x130 > [ 311.236235] ? refcount_warn_saturate+0xc2/0x110 > [ 311.236236] ? report_bug+0x18f/0x1a0 > [ 311.236238] ? handle_bug+0x40/0x70 > [ 311.236240] ? exc_invalid_op+0x19/0x70 > [ 311.236242] ? asm_exc_invalid_op+0x1b/0x20 > [ 311.236244] ? refcount_warn_saturate+0xc2/0x110 > [ 311.236246] ? refcount_warn_saturate+0xc2/0x110 > [ 311.236247] fuse_put_request+0xc6/0xf0 [fuse] > [ 311.236253] do_fuse_request_end+0xcc/0x1e0 [fuse] > [ 311.236258] fuse_request_timeout+0xac/0x1a0 [fuse] > [ 311.236263] ? __pfx_fuse_request_timeout+0x10/0x10 [fuse] > [ 311.236267] call_timer_fn+0x2c/0x130 > [ 311.236269] ? __pfx_fuse_request_timeout+0x10/0x10 [fuse] > [ 311.236274] __run_timers+0x2c2/0x360 > [ 311.236275] ? kvm_sched_clock_read+0x11/0x20 > [ 311.236277] ? sched_clock_noinstr+0x9/0x10 > [ 311.236278] ? sched_clock+0x10/0x30 > [ 311.236280] ? sched_clock_cpu+0x10/0x190 > [ 311.236281] run_timer_softirq+0x3a/0x60 > [ 311.236283] handle_softirqs+0x118/0x350 > [ 311.236285] irq_exit_rcu+0x60/0x80 > [ 311.236286] sysvec_apic_timer_interrupt+0x7f/0x90 > [ 311.236288] </IRQ> > [ 311.236288] <TASK> > [ 311.236289] asm_sysvec_apic_timer_interrupt+0x1b/0x20 > [ 311.236291] RIP: 0010:default_idle+0xb/0x20 > [ 311.236293] Code: 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 6e ff ff ff 90 > 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 eb 07 0f 00 2d b3 51 33 > 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 > 00 90 > [ 311.236294] RSP: 0018:ffffb6364018fe18 EFLAGS: 00000246 > [ 311.236295] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 0001c0582ca6ada0 > [ 311.236296] RDX: 0000000000000001 RSI: ffffffffb112e060 RDI: ffff9a807f4bc8e0 > [ 311.236297] RBP: ffffb6364018fe20 R08: 00000048770ca8ec R09: 0000000000000001 > [ 311.236298] R10: ffffffffb0e56080 R11: 0000000000000001 R12: 0000000000000001 > [ 311.236299] R13: ffffffffb112e060 R14: ffffffffb112e0e0 R15: 0000000000000001 > [ 311.236300] ? ct_kernel_exit.constprop.0+0x79/0x90 > [ 311.236302] ? arch_cpu_idle+0x9/0x10 > [ 311.236304] default_enter_idle+0x22/0x2f > [ 311.236306] cpuidle_enter_state+0x88/0x430 > [ 311.236308] cpuidle_enter+0x34/0x50 > [ 311.236310] call_cpuidle+0x22/0x50 > [ 311.236311] cpuidle_idle_call+0xd2/0x120 > [ 311.236313] do_idle+0x77/0xd0 > [ 311.236315] cpu_startup_entry+0x2c/0x30 > [ 311.236317] start_secondary+0x117/0x140 > [ 311.236318] common_startup_64+0x13e/0x141 > [ 311.236320] </TASK> > [ 311.236321] ---[ end trace 0000000000000000 ]--- > > I wish I could provide you with a clear explanation of what happened > in my test environment, but I haven't had the time to delve into the > details yet. > > > -- > Regards > Yafang