On 8/6/24 20:37, Joanne Koong wrote: > On Tue, Aug 6, 2024 at 11:26 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote: >> >> On Tue, Aug 6, 2024 at 10:11 AM Bernd Schubert >> <bernd.schubert@xxxxxxxxxxx> wrote: >>> >>> On 8/6/24 18:23, Joanne Koong wrote: >>> >>>>> >>>>> This is very interesting. These logs (and the ones above with the >>>>> lxcfs server running concurrently) are showing that the read request >>>>> was freed but not through the do_fuse_request_end path. It's weird >>>>> that fuse_simple_request reached fuse_put_request without >>>>> do_fuse_request_end having been called (which is the only place where >>>>> FR_FINISHED gets set and wakes up the wait events in >>>>> request_wait_answer). >>>>> >>>>> I'll take a deeper look tomorrow and try to make more sense of it. >>>> >>>> Finally realized what's happening! >>>> When we kill the cat program, if the request hasn't been sent out to >>>> userspace yet when the fatal signal interrupts the >>>> wait_event_interruptible and wait_event_killable in >>>> request_wait_answer(), this will clean up the request manually (not >>>> through the fuse_request_end() path), which doesn't delete the timer. >>>> >>>> I'll fix this for v3. >>>> >>>> Thank you for surfacing this and it would be much appreciated if you >>>> could test out v3 when it's submitted to make sure. >>> >>> It is still just a suggestion, but if the timer would have its own ref, >>> any oversight of another fuse_put_request wouldn't be fatal. >>> >> >> Thanks for the suggestion. My main concerns are whether it's worth the >> extra (minimal?) performance penalty for something that's not strictly >> needed and whether it ends up adding more of a burden to keep track of >> the timer ref (eg in error handling like the case above where the >> fatal signal is for a request that hasn't been sent to userspace yet, >> having to account for the extra timer ref if the timer callback didn't >> execute). I don't think adding a timer ref would prevent fatal crashes >> on fuse_put_request oversights (unless we also mess up not releasing a >> corresponding timer ref :)) > > I amend this last sentence - I just realized your point about the > fatal crashes is that if we accidentally miss a fuse_put_request > altogether, we'd also miss releasing the timer ref in that path, which > means the timer callback would be the one releasing the last ref. > Yeah, that is what I meant. It is a bit defensive coding, but I don't have a strong opinion about it. Thanks, Bernd