On Tue, Aug 6, 2024 at 8:43 AM Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote: > > > > On 8/6/24 00:10, Joanne Koong wrote: > > On Mon, Aug 5, 2024 at 6:26 AM Bernd Schubert > > <bernd.schubert@xxxxxxxxxxx> wrote: > >>>> @@ -1280,6 +1389,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, > >>>> if (args->opcode == FUSE_SETXATTR) > >>>> req->out.h.error = -E2BIG; > >>>> fuse_request_end(req); > >>>> + fuse_put_request(req); > >>>> goto restart; > >>> > >>> While rereading through fuse_dev_do_read, I just realized we also need > >>> to handle the race condition for the error edge cases (here and in the > >>> "goto out_end;"), since the timeout handler could have finished > >>> executing by the time we hit the error edge case. We need to > >>> test_and_set_bit(FR_FINISHING) so that either the timeout_handler or > >>> dev_do_read cleans up the request, but not both. I'll fix this for v3. > >> > >> I know it would change semantics a bit, but wouldn't it be much easier / > >> less racy if fuse_dev_do_read() would delete the timer when it takes a > >> request from fiq->pending and add it back in (with new timeouts) before > >> it returns the request? > >> > > > > Ooo I really like this idea! I'm worried though that this might allow > > potential scenarios where the fuse_dev_do_read gets descheduled after > > disarming the timer and a non-trivial amount of time elapses before it > > gets scheduled back (eg on a system where the CPU is starved), in > > which case the fuse req_timeout value will be (somewhat of) a lie. If > > you and others think this is likely fine though, then I'll incorporate > > this into v3 which will make this logic a lot simpler :) > > > > In my opinion we only need to worry about fuse server getting stuck. I > think we would have a grave issue if fuse_dev_do_read() gets descheduled > for a long time - the timer might not run either in that case. Main > issue I see with removing/re-adding the timer - it might double the > timeout in worst case. In my personal opinion acceptable as it reduces > code complexity. > Awesome, thanks for this suggestion Bernd! I'll make this change for v3, this will get rid of having to handle all the possible races between dev_do_read and the timeout handler. I'm planning to rearm the timer with its original req->timer.expires (which was set to "jiffies + fc->req_timeout" at the time the timer was started the first time), so I think this will retain the original timeout and won't add any extra time to it. And according to the timer docs, "if @timer->expires is already in the past @timer will be queued to expire at the next timer tick". Thanks, Joanne > > Thanks > Bernd