On (24/12/02 11:29), Joanne Koong wrote:
> > >> In those cases 1 minute fuse timeout will overshot HUNG_TASK_TIMEOUT
> > >> and then the question is whether HUNG_TASK_PANIC is set.
> > >>
> > >> On the other hand, setups that set much lower timeout than
> > >> DEFAULT_HUNG_TASK_TIMEOUT=120 will have extra CPU activities regardless,
> > >> just because watchdogs will run more often.
> > >>
> > >> Tomasz, any opinions?
> > >
> > > First of all, thanks everyone for looking into this.
>
> Hi Sergey and Tomasz,
>
> Sorry for the late reply - I was out the last couple of days. Thanks
> Bernd for weighing in and answering the questions!
>
> > >
> > > How about keeping a list of requests in the FIFO order (in other
> > > words: first entry is the first to timeout) and whenever the first
> > > entry is being removed from the list (aka the request actually
> > > completes), re-arming the timer to the timeout of the next request in
> > > the list? This way we don't really have any timer firing unless there
> > > is really a request that timed out.
>
> I think the issue with this is that we likely would end up wasting
> more cpu cycles. For a busy FUSE server, there could be hundreds
> (thousands?) of requests that happen within the span of
> FUSE_TIMEOUT_TIMER_FREQ seconds.

So, a silly question - can we not do that maybe?

What I'm thinking about is: what if, instead of implementing a
fuse-watchdog and tracking jiffies per request, we switch to
timeout-aware operations and use what's already in the kernel? E.g.
instead of wait_event() we'd use wait_event_timeout() and configure
the timeout per connection (also bringing the current
hung-task-watchdog timeout value into the equation), using
MAX_SCHEDULE_TIMEOUT as the default (similarly to what the core kernel
does). The first req that times out kills its siblings and the
connection.
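
To make the idea a bit more concrete, here is a rough userspace
analogy, not kernel code and not a proposed patch. It assumes POSIX
threads, with pthread_cond_timedwait() standing in for
wait_event_timeout(). The names struct conn, struct req, conn_init(),
req_complete() and req_wait() are all made up for illustration and do
not correspond to anything in fs/fuse. A timeout_ms of 0 plays the role
of the MAX_SCHEDULE_TIMEOUT default, i.e. wait forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a connection with a per-connection timeout. */
struct conn {
	pthread_mutex_t lock;
	pthread_cond_t  waitq;
	long            timeout_ms;	/* <= 0: wait forever */
	bool            aborted;
};

/* Hypothetical stand-in for a single in-flight request. */
struct req {
	struct conn *conn;
	bool         finished;
};

static void conn_init(struct conn *c, long timeout_ms)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->waitq, NULL);
	c->timeout_ms = timeout_ms;
	c->aborted = false;
}

/* Server side: mark the request as answered and wake up the waiters. */
static void req_complete(struct req *r)
{
	pthread_mutex_lock(&r->conn->lock);
	r->finished = true;
	pthread_cond_broadcast(&r->conn->waitq);
	pthread_mutex_unlock(&r->conn->lock);
}

/*
 * Client side: wait for the reply, bounded by the connection timeout.
 * Returns 0 on completion, -ETIMEDOUT if this request timed out (and
 * thereby aborted the connection), -ECONNABORTED if a sibling did.
 */
static int req_wait(struct req *r)
{
	struct conn *c = r->conn;
	struct timespec deadline = { 0 };
	bool timed_out = false;
	int ret;

	assert(c != NULL);
	if (c->timeout_ms > 0) {
		/* Absolute deadline, computed once per request. */
		clock_gettime(CLOCK_REALTIME, &deadline);
		deadline.tv_sec += c->timeout_ms / 1000;
		deadline.tv_nsec += (c->timeout_ms % 1000) * 1000000L;
		if (deadline.tv_nsec >= 1000000000L) {
			deadline.tv_sec++;
			deadline.tv_nsec -= 1000000000L;
		}
	}

	pthread_mutex_lock(&c->lock);
	while (!r->finished && !c->aborted) {
		int rc = c->timeout_ms > 0
			? pthread_cond_timedwait(&c->waitq, &c->lock, &deadline)
			: pthread_cond_wait(&c->waitq, &c->lock);

		if (rc == ETIMEDOUT && !r->finished) {
			/* First request to time out kills the connection. */
			timed_out = true;
			c->aborted = true;
			pthread_cond_broadcast(&c->waitq);
		}
	}
	if (r->finished)
		ret = 0;
	else
		ret = timed_out ? -ETIMEDOUT : -ECONNABORTED;
	pthread_mutex_unlock(&c->lock);
	return ret;
}
```

The point of the sketch is that there is no periodic timer and no
per-request list to scan: each waiter carries its own deadline, and
nothing runs unless a wait actually expires. Whether this maps cleanly
onto the fuse request paths (background requests in particular) is a
separate question that the sketch does not try to answer.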