Hi everyone, On Thu, Nov 28, 2024 at 8:55 PM Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx> wrote: > > Cc-ing Tomasz > > On (24/11/28 11:23), Bernd Schubert wrote: > > > Thanks for the pointers again, Bernd. > > > > > >> Miklos had asked for to abort the connection in v4 > > >> https://lore.kernel.org/all/CAJfpegsiRNnJx7OAoH58XRq3zujrcXx94S2JACFdgJJ_b8FdHw@xxxxxxxxxxxxxx/raw > > > > > > OK, sounds reasonable. I'll try to give the series some testing in the > > > coming days. > > > > > > // I still would probably prefer "seconds" timeout granularity. > > > // Unless this also has been discussed already and Bernd has a link ;) > > > > > > The issue is that is currently iterating through 256 hash lists + > > pending + bg. > > > > https://lore.kernel.org/all/CAJnrk1b7bfAWWq_pFP=4XH3ddc_9GtAM2mE7EgWnx2Od+UUUjQ@xxxxxxxxxxxxxx/raw > > Oh, I see. > > > Personally I would prefer a second list to avoid the check spike and latency > > https://lore.kernel.org/linux-fsdevel/9ba4eaf4-b9f0-483f-90e5-9512aded419e@xxxxxxxxxxx/raw > > That's good to know. I like the idea of less CPU usage in general, > our devices a battery powered so everything counts, to some extent. > > > What is your opinion about that? I guess android and chromium have an > > interest low latencies and avoiding cpu spikes? > > Good question. > > Can't speak for android, in chromeos we probably will keep it at 1 minute, > but this is because our DEFAULT_HUNG_TASK_TIMEOUT is larger than that (we > use default value of 120 sec). There are setups that might use lower > values, or even re-define default value, e.g.: > > arch/arc/configs/axs101_defconfig:CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=10 > arch/arc/configs/axs103_defconfig:CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=10 > arch/arc/configs/axs103_smp_defconfig:CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=10 > arch/arc/configs/hsdk_defconfig:CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=10 > arch/arc/configs/vdk_hs38_defconfig:CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=10 > arch/arc/configs/vdk_hs38_smp_defconfig:CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=10 > arch/powerpc/configs/mvme5100_defconfig:CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=20 > > In those cases 1 minute fuse timeout will overshot HUNG_TASK_TIMEOUT > and then the question is whether HUNG_TASK_PANIC is set. > > On the other hand, setups that set much lower timeout than > DEFAULT_HUNG_TASK_TIMEOUT=120 will have extra CPU activities regardless, > just because watchdogs will run more often. > > Tomasz, any opinions? First of all, thanks everyone for looking into this. How about keeping a list of requests in the FIFO order (in other words: first entry is the first to timeout) and whenever the first entry is being removed from the list (aka the request actually completes), re-arming the timer to the timeout of the next request in the list? This way we don't really have any timer firing unless there is really a request that timed out. (In fact, we could optimize it even further by opportunistically scheduling a timer slightly later and opportunistically handling timed out requests when other requests are being completed, but this would be optimizing for the slow path, so probably an overkill.) As for the length of the request timeout vs the hung task watchdog timeout, my opinion is that we should make sure that the hung task watchdog doesn't hit in any case, simply because a misbehaving userspace process must not be able to panic the kernel. In the blk-core, the blk_io_schedule() function [1] uses sysctl_hung_task_timeout_secs to determine the maximum length of a single uninterruptible sleep. I suppose we could use the same calculation to obtain our timeout number. What does everyone think? [1] https://elixir.bootlin.com/linux/v6.12.1/source/block/blk-core.c#L1232 Best regards, Tomasz