On Thu, May 02, 2024 at 05:14:56PM +0200, Martin Wilck wrote:
> On Thu, 2024-04-25 at 19:35 -0400, Benjamin Marzinski wrote:
> > Currently multipath makes sure that dev_loss_tmo is at least as long as
> > the configured no path queueing time. The goal is to make sure that path
> > devices aren't removed while the multipath device is still queueing in
> > hopes that they will become usable again.
> >
> > This is racy. Multipathd may take longer to check the paths than
> > configured. If strict_timing isn't set, it will definitely take longer.
> > To account for this, pad the minimum dev_loss_tmo value by five seconds
> > (one default checker interval) plus one second per minute of no path
> > queueing time.
>
> What's the rationale for the "one second per minute" part?
> It feels kind of arbitrary to me, and I don't find it obvious that we
> need larger padding for larger timeouts.

The way the checker works without strict_timing, we do all the checking
work and then wait for a second. Obviously, the checking work takes
time, so every nominal one-second checker loop will in reality take
longer than a second. The larger you have configured
(no_path_retry * checkint), the further away the actual time when we
disable queueing will be from (no_path_retry * checkint) seconds.

With strict_timing on, things work better. As long as the checking work
doesn't take more than a second, you will stay fairly close to one
second per loop, but even then it can be longer. nanosleep() will sleep
until at least the time specified; if multipathd is not running as a
realtime process, it will usually sleep for longer. Also, if the checker
work takes longer than a second, then strict_timing will make sure that
the loop takes (close to) exactly 2 (or whatever the next whole number
is) seconds. However, retry_count_tick() still only ticks down by 1 for
each loop. This means that the real time before queueing is disabled
can diverge from (no_path_retry * checkint), and again, the longer
you've configured it to wait, the more it diverges.

I thought about passing ticks to retry_count_tick(), so that it would
correctly deal with these multi-tick loops, but that can cause it to
diverge from the actual number of retries attempted, in a way that
would mean we disable queueing before the desired number of retries
have occurred, which is a bigger problem, IMHO. No matter how long the
checker work takes, you will only ever do one path check per loop. In
the worst case, say the checker work (or anything else in multipathd)
gets stuck for a couple of minutes. You would only do one path check,
but hundreds of ticks would be passed to retry_count_tick(), meaning
that you could disable queueing after only one retry. You could limit
the number of ticks passed to retry_count_tick(), but even then, if a
path was 1 tick away from being checked and the checker work took 3
seconds, you would still pass 3 ticks to retry_count_tick(), which
would cause it to start to diverge from the actual time it takes for
the path to do the configured number of retries.

We could improve retry_count_tick() so that it actually tracks the
number of retries done, but that still leaves the problems when either
strict_timing isn't on or multipathd isn't a realtime process. It
seemed easier to just say: the longer you have configured multipathd to
wait before disabling queueing, the more the time when it actually
disables queueing can diverge from the configured time.
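To make the arithmetic concrete, the padding described in the patch
description works out to roughly the following (just a sketch with
made-up names, not the actual patch code):

    /* Sketch only, not the real multipath-tools code. The queueing
     * time is no_path_retry * checkint; the patch pads the minimum
     * dev_loss_tmo by five seconds (one default checker interval)
     * plus one second per minute of queueing time.
     */
    static unsigned int min_dev_loss_tmo(unsigned int no_path_retry,
                                         unsigned int checkint)
    {
            unsigned int queue_time = no_path_retry * checkint;

            return queue_time + 5 + queue_time / 60;
    }

So for, say, no_path_retry = 60 and checkint = 5 (300 seconds of
queueing), the minimum dev_loss_tmo would be 300 + 5 + 5 = 310 seconds.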
The number I picked for the rate of divergence is just a guess, and if
you don't like it, I'm fine with removing it. After all, the whole
point of this patchset is to make sure that the worst thing that can
happen if paths disappear while we are still queueing is that a
multipath device isn't autoremoved.

-Ben

>
> Regards
> Martin