On Thu, 2024-05-02 at 12:05 -0400, Benjamin Marzinski wrote: > On Thu, May 02, 2024 at 05:14:56PM +0200, Martin Wilck wrote: > > On Thu, 2024-04-25 at 19:35 -0400, Benjamin Marzinski wrote: > > > Currently multipath makes sure that dev_loss_tmo is at least as > > > long > > > as > > > the configured no path queueing time. The goal is to make sure > > > that > > > path > > > devices aren't removed while the multipath device is still > > > queueing > > > in > > > hopes that they will become usable again. > > > > > > This is racy. Multipathd may take longer to check the paths than > > > configured. If strict_timing isn't set, it will definitely take > > > longer. > > > To account for this, pad the minimum dev_loss_tmo value by five > > > seconds > > > (one default checker interval) plus one second per minute of no > > > path > > > queueing time. > > > > What's the rationale for the "one second per minute" part? > > It feels kind of arbitrary to me, and I don't find it obvious that > > we > > need larger padding for larger timeouts. > > The way the checker works, without strict_timing, we do all the > checking > work, and then we wait for a second. Obviously, the checking work > takes > time, so every nominal one second checker loop will in reality take > longer than a second. The larger you have configured (no_path_retry * > checkint), the further away the actual time when we disable queueing > will be from (no_path_retry * checkint) seconds. > > With strict_timing on, things work better. As long as the checking > work > doesn't take more than a second, you will stay fairly close to one > second per loop, but even still, it can be longer. nanosleep() will > sleep till at least the time specified. If multipathd is not running > as > a realtime process, it will usually sleep for longer. > > Also, if the checker work takes longer than a second, then > strict_timing > will make sure that the loop takes (close to) exactly 2 (or whatever > the > next number is) seconds. However, retry_count_tick() still only ticks > down by 1 for each loop. This means that the real time before > queueing > is disabled can diverge from (no_path_retry * checkint). And again, > the > longer you've configured it to wait, the more it diverges. > > I thought about passing ticks to retry_count_tick(), so that it would > correctly deal with these multi-tick loops, but this can cause it to > diverge from the actual number of retries attempted, in a way that > would > mean we disable queueing before the desired number of retries have > occurred, which is a bigger problem, IMHO. No matter how long the > checker work takes, you will only ever do one path check per loop. In > the worse case, say that the checker work (or anything else in > multipathd) gets stuck for a couple of minutes. You would only do one > path check, but hundreds of ticks would be passed to the > retry_count_tick(), meaning that you could disable queueing after > only > one retry. Now you could limit the number of ticks you passed to > retry_count_tick(), but even still, if the a path was 1 tick away > from > being checked, and the checker work took 3 seconds, you would still > pass > 3 ticks to retry_count_tick(), which would cause it to start to > diverge > from the actual time it takes for the path to do the configured > amount > of retries. > > We could improve retry_count_tick(), so that it actually tracks the > number of retries you did, but that still leaves the problems when > either strict_timing isn't on or multipathd isn't a realtime process. > It seemed easier to just say, the longer you have configured > multipathd > to wait to disable queueing, the more the time when it actually > disables > queueing can diverge from the configured time. > > The number I picked for the rate of divergence is just a guest, and > if > you don't like it, I'm fine with removing it. After all, the whole > point > of this patchset is to make sure that the worst thing that can happen > if > paths disappear while we are still queueing is that a multipath > device > isn't autoremoved. I was just asking for a rationale. This is fine. I think we can, and should, improve multipathd's timing. Unfortunately this has been on my agenda for a long time already. For 4/5, too: Reviewed-by: Martin Wilck <mwilck@xxxxxxxx>