Hi Andi, On Tuesday, 17 December 2024 19:00:40 CET Janusz Krzysztofik wrote: > Hi Andi, > > On Tuesday, 17 December 2024 18:12:08 CET Andi Shyti wrote: > > Hi Janusz, > > > > ... > > > > > > > + > > > > > cond_resched(); > > > > > > > > > > - if (intel_gt_wait_for_idle(gt, HZ * 3) == -ETIME) { > > > > > + if (intel_gt_wait_for_idle(gt, HZ * timeout_ms / 500) == -ETIME) { > > > > > > > > where is this 500 coming from? > > > > > > / 1000 would convert it to seconds as needed, and / 500 used instead was > > > supposed to to mean that we are willing to wait for preempt_timeout_ms * > 2. > > > Sorry for that shortcut. Would you like me to provide a clarifying comment, > > > or maybe better use explicit 2 * preempt_timeout / 1000 ? > > > > It was clear that you were doubling it, but what's more > > interesting to know (perhaps in a comment) is why you are > > choosing to use the double of the timeout_ms instead of other > > values. > > > > Makes sense? > > Yes, good question. > > Is it possible for more than one bb to hang? If yes then should we wait > longer than the longest preemption timeout? Before I assumed that maybe we > should, just in case, but now, having that revisited and reconsidered, I tend > to agree that the longest preempt timeout, perhaps with a small margin (let's > say +100ms) should be enough to recover from a single failing test case. Let > me verify if that works for the linked case. I've done some testing and got a confirmation that the issue I'm trying to address in the first place requires a timeout almost twice as long as the longest preemption timeout. I propose the following correction: - if (intel_gt_wait_for_idle(gt, HZ * 3) == -ETIME) { + /* 2 x longest preempt timeout, experimentally determined */ + if (intel_gt_wait_for_idle(gt, 2 * timeout_ms * HZ / 1000) == -ETIME) { Thanks, Janusz > > Thanks, > Janusz > > > > > Thanks, > > Andi > > > > > > >