Excerpts from Thomas Gleixner's message of April 23, 2022 1:53 am: > On Wed, Apr 13 2022 at 15:11, Nicholas Piggin wrote: >> So we traced the problem down to possibly a misunderstanding between >> decrementer clock event device and core code. >> >> The decrementer is only oneshot*ish*. It actually needs to either be >> reprogrammed or shut down otherwise it just continues to cause >> interrupts. > > I always thought that PPC had sane timers. That's really disillusioning. My comment was probably a bit misleading explanation of the whole situation. This weirdness is actually in software in the powerpc clock event driver due to a recent change I made assuming the clock event goes to oneshot-stopped. The hardware is relatively sane I think, global synchronized constant rate high frequency clock distributed to the CPUs so reads don't go off-core. And per-CPU "decrementer" event interrupt at the same frequency as the clock -- program it to a +ve value and it decrements until zero then creates basically a level triggered interrupt. Before my change, the decrementer interrupt would always clear the interrupt at entry. The event_handler usually programs another timer in so I tried to avoid that first clear counting on the oneshot_stopped callback to clear the interrupt if there was no other timer. >> Before commit 35de589cb879, it was sort of two-shot. The initial >> interrupt at the programmed time would set its internal next_tb variable >> to ~0 and call the ->event_handler(). If that did not set_next_event or >> stop the timer, the interrupt will fire again immediately, notice >> next_tb is ~0, and only then stop the decrementer interrupt. >> >> So that was already kind of ugly, this patch just turned it into a hang. >> >> The problem happens when the tick is stopped with an event still >> pending, then tick_nohz_handler() is called, but it bails out because >> tick_stopped == 1 so the device never gets programmed again, and so it >> keeps firing. >> >> How to fix it? Before commit a7cba02deced, powerpc's decrementer was >> really oneshot, but we would like to avoid doing that because it requires >> additional programming of the hardware on each timer interrupt. We have >> the ONESHOT_STOPPED state which seems to be just about what we want. >> >> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason >> we don't stop it here? This patch seems to fix the hang (not heavily >> tested though). > > This was definitely overlooked, but it's arguable it is is not required > for real oneshot clockevent devices. This should only handle the case > where the interrupt was already pending. > > The ONESHOT_STOPPED state was introduced to handle the case where the > last timer gets canceled, so the already programmed event does not fire. > > It was not necessarily meant to "fix" clockevent devices which are > pretending to be ONESHOT, but keep firing over and over. > > That, said. I'm fine with the change along with a big fat comment why > this is required. Thanks for taking a look and confirming. I just sent a patch with a comment and what looks like another missed case. Hopefully it's okay. Thanks, Nick