On Thu Apr 27 2023, Jan Kiszka wrote:
> On 26.04.23 23:29, Thomas Gleixner wrote:
>> On Wed, Apr 26 2023 at 12:03, Zdenek Bouska wrote:
>>> following patch is my current approach for fixing this issue. I introduced
>>> big_cpu_relax(), which uses Will's implementation [1] on ARM64 without
>>> LSE atomics and original cpu_relax() on any other CPU.
>>
>> Why is this interrupt handling specific? Just because it's the place
>> where you observed it?
>>
>> That's a general issue for any code which uses atomics for forward
>> progress. LL/SC simply does not guarantee that.
>>
>> So if that helps, then this needs to be addressed globally and not with
>> some crude hack in the interrupt handling code.
>
> My impression is that the retry loop of irq_finalize_oneshot is
> particularly susceptible to that issue due to the high acquire/relax
> pressure and inter-dependency between holder and waiter it generates -
> which does not mean it cannot occur in other places.
>
> Are we aware of other concrete cases where it bites? Even with just
> "normal" contended spin_lock usage?

Well, some years ago I observed a similar problem with ARM64
spinlocks, cpu_relax() and retry loops (in the futex code). It also
generated latency spikes of up to 2-3ms. Back then, it was easily
reproducible using stress-ng --ptrace 4.

Thanks,
Kurt