On 18/07/21 14:42, Hillf Danton wrote:
It's caused by the missing wakeup, i.e. eventfd_signal not really
signaling anything.
Can you please point me to the waiters in the mainline?
It's irqfd_wakeup.
There are two calls to write_seqcount_begin() in x/virt/kvm/eventfd.c; the
one in kvm_irqfd_deassign() is surrounded by spin_lock_irq(&kvm->irqfds.lock),
which also protects irqfd_update().
What isn't clear is whether there is zero risk that either case can be
preempted by a seqcount reader. If it can, the result may be the livelock
described in x/Documentation/locking/seqlock.rst.
Since the introduction of seqcount_spinlock_t, the writers automatically
disable preemption. This is definitely the right thing in this case
where the seqcount writers are small enough, and the readers are hot
enough, that using a local lock would be too heavyweight.
Without that, the livelock would be possible, though very unlikely. In
practice seqcount updates should only happen while the producer is
quiescent; and also the seqcount readers and writers will often be
pinned to separate CPUs.
Paolo
+A sequence counter write side critical section must never be preempted
+or interrupted by read side sections. Otherwise the reader will spin for
+the entire scheduler tick due to the odd sequence count value and the
+interrupted writer. If that reader belongs to a real-time scheduling
+class, it can spin forever and the kernel will livelock.
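
For illustration only, here is a minimal userspace sketch of the failure mode
the hunk above describes. It is not kernel code: every demo_* name is
hypothetical, C11 atomics stand in for the kernel's seqcount_t, and the reader
takes an artificial spin limit (an assumption of this sketch) so that the
"spins forever" case shows up as a false return instead of a hang.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a seqcount plus the data it protects. */
struct demo_data {
	atomic_uint sequence;	/* odd while a write section is open */
	atomic_int a;
	atomic_int b;
};

static void demo_init(struct demo_data *d)
{
	atomic_init(&d->sequence, 0);
	atomic_init(&d->a, 0);
	atomic_init(&d->b, 0);
}

/* Open the write section: the count becomes odd. */
static void demo_write_begin(struct demo_data *d)
{
	atomic_fetch_add(&d->sequence, 1);
}

/* Close the write section: the count becomes even again. */
static void demo_write_end(struct demo_data *d)
{
	atomic_fetch_add(&d->sequence, 1);
}

/*
 * Bounded reader. A real seqcount reader has no limit: it keeps
 * retrying for as long as the count is odd or changes under it.
 * Returns true once a consistent snapshot of (a, b) was taken.
 */
static bool demo_read_bounded(struct demo_data *d, unsigned int max_spins,
			      int *a, int *b)
{
	for (unsigned int i = 0; i < max_spins; i++) {
		unsigned int start = atomic_load(&d->sequence);

		if (start & 1)
			continue;	/* writer in progress, spin */

		int ta = atomic_load(&d->a);
		int tb = atomic_load(&d->b);

		if (atomic_load(&d->sequence) == start) {
			*a = ta;
			*b = tb;
			return true;
		}
	}
	return false;	/* gave up: the count never settled */
}
```

If a writer calls demo_write_begin() and is then preempted by a reader on the
same CPU, the reader sees an odd count on every iteration and cannot make
progress until the writer runs again. That is why the write side has to run
with preemption disabled, or otherwise be serialized against its readers.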