On 1/11/23 09:49, David Woodhouse wrote:
[13890.148203] ======================================================
[13890.148205] WARNING: possible circular locking dependency detected
[13890.148207] 6.1.0-rc4+ #1024 Tainted: G I E
[13890.148209] ------------------------------------------------------
[13890.148210] xen_shinfo_test/13326 is trying to acquire lock:
[13890.148212] ffff888107d493b0 (&gpc->lock){....}-{2:2}, at: kvm_xen_update_runstate_guest+0xf2/0x4e0 [kvm]
[13890.148285]
but task is already holding lock:
[13890.148287] ffff88887f671718 (&rq->__lock){-.-.}-{2:2}, at: __schedule+0x84/0x7c0
[13890.148295]
which lock already depends on the new lock.
...
[13890.148997] Chain exists of:
&gpc->lock --> &p->pi_lock --> &rq->__lock
I believe that's because of the priority inheritance check which
happens when there's *contention* for gpc->lock.
But in the majority of cases, anything taking a write lock on that (and
thus causing contention) is going to be invalidating the cache *anyway*
and causing us to abort. Or revalidating it, I suppose. But either way
we're racing with seeing an invalid state.
In which case we much as well just make it a trylock. Doing so *only*
for the scheduling-out atomic code path looks a bit like this... is
there a prettier way?
I'll add a prettier way (or at least encapsulate this one) when I send
the patches to introduce lock+check functions for gpc.
Paolo