Eric Dumazet wrote:
[PATCH] net: fix a lockdep splat
We have for each socket :
One spinlock (sk_slock.slock)
One rwlock (sk_callback_lock)
Possible scenarios are :
(A) (this is used in net/sunrpc/xprtsock.c)
read_lock(&sk->sk_callback_lock) (without blocking BH)
<BH>
spin_lock(&sk->sk_slock.slock);
...
read_lock(&sk->sk_callback_lock);
...
(B)
write_lock_bh(&sk->sk_callback_lock)
stuff
write_unlock_bh(&sk->sk_callback_lock)
(C)
spin_lock_bh(&sk->sk_slock)
...
write_lock_bh(&sk->sk_callback_lock)
stuff
write_unlock_bh(&sk->sk_callback_lock)
spin_unlock_bh(&sk->sk_slock)
This (C) case conflicts with (A) :
CPU1 [A] CPU2 [C]
read_lock(callback_lock)
<BH> spin_lock_bh(slock)
<wait to spin_lock(slock)>
<wait to write_lock_bh(callback_lock)>
We have one problematic (C) use case in inet_csk_listen_stop() :
local_bh_disable();
bh_lock_sock(child); // spin_lock_bh(&sk->sk_slock)
WARN_ON(sock_owned_by_user(child));
...
sock_orphan(child); // write_lock_bh(&sk->sk_callback_lock)
lockdep is not happy with this, as reported by Tetsuo Handa
This patch makes sure inet_csk_listen_stop() uses following lock order :
write_lock_bh(&sk->sk_callback_lock)
spin_lock(&sk->sk_slock)
...
spin_unlock(&sk->sk_slock)
write_unlock_bh(&sk->sk_callback_lock)
IMHO this order conflicts with (A) too (but I'm not sure lockdep
tracks that):
CPU1 [A] CPU2 [C-reversed]
... write_lock_bh(callback_lock)
<BH>
spin_lock(slock)
<wait to spin_lock(slock)>
<wait to read_lock(callback_lock)>
My proposal is to BH protect read_lock(sk_callback_lock) everywhere (it's
done by netfilter in a few places already).
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html