Re: Report in unix_stream_read_generic()

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Wed, 16 Feb 2022 04:33:41 +0000

On Wed, Feb 16, 2022 at 01:17:03PM +0900, Byungchul Park wrote:
> [    7.013330] ===================================================
> [    7.013331] DEPT: Circular dependency has been detected.
> [    7.013332] 5.17.0-rc1-00014-gcf3441bb2012 #2 Tainted: G        W        
> [    7.013333] ---------------------------------------------------
> [    7.013334] summary
> [    7.013334] ---------------------------------------------------
> [    7.013335] *** DEADLOCK ***
> [    7.013335] 
> [    7.013335] context A
> [    7.013336]     [S] (unknown)(&(&ei->socket.wq.wait)->dmap:0)
> [    7.013337]     [W] __mutex_lock_common(&u->iolock:0)
> [    7.013338]     [E] event(&(&ei->socket.wq.wait)->dmap:0)
> [    7.013340] 
> [    7.013340] context B
> [    7.013341]     [S] __raw_spin_lock(&u->lock:0)
> [    7.013342]     [W] wait(&(&ei->socket.wq.wait)->dmap:0)
> [    7.013343]     [E] spin_unlock(&u->lock:0)

This seems unlikely to be real.  We're surely not actually waiting
while holding a spinlock; existing debug checks would catch it.

> [    7.013407] ---------------------------------------------------
> [    7.013407] context B's detail
> [    7.013408] ---------------------------------------------------
> [    7.013408] context B
> [    7.013409]     [S] __raw_spin_lock(&u->lock:0)
> [    7.013410]     [W] wait(&(&ei->socket.wq.wait)->dmap:0)
> [    7.013411]     [E] spin_unlock(&u->lock:0)
> [    7.013412] 
> [    7.013412] [S] __raw_spin_lock(&u->lock:0):
> [    7.013413] [<ffffffff81aa451f>] unix_stream_read_generic+0x6bf/0xb60
> [    7.013416] stacktrace:
> [    7.013416]       _raw_spin_lock+0x6e/0x90
> [    7.013418]       unix_stream_read_generic+0x6bf/0xb60

It would be helpful if you'd run this through scripts/decode_stacktrace.sh
so we could see line numbers instead of hex offsets (which arene't much
use without the binary kernel).

> [    7.013420]       unix_stream_recvmsg+0x40/0x50
> [    7.013422]       sock_read_iter+0x85/0xd0
> [    7.013424]       new_sync_read+0x162/0x180
> [    7.013426]       vfs_read+0xf3/0x190
> [    7.013428]       ksys_read+0xa6/0xc0
> [    7.013429]       do_syscall_64+0x3a/0x90
> [    7.013431]       entry_SYSCALL_64_after_hwframe+0x44/0xae
> [    7.013433] 
> [    7.013434] [W] wait(&(&ei->socket.wq.wait)->dmap:0):
> [    7.013434] [<ffffffff810bb017>] prepare_to_wait+0x47/0xd0

... this may be the source of confusion.  Just because we prepare to
wait doesn't mean we end up actually waiting.  For example, look at
unix_wait_for_peer():

        prepare_to_wait_exclusive(&u->peer_wait, &wait, TASK_INTERRUPTIBLE);

        sched = !sock_flag(other, SOCK_DEAD) &&
                !(other->sk_shutdown & RCV_SHUTDOWN) &&
                unix_recvq_full(other);

        unix_state_unlock(other);

        if (sched)
                timeo = schedule_timeout(timeo);

        finish_wait(&u->peer_wait, &wait);

We *prepare* to wait, *then* drop the lock, then actually schedule.