On Thu, Mar 05, 2020 at 04:06:40PM -0500, Waiman Long wrote:
> A lockdep circular locking dependency report was seen when running a
> keyutils test:
>
> [12537.027242] ======================================================
> [12537.059309] WARNING: possible circular locking dependency detected
> [12537.088148] 4.18.0-147.7.1.el8_1.x86_64+debug #1 Tainted: G OE --------- - -
> [12537.125253] ------------------------------------------------------
> [12537.153189] keyctl/25598 is trying to acquire lock:
> [12537.175087] 000000007c39f96c (&mm->mmap_sem){++++}, at: __might_fault+0xc4/0x1b0
> [12537.208365]
> [12537.208365] but task is already holding lock:
> [12537.234507] 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220
> [12537.270476]
> [12537.270476] which lock already depends on the new lock.
> [12537.270476]
> [12537.307209]
> [12537.307209] the existing dependency chain (in reverse order) is:
> [12537.340754]
> [12537.340754] -> #3 (&type->lock_class){++++}:
> [12537.367434]        down_write+0x4d/0x110
> [12537.385202]        __key_link_begin+0x87/0x280
> [12537.405232]        request_key_and_link+0x483/0xf70
> [12537.427221]        request_key+0x3c/0x80
> [12537.444839]        dns_query+0x1db/0x5a5 [dns_resolver]
> [12537.468445]        dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs]
> [12537.496731]        cifs_reconnect+0xe04/0x2500 [cifs]
> [12537.519418]        cifs_readv_from_socket+0x461/0x690 [cifs]
> [12537.546263]        cifs_read_from_socket+0xa0/0xe0 [cifs]
> [12537.573551]        cifs_demultiplex_thread+0x311/0x2db0 [cifs]
> [12537.601045]        kthread+0x30c/0x3d0
> [12537.617906]        ret_from_fork+0x3a/0x50
> [12537.636225]
> [12537.636225] -> #2 (root_key_user.cons_lock){+.+.}:
> [12537.664525]        __mutex_lock+0x105/0x11f0
> [12537.683734]        request_key_and_link+0x35a/0xf70
> [12537.705640]        request_key+0x3c/0x80
> [12537.723304]        dns_query+0x1db/0x5a5 [dns_resolver]
> [12537.746773]        dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs]
> [12537.775607]        cifs_reconnect+0xe04/0x2500 [cifs]
> [12537.798322]        cifs_readv_from_socket+0x461/0x690 [cifs]
> [12537.823369]        cifs_read_from_socket+0xa0/0xe0 [cifs]
> [12537.847262]        cifs_demultiplex_thread+0x311/0x2db0 [cifs]
> [12537.873477]        kthread+0x30c/0x3d0
> [12537.890281]        ret_from_fork+0x3a/0x50
> [12537.908649]
> [12537.908649] -> #1 (&tcp_ses->srv_mutex){+.+.}:
> [12537.935225]        __mutex_lock+0x105/0x11f0
> [12537.954450]        cifs_call_async+0x102/0x7f0 [cifs]
> [12537.977250]        smb2_async_readv+0x6c3/0xc90 [cifs]
> [12538.000659]        cifs_readpages+0x120a/0x1e50 [cifs]
> [12538.023920]        read_pages+0xf5/0x560
> [12538.041583]        __do_page_cache_readahead+0x41d/0x4b0
> [12538.067047]        ondemand_readahead+0x44c/0xc10
> [12538.092069]        filemap_fault+0xec1/0x1830
> [12538.111637]        __do_fault+0x82/0x260
> [12538.129216]        do_fault+0x419/0xfb0
> [12538.146390]        __handle_mm_fault+0x862/0xdf0
> [12538.167408]        handle_mm_fault+0x154/0x550
> [12538.187401]        __do_page_fault+0x42f/0xa60
> [12538.207395]        do_page_fault+0x38/0x5e0
> [12538.225777]        page_fault+0x1e/0x30
> [12538.243010]
> [12538.243010] -> #0 (&mm->mmap_sem){++++}:
> [12538.267875]        lock_acquire+0x14c/0x420
> [12538.286848]        __might_fault+0x119/0x1b0
> [12538.306006]        keyring_read_iterator+0x7e/0x170
> [12538.327936]        assoc_array_subtree_iterate+0x97/0x280
> [12538.352154]        keyring_read+0xe9/0x110
> [12538.370558]        keyctl_read_key+0x1b9/0x220
> [12538.391470]        do_syscall_64+0xa5/0x4b0
> [12538.410511]        entry_SYSCALL_64_after_hwframe+0x6a/0xdf
> [12538.435535]
> [12538.435535] other info that might help us debug this:
> [12538.435535]
> [12538.472829] Chain exists of:
> [12538.472829]   &mm->mmap_sem --> root_key_user.cons_lock --> &type->lock_class
> [12538.472829]
> [12538.524820]  Possible unsafe locking scenario:
> [12538.524820]
> [12538.551431]        CPU0                    CPU1
> [12538.572654]        ----                    ----
> [12538.595865]   lock(&type->lock_class);
> [12538.613737]                                lock(root_key_user.cons_lock);
> [12538.644234]                                lock(&type->lock_class);
> [12538.672410]   lock(&mm->mmap_sem);
> [12538.687758]
> [12538.687758]  *** DEADLOCK ***
> [12538.687758]
> [12538.714455] 1 lock held by keyctl/25598:
> [12538.732097] #0: 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220
> [12538.770573]
> [12538.770573] stack backtrace:
> [12538.790136] CPU: 2 PID: 25598 Comm: keyctl Kdump: loaded Tainted: G
> [12538.844855] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015
> [12538.881963] Call Trace:
> [12538.892897]  dump_stack+0x9a/0xf0
> [12538.907908]  print_circular_bug.isra.25.cold.50+0x1bc/0x279
> [12538.932891]  ? save_trace+0xd6/0x250
> [12538.948979]  check_prev_add.constprop.32+0xc36/0x14f0
> [12538.971643]  ? keyring_compare_object+0x104/0x190
> [12538.992738]  ? check_usage+0x550/0x550
> [12539.009845]  ? sched_clock+0x5/0x10
> [12539.025484]  ? sched_clock_cpu+0x18/0x1e0
> [12539.043555]  __lock_acquire+0x1f12/0x38d0
> [12539.061551]  ? trace_hardirqs_on+0x10/0x10
> [12539.080554]  lock_acquire+0x14c/0x420
> [12539.100330]  ? __might_fault+0xc4/0x1b0
> [12539.119079]  __might_fault+0x119/0x1b0
> [12539.135869]  ? __might_fault+0xc4/0x1b0
> [12539.153234]  keyring_read_iterator+0x7e/0x170
> [12539.172787]  ? keyring_read+0x110/0x110
> [12539.190059]  assoc_array_subtree_iterate+0x97/0x280
> [12539.211526]  keyring_read+0xe9/0x110
> [12539.227561]  ? keyring_gc_check_iterator+0xc0/0xc0
> [12539.249076]  keyctl_read_key+0x1b9/0x220
> [12539.266660]  do_syscall_64+0xa5/0x4b0
> [12539.283091]  entry_SYSCALL_64_after_hwframe+0x6a/0xdf
>
> One way to prevent this deadlock scenario from happening is to not
> allow writing to userspace while holding the key semaphore. Instead,
> an internal buffer is allocated for getting the keys out from the
> read method first before copying them out to userspace without holding
> the lock.
>
> That requires taking out the __user modifier from the read methods as
> well as additional changes to not use any userspace write helpers.
>
> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>

Hi, I quickly checked this (it is Friday night in Finland) and did not see
anything obviously wrong. I'll take a more detailed look next week.

Thank you.

/Jarkko
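For anyone following the thread, a minimal sketch of the buffering approach
the patch description outlines: call the key type's ->read() into a kernel
buffer while holding the key semaphore, drop the semaphore, and only then
copy the data to userspace. The helper name read_key_to_user() and the
allocation/error handling below are made up for illustration and do not
match the actual patch:

#include <linux/key.h>
#include <linux/key-type.h>
#include <linux/kernel.h>   /* min_t() */
#include <linux/mm.h>       /* kvmalloc()/kvfree() */
#include <linux/uaccess.h>  /* copy_to_user() */

/*
 * Illustrative sketch only: read the key payload into a temporary kernel
 * buffer under key->sem, then release the semaphore before calling
 * copy_to_user(), so a page fault taken during the copy can never nest
 * inside the key semaphore.  Assumes the patched ->read() signature that
 * takes a kernel pointer (no __user).
 */
static long read_key_to_user(struct key *key, char __user *ubuf, size_t ulen)
{
	char *tmpbuf;
	long ret;

	tmpbuf = kvmalloc(ulen, GFP_KERNEL);
	if (!tmpbuf)
		return -ENOMEM;

	down_read(&key->sem);
	ret = key->type->read(key, tmpbuf, ulen);   /* kernel buffer, lock held */
	up_read(&key->sem);

	/* Semaphore dropped; now it is safe to fault on the user buffer. */
	if (ret > 0) {
		size_t n = min_t(size_t, ret, ulen);

		if (copy_to_user(ubuf, tmpbuf, n))
			ret = -EFAULT;
	}

	/* A real version would likely scrub sensitive payload data first. */
	kvfree(tmpbuf);
	return ret;
}

The extra allocation and copy are the price for guaranteeing that no page
fault (and hence no mmap_sem acquisition) can ever happen while
&type->lock_class is held, which breaks the cycle lockdep reported above.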