[4.9-rc1+] overlayfs lockdep

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



----- Original Message -----
> From: "Jan Kara" <jack@xxxxxxx>
> Sent: Friday, October 7, 2016 3:08:38 AM
> Subject: Re: local DoS - systemd hang or timeout (WAS: Re: [RFC][CFT] splice_read reworked)
> 
> 
> So I believe this may be just a problem in overlayfs lockdep annotation
> (see below). Added Miklos to CC.
> 
> > Wait. There is also a lockep happened before the xfs internal error as
> > well.
> > 
> > [ 5839.452325] ======================================================
> > [ 5839.459221] [ INFO: possible circular locking dependency detected ]
> > [ 5839.466215] 4.8.0-rc8-splice-fixw-proc+ #4 Not tainted
> > [ 5839.471945] -------------------------------------------------------
> > [ 5839.478937] trinity-c220/69531 is trying to acquire lock:
> > [ 5839.484961]  (&p->lock){+.+.+.}, at: [<ffffffff812ac69c>]
> > seq_read+0x4c/0x3e0
> > [ 5839.492967]
> > but task is already holding lock:
> > [ 5839.499476]  (sb_writers#8){.+.+.+}, at: [<ffffffff81284be1>]
> > __sb_start_write+0xd1/0xf0
> > [ 5839.508560]
> > which lock already depends on the new lock.
> > 
> > [ 5839.517686]
> > the existing dependency chain (in reverse order) is:
> > [ 5839.526036]
> > -> #3 (sb_writers#8){.+.+.+}:
> > [ 5839.530751]        [<ffffffff810ff174>] lock_acquire+0xd4/0x240
> > [ 5839.537368]        [<ffffffff810f8f4a>] percpu_down_read+0x4a/0x90
> > [ 5839.544275]        [<ffffffff81284be1>] __sb_start_write+0xd1/0xf0
> > [ 5839.551181]        [<ffffffff812a8544>] mnt_want_write+0x24/0x50
> > [ 5839.557892]        [<ffffffffa04a398f>] ovl_want_write+0x1f/0x30
> > [overlay]
> > [ 5839.565577]        [<ffffffffa04a6036>] ovl_do_remove+0x46/0x480
> > [overlay]
> > [ 5839.573259]        [<ffffffffa04a64a3>] ovl_unlink+0x13/0x20 [overlay]
> > [ 5839.580555]        [<ffffffff812918ea>] vfs_unlink+0xda/0x190
> > [ 5839.586979]        [<ffffffff81293698>] do_unlinkat+0x268/0x2b0
> > [ 5839.593599]        [<ffffffff8129419b>] SyS_unlinkat+0x1b/0x30
> > [ 5839.600120]        [<ffffffff81003c9c>] do_syscall_64+0x6c/0x1e0
> > [ 5839.606836]        [<ffffffff817d4a3f>] return_from_SYSCALL_64+0x0/0x7a
> > [ 5839.614231]
> 
> So here is IMO the real culprit: do_unlinkat() grabs fs freeze protection
> through mnt_want_write(), we grab also i_rwsem in do_unlinkat() in
> I_MUTEX_PARENT class a bit after that and further down in vfs_unlink() we
> grab i_rwsem for the unlinked inode itself in default I_MUTEX class. Then
> in ovl_want_write() we grab freeze protection again, but this time for the
> upper filesystem. That establishes sb_writers (overlay) -> I_MUTEX_PARENT
> (overlay) -> I_MUTEX (overlay) -> sb_writers (FS-A) lock ordering
> (we maintain locking classes per fs type so that's why I'm showing fs type
> in parenthesis).
> 
> Now this nesting is nasty because once you add locks that are not tracked
> per fs type into the mix, you get cycles. In this case we've got
> seq_file->lock and cred_guard_mutex into the mix - the splice path is
> doing sb_writers (FS-A) -> seq_file->lock -> cred_guard_mutex (splicing
> from seq_file into the real filesystem). Exec path further establishes
> cred_guard_mutex -> I_MUTEX (overlay) which closes the full cycle:
> 
> sb_writers (FS-A) -> seq_file->lock -> cred_guard_mutex -> i_mutex
> (overlay) -> sb_writers (FS-A)
> 
> If I analyzed the lockdep trace, this looks like a real (although remote)
> deadlock possibility. Miklos?

So this can still be reproduced in the yesterday's mainline.

[40581.813575] [ INFO: possible circular locking dependency detected ]
[40581.813578] 4.9.0-rc1-lockfix-uncorev2+ #51 Tainted: G        W      
[40581.813581] -------------------------------------------------------
[40581.813582] trinity-c104/39795 is trying to acquire lock:
[40581.813587]  (
[40581.813588] &p->lock
[40581.813589] ){+.+.+.}
[40581.813600] , at: 
[40581.813601] [<ffffffff8191588c>] seq_read+0xec/0x1400
[40581.813603] 
[40581.813603] but task is already holding lock:
[40581.813605]  (
[40581.813607] sb_writers
[40581.813608] #8
[40581.813609] ){.+.+.+}
[40581.813617] , at: 
[40581.813617] [<ffffffff81889c6a>] do_sendfile+0x9ea/0x1270
[40581.813618] 
[40581.813618] which lock already depends on the new lock.
[40581.813618] 
[40581.813620] 
[40581.813620] the existing dependency chain (in reverse order) is:
[40581.813623] 
[40581.813623] -> #3
[40581.813624]  (
[40581.813625] sb_writers
[40581.813626] #8
[40581.813628] ){.+.+.+}
[40581.813628] :
[40581.813636]        
[40581.813636] [<ffffffff8133dcda>] __lock_acquire+0x9aa/0x1710
[40581.813640]        
[40581.813640] [<ffffffff8133fd4e>] lock_acquire+0x24e/0x5d0
[40581.813644]        
[40581.813645] [<ffffffff8189037e>] __sb_start_write+0xae/0x360
[40581.813650]        
[40581.813650] [<ffffffff819066fa>] mnt_want_write+0x4a/0xc0
[40581.813661]        
[40581.813661] [<ffffffffa16cdfbd>] ovl_want_write+0x8d/0xf0 [overlay]
[40581.813668]        
[40581.813668] [<ffffffffa16d4dc7>] ovl_do_remove+0xe7/0x9a0 [overlay]
[40581.813675]        
[40581.813676] [<ffffffffa16d5696>] ovl_rmdir+0x16/0x20 [overlay]
[40581.813680]        
[40581.813680] [<ffffffff818af90f>] vfs_rmdir+0x1bf/0x3e0
[40581.813685]        
[40581.813686] [<ffffffff818c5965>] do_rmdir+0x2c5/0x430
[40581.813689]        
[40581.813690] [<ffffffff818c8242>] SyS_unlinkat+0x22/0x30
[40581.813696]        
[40581.813696] [<ffffffff8100924d>] do_syscall_64+0x19d/0x540
[40581.813704]        
[40581.813704] [<ffffffff82c8af24>] return_from_SYSCALL_64+0x0/0x7a
[40581.813707] 
[40581.813707] -> #2
[40581.813709]  (
[40581.813710] &sb->s_type->i_mutex_key
[40581.813711] #17
[40581.813712] ){++++++}
[40581.813713] :
[40581.813720]        
[40581.813720] [<ffffffff8133dcda>] __lock_acquire+0x9aa/0x1710
[40581.813726]        
[40581.813726] [<ffffffff8133fd4e>] lock_acquire+0x24e/0x5d0
[40581.813736]        
[40581.813736] [<ffffffff82c84261>] down_read+0xa1/0x1c0
[40581.813740]        
[40581.813740] [<ffffffff818ae2db>] lookup_slow+0x17b/0x4f0
[40581.813744]        
[40581.813744] [<ffffffff818bb228>] walk_component+0x728/0x1d10
[40581.813750]        
[40581.813750] [<ffffffff818bcc1e>] link_path_walk+0x40e/0x1690
[40581.813758]        
[40581.813758] [<ffffffff818c0274>] path_openat+0x1c4/0x3870
[40581.813764]        
[40581.813765] [<ffffffff818c6d19>] do_filp_open+0x1a9/0x2e0
[40581.813772]        
[40581.813772] [<ffffffff8189832b>] do_open_execat+0xcb/0x420
[40581.813783]        
[40581.813784] [<ffffffff8189932b>] open_exec+0x2b/0x50
[40581.813793]        
[40581.813793] [<ffffffff819ea78c>] load_elf_binary+0x103c/0x3550
[40581.813807]        
[40581.813807] [<ffffffff8189a852>] search_binary_handler+0x162/0x480
[40581.813814]        
[40581.813815] [<ffffffff818a106a>] do_execveat_common.isra.24+0x138a/0x2570
[40581.813823]        
[40581.813824] [<ffffffff818a2efa>] SyS_execve+0x3a/0x50
[40581.813828]        
[40581.813828] [<ffffffff8100924d>] do_syscall_64+0x19d/0x540
[40581.813833]        
[40581.813834] [<ffffffff82c8af24>] return_from_SYSCALL_64+0x0/0x7a
[40581.813843] 
[40581.813843] -> #1
[40581.813845]  (
[40581.813850] &sig->cred_guard_mutex
[40581.813852] ){+.+.+.}
[40581.813852] :
[40581.813861]        
[40581.813862] [<ffffffff8133dcda>] __lock_acquire+0x9aa/0x1710
[40581.813871]        
[40581.813871] [<ffffffff8133fd4e>] lock_acquire+0x24e/0x5d0
[40581.813885]        
[40581.813886] [<ffffffff82c7d1d3>] mutex_lock_killable_nested+0x103/0xb90
[40581.813895]        
[40581.813896] [<ffffffff81a3f7a6>] do_io_accounting+0x186/0xcf0
[40581.813902]        
[40581.813903] [<ffffffff81a40329>] proc_tgid_io_accounting+0x19/0x20
[40581.813908]        
[40581.813909] [<ffffffff81a41494>] proc_single_show+0x114/0x1d0
[40581.813917]        
[40581.813917] [<ffffffff81915ad4>] seq_read+0x334/0x1400
[40581.813921]        
[40581.813921] [<ffffffff81884da6>] __vfs_read+0x106/0x990
[40581.813927]        
[40581.813927] [<ffffffff81886038>] vfs_read+0x118/0x400
[40581.813931]        
[40581.813931] [<ffffffff8188aebf>] SyS_read+0xdf/0x1d0
[40581.813938]        
[40581.813938] [<ffffffff8100924d>] do_syscall_64+0x19d/0x540
[40581.813945]        
[40581.813946] [<ffffffff82c8af24>] return_from_SYSCALL_64+0x0/0x7a
[40581.813949] 
[40581.813949] -> #0
[40581.813951]  (
[40581.813954] &p->lock
[40581.813955] ){+.+.+.}
[40581.813956] :
[40581.813961]        
[40581.813961] [<ffffffff81337938>] validate_chain.isra.31+0x2b28/0x4c00
[40581.813965]        
[40581.813966] [<ffffffff8133dcda>] __lock_acquire+0x9aa/0x1710
[40581.813972]        
[40581.813972] [<ffffffff8133fd4e>] lock_acquire+0x24e/0x5d0
[40581.813977]        
[40581.813977] [<ffffffff82c7f2f8>] mutex_lock_nested+0x108/0xa50
[40581.813983]        
[40581.813983] [<ffffffff8191588c>] seq_read+0xec/0x1400
[40581.813993]        
[40581.813993] [<ffffffff81a7bdde>] kernfs_fop_read+0x35e/0x640
[40581.813998]        
[40581.813998] [<ffffffff818812ef>] do_loop_readv_writev+0xdf/0x250
[40581.814003]        
[40581.814003] [<ffffffff81886fb5>] do_readv_writev+0x6a5/0xab0
[40581.814007]        
[40581.814007] [<ffffffff81887446>] vfs_readv+0x86/0xe0
[40581.814020]        
[40581.814020] [<ffffffff8194fdac>] default_file_splice_read+0x49c/0xbb0
[40581.814026]        
[40581.814027] [<ffffffff8194eb74>] do_splice_to+0x104/0x1a0
[40581.814033]        
[40581.814033] [<ffffffff8194ee80>] splice_direct_to_actor+0x270/0xa00
[40581.814039]        
[40581.814039] [<ffffffff8194f7a4>] do_splice_direct+0x194/0x300
[40581.814046]        
[40581.814046] [<ffffffff818896e9>] do_sendfile+0x469/0x1270
[40581.814051]        
[40581.814051] [<ffffffff8188bcb0>] SyS_sendfile64+0x140/0x150
[40581.814054]        
[40581.814055] [<ffffffff8100924d>] do_syscall_64+0x19d/0x540
[40581.814059]        
[40581.814060] [<ffffffff82c8af24>] return_from_SYSCALL_64+0x0/0x7a
[40581.814062] 
[40581.814062] other info that might help us debug this:
[40581.814062] 
[40581.814066] Chain exists of:
[40581.814066]   
[40581.814067] &p->lock
[40581.814069]  --> 
[40581.814070] &sb->s_type->i_mutex_key
[40581.814071] #17
[40581.814073]  --> 
[40581.814076] sb_writers
[40581.814079] #8
[40581.814079] 
[40581.814079] 
[40581.814080]  Possible unsafe locking scenario:
[40581.814080] 
[40581.814081]        CPU0                    CPU1
[40581.814083]        ----                    ----
[40581.814085]   lock(
[40581.814088] sb_writers
[40581.814089] #8
[40581.814089] );
[40581.814091]                                lock(
[40581.814093] &sb->s_type->i_mutex_key
[40581.814095] #17
[40581.814095] );
[40581.814097]                                lock(
[40581.814098] sb_writers
[40581.814099] #8
[40581.814099] );
[40581.814101]   lock(
[40581.814103] &p->lock
[40581.814103] );
[40581.814104] 
[40581.814104]  *** DEADLOCK ***
[40581.814104] 
[40581.814106] 1 lock held by trinity-c104/39795:
[40581.814109]  #0: 
[40581.814111]  (
[40581.814112] sb_writers
[40581.814113] #8
[40581.814114] ){.+.+.+}
[40581.814116] , at: 
[40581.814117] [<ffffffff81889c6a>] do_sendfile+0x9ea/0x1270
[40581.814118] 
[40581.814118] stack backtrace:
[40581.814121] CPU: 25 PID: 39795 Comm: trinity-c104 Tainted: G        W       4.9.0-rc1-lockfix-uncorev2+ #51
[40581.814123] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRRFSDP1.86B.0271.R00.1510301446 10/30/2015
[40581.814131]  ffff880825886da0 ffffffff81d37124 0000000041b58ab3 ffffffff83348dc7
[40581.814138]  ffffffff81d37064 0000000000000001 0000000000000000 ffff8807b4d5d5d8
[40581.814145]  00000000bfc018be ffff880825886d78 0000000000000001 0000000000000000
[40581.814146] Call Trace:
[40581.814155]  [<ffffffff81d37124>] dump_stack+0xc0/0x12c
[40581.814159]  [<ffffffff81d37064>] ? _atomic_dec_and_lock+0xc4/0xc4
[40581.814168]  [<ffffffff81332fa9>] print_circular_bug+0x3c9/0x5e0
[40581.814171]  [<ffffffff81332be0>] ? print_circular_bug_entry+0xd0/0xd0
[40581.814176]  [<ffffffff81337938>] validate_chain.isra.31+0x2b28/0x4c00
[40581.814182]  [<ffffffff81334e10>] ? check_irq_usage+0x300/0x300
[40581.814192]  [<ffffffff81334e10>] ? check_irq_usage+0x300/0x300
[40581.814196]  [<ffffffff81de21f3>] ? __this_cpu_preempt_check+0x13/0x20
[40581.814200]  [<ffffffff81336045>] ? validate_chain.isra.31+0x1235/0x4c00
[40581.814204]  [<ffffffff8133a4d0>] ? print_usage_bug+0x700/0x700
[40581.814208]  [<ffffffff812abdc0>] ? sched_clock_cpu+0x1b0/0x310
[40581.814214]  [<ffffffff8133a4d0>] ? print_usage_bug+0x700/0x700
[40581.814219]  [<ffffffff812abdc0>] ? sched_clock_cpu+0x1b0/0x310
[40581.814226]  [<ffffffff8133dcda>] __lock_acquire+0x9aa/0x1710
[40581.814232]  [<ffffffff8133fd4e>] lock_acquire+0x24e/0x5d0
[40581.814235]  [<ffffffff8191588c>] ? seq_read+0xec/0x1400
[40581.814240]  [<ffffffff8191588c>] ? seq_read+0xec/0x1400
[40581.814243]  [<ffffffff82c7f2f8>] mutex_lock_nested+0x108/0xa50
[40581.814246]  [<ffffffff8191588c>] ? seq_read+0xec/0x1400
[40581.814250]  [<ffffffff8191588c>] ? seq_read+0xec/0x1400
[40581.814256]  [<ffffffff817fedd6>] ? kasan_unpoison_shadow+0x36/0x50
[40581.814259]  [<ffffffff82c7f1f0>] ? mutex_lock_interruptible_nested+0xb40/0xb40
[40581.814264]  [<ffffffff8168ec2c>] ? get_page_from_freelist+0x175c/0x2ed0
[40581.814271]  [<ffffffff8168d4d0>] ? __isolate_free_page+0x7e0/0x7e0
[40581.814275]  [<ffffffff8133c3f9>] ? mark_held_locks+0x109/0x290
[40581.814278]  [<ffffffff8191588c>] seq_read+0xec/0x1400
[40581.814283]  [<ffffffff813ac01d>] ? rcu_lockdep_current_cpu_online+0x11d/0x1d0
[40581.814290]  [<ffffffff819157a0>] ? seq_hlist_start_percpu+0x4a0/0x4a0
[40581.814295]  [<ffffffff8198ef20>] ? __fsnotify_update_child_dentry_flags.part.0+0x2b0/0x2b0
[40581.814298]  [<ffffffff81de21f3>] ? __this_cpu_preempt_check+0x13/0x20
[40581.814300]  [<ffffffff81a7bdde>] kernfs_fop_read+0x35e/0x640
[40581.814305]  [<ffffffff81b49a55>] ? selinux_file_permission+0x3c5/0x550
[40581.814310]  [<ffffffff81a7ba80>] ? kernfs_fop_open+0xf40/0xf40
[40581.814312]  [<ffffffff818812ef>] do_loop_readv_writev+0xdf/0x250
[40581.814318]  [<ffffffff81886fb5>] do_readv_writev+0x6a5/0xab0
[40581.814324]  [<ffffffff81886910>] ? vfs_write+0x5f0/0x5f0
[40581.814328]  [<ffffffff81d8fbaf>] ? iov_iter_get_pages_alloc+0x53f/0x1990
[40581.814332]  [<ffffffff81d8f670>] ? iov_iter_npages+0xed0/0xed0
[40581.814336]  [<ffffffff8133c3f9>] ? mark_held_locks+0x109/0x290
[40581.814339]  [<ffffffff81de21f3>] ? __this_cpu_preempt_check+0x13/0x20
[40581.814344]  [<ffffffff8133caa0>] ? trace_hardirqs_on_caller+0x520/0x720
[40581.814347]  [<ffffffff81887446>] vfs_readv+0x86/0xe0
[40581.814352]  [<ffffffff8194fdac>] default_file_splice_read+0x49c/0xbb0
[40581.814361]  [<ffffffff8194f910>] ? do_splice_direct+0x300/0x300
[40581.814363]  [<ffffffff817fef3d>] ? kasan_kmalloc+0xad/0xe0
[40581.814366]  [<ffffffff818a6287>] ? alloc_pipe_info+0x1b7/0x410
[40581.814371]  [<ffffffff8133a4d0>] ? print_usage_bug+0x700/0x700
[40581.814373]  [<ffffffff8188bcb0>] ? SyS_sendfile64+0x140/0x150
[40581.814377]  [<ffffffff8100924d>] ? do_syscall_64+0x19d/0x540
[40581.814380]  [<ffffffff82c8af24>] ? entry_SYSCALL64_slow_path+0x25/0x25
[40581.814382]  [<ffffffff812abdc0>] ? sched_clock_cpu+0x1b0/0x310
[40581.814386]  [<ffffffff8133c3f9>] ? mark_held_locks+0x109/0x290
[40581.814390]  [<ffffffff8133caa0>] ? trace_hardirqs_on_caller+0x520/0x720
[40581.814395]  [<ffffffff8198ef20>] ? __fsnotify_update_child_dentry_flags.part.0+0x2b0/0x2b0
[40581.814398]  [<ffffffff81b49a55>] ? selinux_file_permission+0x3c5/0x550
[40581.814404]  [<ffffffff81b26e96>] ? security_file_permission+0x176/0x220
[40581.814408]  [<ffffffff81885c78>] ? rw_verify_area+0xd8/0x380
[40581.814411]  [<ffffffff8194eb74>] do_splice_to+0x104/0x1a0
[40581.814415]  [<ffffffff818a63b7>] ? alloc_pipe_info+0x2e7/0x410
[40581.814419]  [<ffffffff8194ee80>] splice_direct_to_actor+0x270/0xa00
[40581.814424]  [<ffffffff8194c5e0>] ? wakeup_pipe_readers+0x90/0x90
[40581.814429]  [<ffffffff8194ec10>] ? do_splice_to+0x1a0/0x1a0
[40581.814432]  [<ffffffff81885c78>] ? rw_verify_area+0xd8/0x380
[40581.814438]  [<ffffffff8194f7a4>] do_splice_direct+0x194/0x300
[40581.814443]  [<ffffffff8194f610>] ? splice_direct_to_actor+0xa00/0xa00
[40581.814450]  [<ffffffff81278cee>] ? preempt_count_sub+0x5e/0xe0
[40581.814452]  [<ffffffff81890415>] ? __sb_start_write+0x145/0x360
[40581.814457]  [<ffffffff818896e9>] do_sendfile+0x469/0x1270
[40581.814461]  [<ffffffff81889280>] ? do_compat_pwritev64.isra.16+0xd0/0xd0
[40581.814466]  [<ffffffff814cb287>] ? __audit_syscall_exit+0x637/0x960
[40581.814469]  [<ffffffff81006afb>] ? syscall_trace_enter+0x89b/0x1930
[40581.814473]  [<ffffffff817f7993>] ? kfree+0x3f3/0x620
[40581.814477]  [<ffffffff8188bcb0>] SyS_sendfile64+0x140/0x150
[40581.814479]  [<ffffffff8188bb70>] ? SyS_sendfile+0x140/0x140
[40581.814482]  [<ffffffff81de21f3>] ? __this_cpu_preempt_check+0x13/0x20
[40581.814485]  [<ffffffff8188bb70>] ? SyS_sendfile+0x140/0x140
[40581.814487]  [<ffffffff8100924d>] do_syscall_64+0x19d/0x540
[40581.814491]  [<ffffffff82c8af24>] entry_SYSCALL64_slow_path+0x25/0x25
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux