Re: local DoS - systemd hang or timeout (WAS: Re: [RFC][CFT] splice_read reworked)

----- Original Message -----
> From: "CAI Qian" <caiqian@xxxxxxxxxx>
> To: "tj" <tj@xxxxxxxxxx>
> Cc: "Al Viro" <viro@xxxxxxxxxxxxxxxxxx>, "Linus Torvalds" <torvalds@xxxxxxxxxxxxxxxxxxxx>, "Dave Chinner"
> <david@xxxxxxxxxxxxx>, "linux-xfs" <linux-xfs@xxxxxxxxxxxxxxx>, "Jens Axboe" <axboe@xxxxxxxxx>, "Nick Piggin"
> <npiggin@xxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx
> Sent: Wednesday, October 5, 2016 11:54:48 AM
> Subject: Re: local DoS - systemd hang or timeout (WAS: Re: [RFC][CFT] splice_read reworked)
> 
> 
> 
> ----- Original Message -----
> > From: "tj" <tj@xxxxxxxxxx>
> > To: "CAI Qian" <caiqian@xxxxxxxxxx>
> > Cc: "Al Viro" <viro@xxxxxxxxxxxxxxxxxx>, "Linus Torvalds"
> > <torvalds@xxxxxxxxxxxxxxxxxxxx>, "Dave Chinner"
> > <david@xxxxxxxxxxxxx>, "linux-xfs" <linux-xfs@xxxxxxxxxxxxxxx>, "Jens
> > Axboe" <axboe@xxxxxxxxx>, "Nick Piggin"
> > <npiggin@xxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx
> > Sent: Wednesday, October 5, 2016 11:30:14 AM
> > Subject: Re: local DoS - systemd hang or timeout (WAS: Re: [RFC][CFT]
> > splice_read reworked)
> > 
> > Hello, CAI.
> > 
> > On Wed, Oct 05, 2016 at 10:09:39AM -0400, CAI Qian wrote:
> > > > This one seems to be the offender.  cgroup is trying to offline a
> > > > cpuset css, which takes place under cgroup_mutex.  The offlining ends
> > > > up trying to drain active usages of a sysctl table which apparently is
> > > > not happening.  Did something hang or crash while trying to generate
> > > > sysctl content?
> > >
> > > Hmm, I am not sure, since trinity was running as a non-privileged
> > > user which can only read content from /proc or /sys.
> > 
> > So, userland, privileged or not, can't cause this.  The ref is held
> > only while the kernel code is operating to generate content or
> > iterating, which shouldn't be affected by userland actions.  This is
> > caused by kernel code hanging or crashing while holding a ref.
> Right, trinity calls many different random syscalls and options on those
> /proc/ and /sys/ files and generates lots of different errnos. It is likely
> that some error path out there causes a hang or crash.
Tejun,

Not sure if this is related, but a lockdep warning involving procfs (below)
always shows up before the cgroup hang, unless it is masked by other lockdep
issues. Also, this hang is always reproducible.

[ 4787.875980] 
[ 4787.877645] ======================================================
[ 4787.884540] [ INFO: possible circular locking dependency detected ]
[ 4787.891533] 4.8.0-rc8-usrns-scale+ #8 Tainted: G        W      
[ 4787.898138] -------------------------------------------------------
[ 4787.905130] trinity-c116/106905 is trying to acquire lock:
[ 4787.911251]  (&p->lock){+.+.+.}, at: [<ffffffff812aca8c>] seq_read+0x4c/0x3e0
[ 4787.919264] 
[ 4787.919264] but task is already holding lock:
[ 4787.925773]  (sb_writers#8){.+.+.+}, at: [<ffffffff81284367>] __sb_start_write+0xb7/0xf0
[ 4787.934854] 
[ 4787.934854] which lock already depends on the new lock.
[ 4787.934854] 
[ 4787.943981] 
[ 4787.943981] the existing dependency chain (in reverse order) is:
[ 4787.952333] 
-> #3 (sb_writers#8){.+.+.+}:
[ 4787.957050]        [<ffffffff810fd711>] __lock_acquire+0x3f1/0x7f0
[ 4787.963960]        [<ffffffff810fe166>] lock_acquire+0xd6/0x240
[ 4787.970577]        [<ffffffff810f769a>] percpu_down_read+0x4a/0xa0
[ 4787.977487]        [<ffffffff81284367>] __sb_start_write+0xb7/0xf0
[ 4787.984395]        [<ffffffff812a8974>] mnt_want_write+0x24/0x50
[ 4787.991110]        [<ffffffffa05049af>] ovl_want_write+0x1f/0x30 [overlay]
[ 4787.998799]        [<ffffffffa05070c2>] ovl_do_remove+0x42/0x4a0 [overlay]
[ 4788.006483]        [<ffffffffa0507536>] ovl_rmdir+0x16/0x20 [overlay]
[ 4788.013682]        [<ffffffff8128d357>] vfs_rmdir+0xb7/0x130
[ 4788.020009]        [<ffffffff81292ed3>] do_rmdir+0x183/0x1f0
[ 4788.026335]        [<ffffffff81293cf2>] SyS_unlinkat+0x22/0x30
[ 4788.032853]        [<ffffffff81003f8c>] do_syscall_64+0x6c/0x1e0
[ 4788.039576]        [<ffffffff817d927f>] return_from_SYSCALL_64+0x0/0x7a
[ 4788.046962] 
-> #2 (&sb->s_type->i_mutex_key#16){++++++}:
[ 4788.053140]        [<ffffffff810fd711>] __lock_acquire+0x3f1/0x7f0
[ 4788.060049]        [<ffffffff810fe166>] lock_acquire+0xd6/0x240
[ 4788.066664]        [<ffffffff817d60e7>] down_read+0x47/0x70
[ 4788.072893]        [<ffffffff8128ce79>] lookup_slow+0xc9/0x200
[ 4788.079410]        [<ffffffff81290b9c>] walk_component+0x1ec/0x310
[ 4788.086315]        [<ffffffff81290e5f>] link_path_walk+0x19f/0x5f0
[ 4788.093219]        [<ffffffff8129151d>] path_openat+0xdd/0xb80
[ 4788.099748]        [<ffffffff81293511>] do_filp_open+0x91/0x100
[ 4788.106362]        [<ffffffff81286f56>] do_open_execat+0x76/0x180
[ 4788.113186]        [<ffffffff8128747b>] open_exec+0x2b/0x50
[ 4788.119404]        [<ffffffff812ec61d>] load_elf_binary+0x28d/0x1120
[ 4788.126511]        [<ffffffff81288487>] search_binary_handler+0x97/0x1c0
[ 4788.134002]        [<ffffffff81289619>] do_execveat_common.isra.36+0x6a9/0x9f0
[ 4788.142071]        [<ffffffff81289c4a>] SyS_execve+0x3a/0x50
[ 4788.148398]        [<ffffffff81003f8c>] do_syscall_64+0x6c/0x1e0
[ 4788.155110]        [<ffffffff817d927f>] return_from_SYSCALL_64+0x0/0x7a
[ 4788.162502] 
-> #1 (&sig->cred_guard_mutex){+.+.+.}:
[ 4788.168179]        [<ffffffff810fd711>] __lock_acquire+0x3f1/0x7f0
[ 4788.175085]        [<ffffffff810fe166>] lock_acquire+0xd6/0x240
[ 4788.181712]        [<ffffffff817d4557>] mutex_lock_killable_nested+0x87/0x500
[ 4788.189695]        [<ffffffff81099599>] mm_access+0x29/0xa0
[ 4788.195924]        [<ffffffff81302b6c>] proc_pid_auxv+0x1c/0x70
[ 4788.202540]        [<ffffffff813039d0>] proc_single_show+0x50/0x90
[ 4788.209445]        [<ffffffff812acb48>] seq_read+0x108/0x3e0
[ 4788.215774]        [<ffffffff8127fb07>] __vfs_read+0x37/0x150
[ 4788.222198]        [<ffffffff81280d35>] vfs_read+0x95/0x140
[ 4788.228425]        [<ffffffff81282268>] SyS_read+0x58/0xc0
[ 4788.234557]        [<ffffffff81003f8c>] do_syscall_64+0x6c/0x1e0
[ 4788.241268]        [<ffffffff817d927f>] return_from_SYSCALL_64+0x0/0x7a
[ 4788.248660] 
-> #0 (&p->lock){+.+.+.}:
[ 4788.252987]        [<ffffffff810fc062>] validate_chain.isra.37+0xe72/0x1150
[ 4788.260769]        [<ffffffff810fd711>] __lock_acquire+0x3f1/0x7f0
[ 4788.267676]        [<ffffffff810fe166>] lock_acquire+0xd6/0x240
[ 4788.274302]        [<ffffffff817d3807>] mutex_lock_nested+0x77/0x430
[ 4788.281406]        [<ffffffff812aca8c>] seq_read+0x4c/0x3e0
[ 4788.287633]        [<ffffffff81316b39>] kernfs_fop_read+0x129/0x1b0
[ 4788.294659]        [<ffffffff8127fca3>] do_loop_readv_writev+0x83/0xc0
[ 4788.301954]        [<ffffffff812811a8>] do_readv_writev+0x218/0x240
[ 4788.308959]        [<ffffffff81281209>] vfs_readv+0x39/0x50
[ 4788.315188]        [<ffffffff812bc6b1>] default_file_splice_read+0x1a1/0x2b0
[ 4788.323070]        [<ffffffff812bc206>] do_splice_to+0x76/0x90
[ 4788.329587]        [<ffffffff812bc2db>] splice_direct_to_actor+0xbb/0x220
[ 4788.337173]        [<ffffffff812bc4d8>] do_splice_direct+0x98/0xd0
[ 4788.344078]        [<ffffffff81281dd1>] do_sendfile+0x1d1/0x3b0
[ 4788.350694]        [<ffffffff812829c9>] SyS_sendfile64+0xc9/0xd0
[ 4788.357405]        [<ffffffff81003f8c>] do_syscall_64+0x6c/0x1e0
[ 4788.364119]        [<ffffffff817d927f>] return_from_SYSCALL_64+0x0/0x7a
[ 4788.371511] 
[ 4788.371511] other info that might help us debug this:
[ 4788.371511] 
[ 4788.380443] Chain exists of:
  &p->lock --> &sb->s_type->i_mutex_key#16 --> sb_writers#8

[ 4788.389881]  Possible unsafe locking scenario:
[ 4788.389881] 
[ 4788.396497]        CPU0                    CPU1
[ 4788.401549]        ----                    ----
[ 4788.406614]   lock(sb_writers#8);
[ 4788.410352]                                lock(&sb->s_type->i_mutex_key#16);
[ 4788.418354]                                lock(sb_writers#8);
[ 4788.424902]   lock(&p->lock);
[ 4788.428229] 
[ 4788.428229]  *** DEADLOCK ***
[ 4788.428229] 
[ 4788.434836] 1 lock held by trinity-c116/106905:
[ 4788.439888]  #0:  (sb_writers#8){.+.+.+}, at: [<ffffffff81284367>] __sb_start_write+0xb7/0xf0
[ 4788.449473] 
[ 4788.449473] stack backtrace:
[ 4788.454334] CPU: 16 PID: 106905 Comm: trinity-c116 Tainted: G        W       4.8.0-rc8-usrns-scale+ #8
[ 4788.464719] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRNDSDP1.86B.0044.R00.1501191641 01/19/2015
[ 4788.476076]  0000000000000086 00000000cbfc6314 ffff8803ce78b760 ffffffff813d5e93
[ 4788.484371]  ffffffff82a3fbd0 ffffffff82a94890 ffff8803ce78b7a0 ffffffff810fa6ec
[ 4788.492663]  ffff8803ce78b7e0 ffff8802ead08000 0000000000000001 ffff8802ead08ca0
[ 4788.500966] Call Trace:
[ 4788.503694]  [<ffffffff813d5e93>] dump_stack+0x85/0xc2
[ 4788.509426]  [<ffffffff810fa6ec>] print_circular_bug+0x1ec/0x260
[ 4788.516128]  [<ffffffff810fc062>] validate_chain.isra.37+0xe72/0x1150
[ 4788.523319]  [<ffffffff811d4491>] ? ___perf_sw_event+0x171/0x290
[ 4788.530022]  [<ffffffff810fd711>] __lock_acquire+0x3f1/0x7f0
[ 4788.536335]  [<ffffffff810fe166>] lock_acquire+0xd6/0x240
[ 4788.542359]  [<ffffffff812aca8c>] ? seq_read+0x4c/0x3e0
[ 4788.548188]  [<ffffffff812aca8c>] ? seq_read+0x4c/0x3e0
[ 4788.554019]  [<ffffffff817d3807>] mutex_lock_nested+0x77/0x430
[ 4788.560528]  [<ffffffff812aca8c>] ? seq_read+0x4c/0x3e0
[ 4788.566358]  [<ffffffff812aca8c>] seq_read+0x4c/0x3e0
[ 4788.571995]  [<ffffffff81316a10>] ? kernfs_fop_open+0x3a0/0x3a0
[ 4788.578600]  [<ffffffff81316b39>] kernfs_fop_read+0x129/0x1b0
[ 4788.585012]  [<ffffffff81316a10>] ? kernfs_fop_open+0x3a0/0x3a0
[ 4788.591617]  [<ffffffff8127fca3>] do_loop_readv_writev+0x83/0xc0
[ 4788.598318]  [<ffffffff81316a10>] ? kernfs_fop_open+0x3a0/0x3a0
[ 4788.604924]  [<ffffffff812811a8>] do_readv_writev+0x218/0x240
[ 4788.611347]  [<ffffffff813e9535>] ? push_pipe+0xd5/0x190
[ 4788.617278]  [<ffffffff813ecec0>] ? iov_iter_get_pages_alloc+0x250/0x400
[ 4788.624746]  [<ffffffff81281209>] vfs_readv+0x39/0x50
[ 4788.630381]  [<ffffffff812bc6b1>] default_file_splice_read+0x1a1/0x2b0
[ 4788.637668]  [<ffffffff8134ae20>] ? security_file_permission+0xa0/0xc0
[ 4788.644954]  [<ffffffff812bc206>] do_splice_to+0x76/0x90
[ 4788.650880]  [<ffffffff812bc2db>] splice_direct_to_actor+0xbb/0x220
[ 4788.657872]  [<ffffffff812bba80>] ? generic_pipe_buf_nosteal+0x10/0x10
[ 4788.665157]  [<ffffffff812bc4d8>] do_splice_direct+0x98/0xd0
[ 4788.671472]  [<ffffffff81281dd1>] do_sendfile+0x1d1/0x3b0
[ 4788.677499]  [<ffffffff812829c9>] SyS_sendfile64+0xc9/0xd0
[ 4788.683622]  [<ffffffff81003f8c>] do_syscall_64+0x6c/0x1e0
[ 4788.689744]  [<ffffffff817d927f>] entry_SYSCALL64_slow_path+0x25/0x25
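
For reference, here is a minimal sketch of how the dependency chain in the
report above closes into a cycle. It is not how lockdep itself is implemented.
The edges are my reading of the four stack traces (#1 through #3 are the
existing chain, #0 is the new acquisition by trinity while it holds
sb_writers#8). The names find_cycle and LOCK_ORDER are hypothetical and exist
only for this illustration.

```python
from typing import Dict, List, Optional


def find_cycle(edges: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in a directed lock-order graph, or None.

    An edge A -> B means "B was acquired while A was held".
    The returned list starts and ends with the same lock.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for nxt in edges.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # Back edge: nxt is already on the current path.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for start in list(edges):
        if color.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None


# Assumed edges, read off the report (not produced by any kernel tool):
LOCK_ORDER = {
    # 1: seq_read -> proc_pid_auxv -> mm_access
    "&p->lock": ["&sig->cred_guard_mutex"],
    # 2: execve -> open_exec -> lookup_slow
    "&sig->cred_guard_mutex": ["&sb->s_type->i_mutex_key#16"],
    # 3: rmdir -> ovl_do_remove -> ovl_want_write
    "&sb->s_type->i_mutex_key#16": ["sb_writers#8"],
    # 0: sendfile -> default_file_splice_read -> seq_read (the new edge)
    "sb_writers#8": ["&p->lock"],
}

cycle = find_cycle(LOCK_ORDER)
print(" --> ".join(cycle) if cycle else "no cycle")
```

Dropping the last edge (the sendfile path on a kernfs file) leaves the graph
acyclic in this sketch, which matches the report flagging that acquisition as
the one completing the loop.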