On Fri, Aug 07, 2020 at 09:34:08PM +0900, Tetsuo Handa wrote: > On 2020/08/07 21:27, Al Viro wrote: > > On Fri, Aug 07, 2020 at 07:35:08PM +0900, Tetsuo Handa wrote: > >> syzbot is reporting hung task at pipe_release() [1], for for_each_bvec() from > >> iterate_bvec() from iterate_all_kinds() from iov_iter_alignment() from > >> ext4_unaligned_io() from ext4_dio_write_iter() from ext4_file_write_iter() from > >> call_write_iter() from do_iter_readv_writev() from do_iter_write() from > >> vfs_iter_write() from iter_file_splice_write() falls into infinite busy loop > >> with pipe->mutex held. > >> > >> The reason of falling into infinite busy loop is that iter_file_splice_write() > >> for some reason generates "struct bio_vec" entry with .bv_len=0 and .bv_offset=0 > >> while for_each_bvec() cannot handle .bv_len == 0. > > > > broken in 1bdc76aea115 "iov_iter: use bvec iterator to implement iterate_bvec()", > > unless I'm misreading it... I have been chasing something similar for a while as in, https://lore.kernel.org/linux-fsdevel/89F418A9-EB20-48CB-9AE0-52C700E6BB74@xxxxxx/ In my case, it seems the endless loop happens in iterate_iovec() instead where I put a debug patch here, --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -33,6 +33,7 @@ if (unlikely(!__v.iov_len)) \ continue; \ __v.iov_base = __p->iov_base; \ + printk_ratelimited("ITER_IOVEC left = %zu, n = %zu\n", left, n); \ left = (STEP); \ __v.iov_len -= left; \ skip = __v.iov_len; \ and end up seeing overflows ("n" supposes to be less than PAGE_SIZE) before the soft-lockups and a dead system, [ 4300.249180][T470195] ITER_IOVEC left = 0, n = 48566423 Thoughts? > > > > Zero-length segments are not disallowed; it's not all that hard to filter them > > out in iter_file_splice_write(), but the intent had always been to have > > iterate_all_kinds() et.al. able to cope with those. > > > > How are these pipe_buffers with ->len == 0 generated in that reproducer, BTW? > > There might be something else fishy going on... > > > > OK. Indeed writing to empty pipe which returns -EFAULT allows an empty > page to be linked to pipe's array. > > Now, I've just found a simple reproducer, and confirmed that this bug is > a local lockup DoS by unprivileged user. Please fix. > > ---------- > #define _GNU_SOURCE > #include <stdio.h> > #include <sys/types.h> > #include <sys/stat.h> > #include <fcntl.h> > #include <unistd.h> > > int main(int argc, char *argv[]) > { > static char buffer[4096]; > const int fd = open("/tmp/testfile", O_WRONLY | O_CREAT, 0600); > int pipe_fd[2] = { EOF, EOF }; > pipe(pipe_fd); > write(pipe_fd[1], NULL, 4096); > write(pipe_fd[1], buffer, 4096); > splice(pipe_fd[0], NULL, fd, NULL, 65536, 0); > return 0; > } > ---------- > > [ 125.598898][ C0] rcu: INFO: rcu_sched self-detected stall on CPU > [ 125.601072][ C0] rcu: 0-....: (20171 ticks this GP) idle=526/1/0x4000000000000000 softirq=7918/7918 fqs=5136 > [ 125.604874][ C0] (t=21006 jiffies g=9341 q=30) > [ 125.606512][ C0] NMI backtrace for cpu 0 > [ 125.607931][ C0] CPU: 0 PID: 2792 Comm: a.out Not tainted 5.8.0+ #793 > [ 125.610948][ C0] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020 > [ 125.614938][ C0] Call Trace: > [ 125.616049][ C0] <IRQ> > [ 125.617010][ C0] dump_stack+0x5e/0x7a > [ 125.618370][ C0] nmi_cpu_backtrace.cold.7+0x14/0x52 > [ 125.620148][ C0] ? lapic_can_unplug_cpu.cold.39+0x3a/0x3a > [ 125.622074][ C0] nmi_trigger_cpumask_backtrace+0x92/0x9f > [ 125.624154][ C0] arch_trigger_cpumask_backtrace+0x14/0x20 > [ 125.626102][ C0] rcu_dump_cpu_stacks+0xa0/0xd0 > [ 125.627919][ C0] rcu_sched_clock_irq.cold.95+0x121/0x39c > [ 125.629833][ C0] ? acct_account_cputime+0x17/0x20 > [ 125.631534][ C0] ? account_system_index_time+0x8a/0xa0 > [ 125.633422][ C0] update_process_times+0x23/0x60 > [ 125.635070][ C0] tick_sched_handle.isra.22+0x20/0x60 > [ 125.636870][ C0] tick_sched_timer+0x68/0x80 > [ 125.638403][ C0] ? tick_sched_handle.isra.22+0x60/0x60 > [ 125.640588][ C0] __hrtimer_run_queues+0xf9/0x1a0 > [ 125.642591][ C0] hrtimer_interrupt+0xfc/0x210 > [ 125.645033][ C0] __sysvec_apic_timer_interrupt+0x4c/0x60 > [ 125.647292][ C0] asm_call_on_stack+0xf/0x20 > [ 125.649192][ C0] </IRQ> > [ 125.650501][ C0] sysvec_apic_timer_interrupt+0x75/0x80 > [ 125.652900][ C0] asm_sysvec_apic_timer_interrupt+0x12/0x20 > [ 125.655487][ C0] RIP: 0010:iov_iter_copy_from_user_atomic+0x19b/0x350 > [ 125.658124][ C0] Code: 89 45 d0 48 c1 e6 06 48 03 37 4d 8d 3c 09 4c 89 cf e8 d9 e5 ff ff 48 8b 45 d0 45 39 eb 0f 87 35 01 00 00 49 8b 4a 18 4d 89 f9 <45> 29 dd 45 01 d8 75 12 eb 19 41 83 c4 01 41 29 c0 74 10 44 89 e0 > [ 125.666132][ C0] RSP: 0018:ffffa6cdc1237aa8 EFLAGS: 00000246 > [ 125.668557][ C0] RAX: 0000000000000000 RBX: 0000000000001000 RCX: ffff945035a25100 > [ 125.671576][ C0] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff945035a25100 > [ 125.674851][ C0] RBP: ffffa6cdc1237ad8 R08: 0000000000000000 R09: ffff945028a80000 > [ 125.677989][ C0] R10: ffffa6cdc1237de0 R11: 0000000000000000 R12: 0000000000000000 > [ 125.680990][ C0] R13: 0000000000001000 R14: 0000000000001000 R15: 0000000000001000 > [ 125.684006][ C0] iomap_write_actor+0xbe/0x190 > [ 125.685982][ C0] ? iomap_write_begin+0x460/0x460 > [ 125.688031][ C0] iomap_apply+0xf4/0x1a0 > [ 125.689810][ C0] ? iomap_write_begin+0x460/0x460 > [ 125.691826][ C0] iomap_file_buffered_write+0x69/0x90 > [ 125.698598][ C0] ? iomap_write_begin+0x460/0x460 > [ 125.705341][ C0] xfs_file_buffered_aio_write+0xc2/0x2c0 > [ 125.707780][ C0] xfs_file_write_iter+0xa3/0xc0 > [ 125.709802][ C0] do_iter_readv_writev+0x15b/0x1c0 > [ 125.712496][ C0] do_iter_write+0x81/0x190 > [ 125.715245][ C0] vfs_iter_write+0x14/0x20 > [ 125.717221][ C0] iter_file_splice_write+0x288/0x3e0 > [ 125.719340][ C0] do_splice_from+0x1a/0x40 > [ 125.721175][ C0] do_splice+0x2e5/0x620 > [ 125.722950][ C0] __x64_sys_splice+0x97/0x100 > [ 125.724937][ C0] do_syscall_64+0x31/0x40 > [ 125.726766][ C0] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 125.729388][ C0] RIP: 0033:0x7f7515dd91c3 > [ 125.731246][ C0] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 83 3d fd dd 2c 00 00 75 13 49 89 ca b8 13 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 2b d4 00 00 48 89 04 24 > [ 125.749632][ C0] RSP: 002b:00007ffd553cde18 EFLAGS: 00000246 ORIG_RAX: 0000000000000113 > [ 125.753713][ C0] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f7515dd91c3 > [ 125.756867][ C0] RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000004 > [ 125.759872][ C0] RBP: 0000000000000000 R08: 0000000000010000 R09: 0000000000000000 > [ 125.763023][ C0] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004005a6 > [ 125.766225][ C0] R13: 00007ffd553cdf10 R14: 0000000000000000 R15: 0000000000000000 >