Re: [PATCH v4 00/10] tracing: introducing eventfs

Steven Rostedt <rostedt@xxxxxxxxxxx> · Wed, 19 Jul 2023 14:40:46 -0400

On Wed, 19 Jul 2023 18:37:12 +0000
Ajay Kaher <akaher@xxxxxxxxxx> wrote:

> > Here's the reproducer (of both v3 splat and the bug I'm hitting now).
> > 
> > ~# echo 'p:sock_getattr 0xffffffff9b55cef0 sk=%di' > /sys/kernel/tracing/kprobe_events
> > ~# ls /sys/kernel/debug/tracing/events/kprobes/sock_getattr/
> > ~# echo '-:sock_getattr 0xffffffff9b55cef0 sk=%di' > /sys/kernel/tracing/kprobe_events  
> 
> I tried above steps on v4 but couldn’t reproduce:
> 
> root@photon-6 [ ~/sdb/linux ]# echo 'p:sock_getattr 0xffffffff9b55cef0 sk=%di' > /sys/kernel/tracing/kprobe_events
> root@photon-6 [ ~/sdb/linux ]# ls /sys/kernel/debug/tracing/events/kprobes/sock_getattr/
> enable  filter  format  id  trigger
> root@photon-6 [ ~/sdb/linux ]# echo '-:sock_getattr 0xffffffff9b55cef0 sk=%di' > /sys/kernel/tracing/kprobe_events
> -bash: echo: write error: No such file or directory
> 
> I have doubt on call_srcu(), it may first end the grace period for parent then for child. If this is true then free_list
> will have unordered list and could cause problem.

I modified the srcu portion a bit. Will post soon. I think I got something
working.

I'm having doubt that the dput()s were needed in the eventfs_remove_rec(),
as the d_invalidate() appears to be enough. I'm still testing.

>  
> 
> > 
> > v3 gives me (and my updates too)
> > 
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 6.5.0-rc1-test+ #576 Not tainted
> > ------------------------------------------------------
> > trace-cmd/840 is trying to acquire lock:
> > ffff8881007e5de0 (&sb->s_type->i_mutex_key#5){++++}-{3:3}, at: dcache_dir_open_wrapper+0xc1/0x1b0
> > 
> > but task is already holding lock:
> > ffff888103ad7e70 (eventfs_rwsem/1){.+.+}-{3:3}, at: dcache_dir_open_wrapper+0x6f/0x1b0
> > 
> > which lock already depends on the new lock.
> > 
> > 
> > the existing dependency chain (in reverse order) is:
> >   
> > -> #1 (eventfs_rwsem/1){.+.+}-{3:3}:  
> >        down_read_nested+0x41/0x180
> >        eventfs_root_lookup+0x42/0x120
> >        __lookup_slow+0xff/0x1b0
> >        walk_component+0xdb/0x150
> >        path_lookupat+0x67/0x1a0
> >        filename_lookup+0xe4/0x1f0
> >        vfs_statx+0x9e/0x180
> >        vfs_fstatat+0x51/0x70
> >        __do_sys_newfstatat+0x3f/0x80
> >        do_syscall_64+0x3a/0xc0
> >        entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> >   
> > -> #0 (&sb->s_type->i_mutex_key#5){++++}-{3:3}:  
> >        __lock_acquire+0x165d/0x2390
> >        lock_acquire+0xd4/0x2d0
> >        down_write+0x3b/0xd0
> >        dcache_dir_open_wrapper+0xc1/0x1b0
> >        do_dentry_open+0x20c/0x510
> >        path_openat+0x7ad/0xc60
> >        do_filp_open+0xaf/0x160
> >        do_sys_openat2+0xab/0xe0
> >        __x64_sys_openat+0x6a/0xa0
> >        do_syscall_64+0x3a/0xc0
> >        entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> > 
> > other info that might help us debug this:
> > 
> >  Possible unsafe locking scenario:
> > 
> >        CPU0                    CPU1
> >        ----                    ----
> >   rlock(eventfs_rwsem/1);
> >                                lock(&sb->s_type->i_mutex_key#5);
> >                                lock(eventfs_rwsem/1);
> >   lock(&sb->s_type->i_mutex_key#5);
> > 
> >  *** DEADLOCK ***
> > 
> > 1 lock held by trace-cmd/840:
> >  #0: ffff888103ad7e70 (eventfs_rwsem/1){.+.+}-{3:3}, at: dcache_dir_open_wrapper+0x6f/0x1b0
> > 
> > stack backtrace:
> > CPU: 7 PID: 840 Comm: trace-cmd Not tainted 6.5.0-rc1-test+ #576
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > Call Trace:
> >  <TASK>
> >  dump_stack_lvl+0x57/0x90
> >  check_noncircular+0x14b/0x160
> >  __lock_acquire+0x165d/0x2390
> >  lock_acquire+0xd4/0x2d0
> >  ? dcache_dir_open_wrapper+0xc1/0x1b0
> >  down_write+0x3b/0xd0
> >  ? dcache_dir_open_wrapper+0xc1/0x1b0
> >  dcache_dir_open_wrapper+0xc1/0x1b0
> >  ? __pfx_dcache_dir_open_wrapper+0x10/0x10
> >  do_dentry_open+0x20c/0x510
> >  path_openat+0x7ad/0xc60
> >  do_filp_open+0xaf/0x160
> >  do_sys_openat2+0xab/0xe0
> >  __x64_sys_openat+0x6a/0xa0
> >  do_syscall_64+0x3a/0xc0
> >  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> > RIP: 0033:0x7f1743267e41
> > Code: 44 24 18 31 c0 41 83 e2 40 75 3e 89 f0 25 00 00 41 00 3d 00 00 41 00 74 30 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 77 3f 48 8b 54 24 18 64 48 2b 14 25 28 00 00 00
> > RSP: 002b:00007ffec10ff5d0 EFLAGS: 00000287 ORIG_RAX: 0000000000000101
> > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1743267e41
> > RDX: 0000000000090800 RSI: 00007ffec10ffdb0 RDI: 00000000ffffff9c
> > RBP: 00007ffec10ffda0 R08: 00007ffec11003e0 R09: 0000000000000040
> > R10: 0000000000000000 R11: 0000000000000287 R12: 00007ffec11003e0
> > R13: 0000000000000040 R14: 0000000000000000 R15: 00007ffec110034b
> >  </TASK>
> >   
> 
> This is expected from v3 (just ignore as of now), if eventfs_set_ef_status_free crash not
> reproduced on v3 then it’s v4 issue.

The issue comes from fixing the above ;-)

-- Steve