Re: locks_remove_file() -> flock_lock_inode() sleeps in invalid context, false positive due to NULL dereference ?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 7, 2018 at 7:22 PM, Rafael David Tinoco
<rafael.tinoco@xxxxxxxxxx> wrote:
> During our functional tests, we have ran into the following issue for i686:
>
> [   19.301067] BUG: unable to handle kernel NULL pointer dereference at 0000028a
>
> which might have triggered false positive for:
>
> [   48.544348] BUG: sleeping function called from invalid context at
> /srv/oe/build/tmp-rpb-glibc/work-shared/intel-core2-32/kernel-source/include/linux/percpu-rwsem.h:34
>
> For latest 4.20.0-rc1-next-20181107, when running fsync01 LTP test in
> this kernel.
>
> ----
>
> [   19.120300] BUG: unable to handle kernel NULL pointer dereference at 0000028a
> [   19.127433] *pde = 00000000
> [   19.130311] Oops: 0000 [#1] SMP
> [   19.133449] CPU: 2 PID: 2557 Comm: ata_id Not tainted
> 4.20.0-rc1-next-20181107 #1
> [   19.140920] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> 2.0b 07/27/2017
> [   19.148391] EIP: locks_remove_flock+0x60/0xb0
> [   19.152749] Code: 10 e8 b4 d3 ff ff 8b 43 14 83 4d 80 40 8b 70 58
> 85 f6 74 46 8d 8d 58 ff ff ff ba 07 00 00 00 89 d8 ff d6 8b 45 d8 85
> c0 74 0f <8b> 50 04 85 d2 74 08 8d 85 58 ff ff ff ff d2 8b 45 f0 65 33
> 05 14
> [   19.171484] EAX: 00000286 EBX: f501c140 ECX: 00000000 EDX: ffffffff
> [   19.177742] ESI: 00000000 EDI: f53978c8 EBP: f37dff14 ESP: f37dfe6c
> [   19.183999] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010202
> [   19.190776] CR0: 80050033 CR2: 0000028a CR3: 343f9000 CR4: 003406d0
> [   19.197035] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [   19.203299] DR6: fffe0ff0 DR7: 00000400
> [   19.207152] Call Trace:
> [   19.209604]  ? lockdep_hardirqs_on+0xe4/0x190
> [   19.213961]  ? ktime_get_coarse_real_ts64+0x4d/0xd0
> [   19.218838]  ? trace_hardirqs_on+0x43/0xe0
> [   19.222930]  ? __audit_syscall_entry+0xb2/0xf0
> [   19.227375]  locks_remove_file+0x3e/0x1e0
> [   19.231380]  __fput+0xc2/0x1f0
> [   19.234440]  ____fput+0xd/0x10
> [   19.237499]  task_work_run+0x7f/0xb0
> [   19.241078]  exit_to_usermode_loop+0x9d/0xa0
> [   19.245349]  do_fast_syscall_32+0x257/0x2a0
> [   19.249527]  entry_SYSENTER_32+0x70/0xc8
> [   19.253444] EIP: 0xb7ee9991
> [   19.256234] Code: 8b 98 60 cd ff ff 85 d2 89 c8 74 02 89 0a 5b 5d
> c3 8b 04 24 c3 8b 14 24 c3 8b 1c 24 c3 8b 3c 24 c3 51 52 55 89 e5 0f
> 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90
> 8d 76
> [   19.274971] EAX: 00000000 EBX: 00000004 ECX: 00000043 EDX: b7e9a894
> [   19.281228] ESI: b7eb8000 EDI: 00000016 EBP: b7ccc6b0 ESP: bf9c7830
> [   19.287485] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
> [   19.294263] Modules linked in: fuse
> [   19.297754] CR2: 000000000000028a
> [   19.301066] ---[ end trace 30875d5239ac018c ]---
>
> And, with EIP: locks_remove_flock+0x60/0xb0, we have:
>
> c12777c0 <locks_remove_flock>:
> c12777c0:       e8 eb 26 dd ff          call   c1049eb0 <__fentry__>
> c12777c5:       55                      push   %ebp
> c12777c6:       83 c2 20                add    $0x20,%edx
> c12777c9:       89 e5                   mov    %esp,%ebp
> c12777cb:       57                      push   %edi
> c12777cc:       56                      push   %esi
> c12777cd:       53                      push   %ebx
> c12777ce:       89 c3                   mov    %eax,%ebx
> c12777d0:       81 ec 9c 00 00 00       sub    $0x9c,%esp
> c12777d6:       65 a1 14 00 00 00       mov    %gs:0x14,%eax
> c12777dc:       89 45 f0                mov    %eax,-0x10(%ebp)
> c12777df:       31 c0                   xor    %eax,%eax
> c12777e1:       8b 02                   mov    (%edx),%eax
> c12777e3:       39 c2                   cmp    %eax,%edx
> c12777e5:       74 48                   je     c127782f
> <locks_remove_flock+0x6f>
> c12777e7:       8d 8d 58 ff ff ff       lea    -0xa8(%ebp),%ecx
> c12777ed:       ba 08 00 00 00          mov    $0x8,%edx
> c12777f2:       89 d8                   mov    %ebx,%eax
> c12777f4:       8b 7b 10                mov    0x10(%ebx),%edi
> c12777f7:       e8 b4 d3 ff ff          call   c1274bb0 <flock_make_lock>
> c12777fc:       8b 43 14                mov    0x14(%ebx),%eax
> -------------- EIP
> c12777ff:       83 4d 80 40             orl    $0x40,-0x80(%ebp)
> c1277803:       8b 70 58                mov    0x58(%eax),%esi
> c1277806:       85 f6                   test   %esi,%esi
>
> ----
>
> Pointing to:
>
> locks_remove_flock(struct file *filp, struct file_lock_context *flctx)
> {
>     struct file_lock fl;
> ...
>     flock_make_lock(filp, LOCK_UN, &fl);
> ...
> }
>
> Since flock_make_lock() calls locks_alloc_lock() and kmem_cache_zalloc()...
>
> shouldn't "fl" be declared as *fl pointer ?

NM for this one, just saw flock_make_lock() can return a ptr to struct
file_lock *, after creating it from slab, or just populate a stack
variable, like it is doing here.

For:

...
flock_make_lock(filp, LOCK_UN, &fl);
fl.fl_flags |= FL_CLOSE;
...

I wonder if, for x86, we are just missing an initialization:

memset(&fl, 0, sizeof(struct file_lock));

in the beginning of locks_remove_flock().

If not, then could filp->f_op->flock() (NFSv4 for this test) be
playing tricks with a "fl" coming from the stack ?

> ----
>
> Link: https://bugs.linaro.org/show_bug.cgi?id=4056
> Full dmesg: https://lkft.validation.linaro.org/scheduler/job/497741#L964



[Index of Archives]     [Linux Kernel]     [Linux USB Development]     [Yosemite News]     [Linux SCSI]

  Powered by Linux