On 2. Oct 2024, at 05:42, Nathan Chancellor <nathan@xxxxxxxxxx> wrote: > On Thu, Sep 26, 2024 at 02:21:42PM +0200, Thorsten Blum wrote: >> On 26. Sep 2024, at 03:46, kernel test robot <oliver.sang@xxxxxxxxx> wrote: >>> >>> Hello, >>> >>> kernel test robot noticed "WARNING:at_lib/string_helpers.c:#__fortify_report" on: >>> >>> commit: 3d2d832826325210abb9849ee96634bf5a197517 ("[PATCH] acl: Annotate struct posix_acl with __counted_by()") >>> url: https://github.com/intel-lab-lkp/linux/commits/Thorsten-Blum/acl-Annotate-struct-posix_acl-with-__counted_by/20240924-054002 >>> base: https://git.kernel.org/cgit/linux/kernel/git/vfs/vfs.git vfs.all >>> patch link: https://lore.kernel.org/all/20240923213809.235128-2-thorsten.blum@xxxxxxxxx/ >>> patch subject: [PATCH] acl: Annotate struct posix_acl with __counted_by() >>> >>> in testcase: boot >>> >>> compiler: clang-18 >>> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G >>> >>> (please refer to attached dmesg/kmsg for entire log/backtrace) >>> >>> >>> +---------------------------------------------------+------------+------------+ >>> | | c72f8f06a2 | 3d2d832826 | >>> +---------------------------------------------------+------------+------------+ >>> | boot_successes | 18 | 0 | >>> | boot_failures | 0 | 18 | >>> | WARNING:at_lib/string_helpers.c:#__fortify_report | 0 | 18 | >>> | RIP:__fortify_report | 0 | 18 | >>> | kernel_BUG_at_lib/string_helpers.c | 0 | 18 | >>> | Oops:invalid_opcode:#[##]KASAN | 0 | 18 | >>> | RIP:__fortify_panic | 0 | 18 | >>> | Kernel_panic-not_syncing:Fatal_exception | 0 | 18 | >>> +---------------------------------------------------+------------+------------+ >>> >>> >>> If you fix the issue in a separate patch/commit (i.e. not just a new version of >>> the same patch/commit), kindly add following tags >>> | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx> >>> | Closes: https://lore.kernel.org/oe-lkp/202409260949.a1254989-oliver.sang@xxxxxxxxx >>> >>> >>> [ 25.825642][ T121] ------------[ cut here ]------------ >>> [ 25.826209][ T121] kmemdup: detected buffer overflow: 72 byte read of buffer size 68 >>> [ 25.826866][ T121] WARNING: CPU: 0 PID: 121 at lib/string_helpers.c:1030 __fortify_report (lib/string_helpers.c:1029) >>> [ 25.827588][ T121] Modules linked in: >>> [ 25.827895][ T121] CPU: 0 UID: 0 PID: 121 Comm: systemd-tmpfile Not tainted 6.11.0-rc6-00294-g3d2d83282632 #1 >>> [ 25.828870][ T121] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 >>> [ 25.829661][ T121] RIP: 0010:__fortify_report (lib/string_helpers.c:1029) >>> [ 25.830093][ T121] Code: f6 c5 01 49 8b 37 48 c7 c0 00 8a 56 86 48 c7 c1 20 8a 56 86 48 0f 44 c8 48 c7 c7 80 87 56 86 4c 89 f2 49 89 d8 e8 d1 37 dd fe <0f> 0b 5b 41 5e 41 5f 5d 31 c0 31 c9 31 ff 31 d2 31 f6 45 31 c0 c3 >>> All code >>> ======== >>> 0: f6 c5 01 test $0x1,%ch >>> 3: 49 8b 37 mov (%r15),%rsi >>> 6: 48 c7 c0 00 8a 56 86 mov $0xffffffff86568a00,%rax >>> d: 48 c7 c1 20 8a 56 86 mov $0xffffffff86568a20,%rcx >>> 14: 48 0f 44 c8 cmove %rax,%rcx >>> 18: 48 c7 c7 80 87 56 86 mov $0xffffffff86568780,%rdi >>> 1f: 4c 89 f2 mov %r14,%rdx >>> 22: 49 89 d8 mov %rbx,%r8 >>> 25: e8 d1 37 dd fe call 0xfffffffffedd37fb >>> 2a:* 0f 0b ud2 <-- trapping instruction >>> 2c: 5b pop %rbx >>> 2d: 41 5e pop %r14 >>> 2f: 41 5f pop %r15 >>> 31: 5d pop %rbp >>> 32: 31 c0 xor %eax,%eax >>> 34: 31 c9 xor %ecx,%ecx >>> 36: 31 ff xor %edi,%edi >>> 38: 31 d2 xor %edx,%edx >>> 3a: 31 f6 xor %esi,%esi >>> 3c: 45 31 c0 xor %r8d,%r8d >>> 3f: c3 ret >>> >>> Code starting with the faulting instruction >>> =========================================== >>> 0: 0f 0b ud2 >>> 2: 5b pop %rbx >>> 3: 41 5e pop %r14 >>> 5: 41 5f pop %r15 >>> 7: 5d pop %rbp >>> 8: 31 c0 xor %eax,%eax >>> a: 31 c9 xor %ecx,%ecx >>> c: 31 ff xor %edi,%edi >>> e: 31 d2 xor %edx,%edx >>> 10: 31 f6 xor %esi,%esi >>> 12: 45 31 c0 xor %r8d,%r8d >>> 15: c3 ret >>> [ 25.831566][ T121] RSP: 0018:ffffc90001e6f8a0 EFLAGS: 00010246 >>> [ 25.832052][ T121] RAX: 0000000000000000 RBX: 0000000000000044 RCX: 0000000000000000 >>> [ 25.832705][ T121] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 >>> [ 25.833348][ T121] RBP: 000000000000001c R08: 0000000000000000 R09: 0000000000000000 >>> [ 25.833964][ T121] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff920003cdf27 >>> [ 25.834570][ T121] R13: dffffc0000000000 R14: 0000000000000048 R15: ffffffff86568730 >>> [ 25.835152][ T121] FS: 00007f994a290440(0000) GS:ffffffff87eba000(0000) knlGS:0000000000000000 >>> [ 25.835834][ T121] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> [ 25.836385][ T121] CR2: 0000561875daa188 CR3: 0000000160dd0000 CR4: 00000000000406f0 >>> [ 25.837005][ T121] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> [ 25.837609][ T121] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >>> [ 25.838212][ T121] Call Trace: >>> [ 25.838482][ T121] <TASK> >>> [ 25.838717][ T121] ? __warn (kernel/panic.c:242) >>> [ 25.839053][ T121] ? __fortify_report (lib/string_helpers.c:1029) >>> [ 25.839436][ T121] ? __fortify_report (lib/string_helpers.c:1029) >>> [ 25.839816][ T121] ? report_bug (lib/bug.c:?) >>> [ 25.840206][ T121] ? handle_bug (arch/x86/kernel/traps.c:239) >>> [ 25.840551][ T121] ? exc_invalid_op (arch/x86/kernel/traps.c:260) >>> [ 25.840932][ T121] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621) >>> [ 25.841336][ T121] ? __fortify_report (lib/string_helpers.c:1029) >>> [ 25.841732][ T121] __fortify_panic (lib/string_helpers.c:1037) >>> [ 25.842094][ T121] __posix_acl_chmod (include/linux/fortify-string.h:751) >>> [ 25.842493][ T121] ? make_vfsgid (fs/mnt_idmapping.c:122) >>> [ 25.842837][ T121] ? capable_wrt_inode_uidgid (include/linux/mnt_idmapping.h:193 kernel/capability.c:480 kernel/capability.c:499) >>> [ 25.843285][ T121] posix_acl_chmod (fs/posix_acl.c:624) >>> [ 25.843671][ T121] shmem_setattr (mm/shmem.c:1240) >>> [ 25.844039][ T121] ? ns_to_timespec64 (include/linux/math64.h:29 kernel/time/time.c:529) >>> [ 25.844456][ T121] ? inode_set_ctime_current (fs/inode.c:2331) >>> [ 25.844894][ T121] ? shmem_xattr_handler_set (mm/shmem.c:1173) >>> [ 25.845322][ T121] ? rcu_read_lock_any_held (kernel/rcu/update.c:388) >>> [ 25.845736][ T121] ? security_inode_setattr (security/security.c:?) >>> [ 25.846157][ T121] ? shmem_xattr_handler_set (mm/shmem.c:1173) >>> [ 25.846582][ T121] notify_change (fs/attr.c:?) >>> [ 25.846952][ T121] chmod_common (fs/open.c:653) >>> [ 25.847315][ T121] ? __ia32_sys_chroot (fs/open.c:637) >>> [ 25.847709][ T121] ? user_path_at (fs/namei.c:3019) >>> [ 25.848068][ T121] ? kmem_cache_free (mm/slub.c:4474) >>> [ 25.848489][ T121] do_fchmodat (fs/open.c:701) >>> [ 25.848842][ T121] ? do_faccessat (fs/open.c:686) >>> [ 25.849210][ T121] ? print_irqtrace_events (kernel/locking/lockdep.c:4311) >>> [ 25.849640][ T121] __x64_sys_chmod (fs/open.c:725 fs/open.c:723 fs/open.c:723) >>> [ 25.850010][ T121] do_syscall_64 (arch/x86/entry/common.c:?) >>> [ 25.850371][ T121] ? tracer_hardirqs_on (kernel/trace/trace_irqsoff.c:?) >>> [ 25.850790][ T121] ? tracer_hardirqs_off (kernel/trace/trace_irqsoff.c:?) >>> [ 25.851208][ T121] ? tracer_hardirqs_on (kernel/trace/trace_irqsoff.c:?) >>> [ 25.851621][ T121] ? tracer_hardirqs_on (kernel/trace/trace_irqsoff.c:619) >>> [ 25.852040][ T121] ? print_irqtrace_events (kernel/locking/lockdep.c:4311) >>> [ 25.852522][ T121] ? do_syscall_64 (arch/x86/entry/common.c:102) >>> [ 25.852917][ T121] ? do_syscall_64 (arch/x86/entry/common.c:102) >>> [ 25.853285][ T121] ? do_syscall_64 (arch/x86/entry/common.c:102) >>> [ 25.853667][ T121] ? do_syscall_64 (arch/x86/entry/common.c:102) >>> [ 25.854053][ T121] ? exc_page_fault (arch/x86/mm/fault.c:1543) >>> [ 25.854445][ T121] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) >>> [ 25.854921][ T121] RIP: 0033:0x7f994adcdc47 >>> [ 25.855282][ T121] Code: eb d9 e8 9c 04 02 00 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 5f 00 00 00 0f 05 c3 0f 1f 84 00 00 00 00 00 b8 5a 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 89 a1 0d 00 f7 d8 64 89 01 48 >>> All code >>> ======== >>> 0: eb d9 jmp 0xffffffffffffffdb >>> 2: e8 9c 04 02 00 call 0x204a3 >>> 7: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1) >>> e: 00 00 00 >>> 11: 66 90 xchg %ax,%ax >>> 13: b8 5f 00 00 00 mov $0x5f,%eax >>> 18: 0f 05 syscall >>> 1a: c3 ret >>> 1b: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1) >>> 22: 00 >>> 23: b8 5a 00 00 00 mov $0x5a,%eax >>> 28: 0f 05 syscall >>> 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction >>> 30: 73 01 jae 0x33 >>> 32: c3 ret >>> 33: 48 8b 0d 89 a1 0d 00 mov 0xda189(%rip),%rcx # 0xda1c3 >>> 3a: f7 d8 neg %eax >>> 3c: 64 89 01 mov %eax,%fs:(%rcx) >>> 3f: 48 rex.W >>> >>> Code starting with the faulting instruction >>> =========================================== >>> 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax >>> 6: 73 01 jae 0x9 >>> 8: c3 ret >>> 9: 48 8b 0d 89 a1 0d 00 mov 0xda189(%rip),%rcx # 0xda199 >>> 10: f7 d8 neg %eax >>> 12: 64 89 01 mov %eax,%fs:(%rcx) >>> 15: 48 rex.W >>> >>> >>> The kernel config and materials to reproduce are available at: >>> https://download.01.org/0day-ci/archive/20240926/202409260949.a1254989-oliver.sang@xxxxxxxxx >> >> This seems to be related to [1] and __builtin_dynamic_object_size(). >> >> Compared to GCC, Clang's __bdos() is off by 4 bytes. >> >> I made a small PoC: https://godbolt.org/z/vvK9PE1Yq >> >> Thanks, >> Thorsten >> >> [1] https://lore.kernel.org/all/20240913164630.GA4091534@thelio-3990X/ >> Cc: Bill Wendling > > This should probably be reverted or dropped like that change was because > this breaks boot for me on when UBSAN_BOUNDS is enabled (like it is for > Fedora's configuration now, which I use for some machines). > > It looks like Bill is working on a fix: > > https://github.com/llvm/llvm-project/pull/110487 > > I see a bit of discussion going on there so I have not tried to verify > if that fix addresses both reported issues. I wonder if we should stop > trying to add __counted_by() annotations until this issue is fully > addressed, backported, and accounted for in the kernel sources? It seems > like this is relatively easy to trip up (at least from my perspective, > since I have hit this twice in the past couple of months) and only > released versions of clang have __counted_by(), so those users are more > likely to get hit with this issue. I do not think too many people test > with tip of tree GCC (or maybe they do but not with UBSAN_BOUNDS > enabled). > > Cheers, > Nathan There appear to be two separate Clang __bdos() issues that the __counted_by() annotations can run into. This one https://github.com/llvm/llvm-project/pull/110487 has been fixed and merged into main. The other issue is here: https://github.com/llvm/llvm-project/issues/111009 but the fix (https://github.com/llvm/llvm-project/pull/111015) is still being discussed. This is the one that Nathan ran into with this posix_acl patch and the __counted_by() annotation. A simple workaround is to reorder the struct's members such that Clang's __bdos() returns the same size as GCC's __bdos(). I just submitted a patch here: https://lore.kernel.org/lkml/20241004103401.38579-2-thorsten.blum@xxxxxxxxx/