> On Jul 18, 2020, at 11:55 AM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: > > > >> On Jul 17, 2020, at 3:46 PM, Pierre Sauter <pierre.sauter@xxxxxxx> wrote: >> >> Am Freitag, 17. Juli 2020, 19:56:09 CEST schrieb Kai-Heng Feng: >>>> Pierre, thanks for confirming! >>>> >>>> Kai-Heng suspected an upstream stable commit that is missing in 5.4.0-40, >>>> but I don't have any good suggestions. >>> >>> Well, Ubuntu's 5.4 kernel is based on upstream stable v5.4, so I asked users to test stable v5.4.51, however the feedback was negative, and that's the reason why I raised the issue here. >>> >>> Anyway, good to know that it's fixed in upstream stable, everything's good now! >>> Thanks for your effort Chuck. >>> >>> Kai-Heng >> >> Sorry to have caused premature happiness. Kai-Hengs last message reminded me >> that I had seen the bug earlier in the week on Ubuntu Mainline v.5.4.51. >> So I decided to rebuild vanilla v5.4.51 with Ubuntus config + KASAN, and voila. >> It seems that their config is just really good in exposing the bug on mount. I >> am off for the weekend, can do more testing next week. >> >> [ 21.664580] ================================================================== >> [ 21.664657] BUG: KASAN: slab-out-of-bounds in _copy_from_pages+0xed/0x210 [sunrpc] >> [ 21.664705] Write of size 64 at addr ffff8883b6b7d444 by task update-desktop-/1345 >> >> [ 21.664764] CPU: 0 PID: 1345 Comm: update-desktop- Not tainted 5.4.51 #1 >> [ 21.664765] Hardware name: XXXXXX >> [ 21.664766] Call Trace: >> [ 21.664771] dump_stack+0x96/0xca >> [ 21.664775] print_address_description.constprop.0+0x20/0x210 >> [ 21.664795] ? _copy_from_pages+0xed/0x210 [sunrpc] >> [ 21.664797] __kasan_report.cold+0x1b/0x41 >> [ 21.664816] ? _copy_from_pages+0xed/0x210 [sunrpc] >> [ 21.664819] kasan_report+0x14/0x20 >> [ 21.664820] check_memory_region+0x129/0x1b0 >> [ 21.664822] memcpy+0x38/0x50 >> [ 21.664840] _copy_from_pages+0xed/0x210 [sunrpc] >> [ 21.664859] xdr_shrink_pagelen+0x1d6/0x440 [sunrpc] >> [ 21.664877] xdr_align_pages+0x15f/0x580 [sunrpc] >> [ 21.664897] ? decode_setattr+0x120/0x120 [nfsv4] >> [ 21.664916] xdr_read_pages+0x44/0x290 [sunrpc] >> [ 21.664933] ? __decode_op_hdr+0x29/0x430 [nfsv4] >> [ 21.664950] nfs4_xdr_dec_readlink+0x238/0x390 [nfsv4] > > READLINK appears to be a common element in these splats. Is there > an especially large symbolic link in your home directory? Knowing > that might help me reproduce the problem here. > > You confirmed the crash does not occur in v5.5.19, but the 5.8-ish > kernel you tested was Ubuntu's. Do you have test results for a > stock upstream v5.8-rc5 kernel? > > Do you know if v5.6.19 has this issue? I have a workload that can reproduce this exact KASAN splat on v5.4.51. Looking into it now. >> [ 21.664966] ? nfs4_xdr_dec_read+0x3c0/0x3c0 [nfsv4] >> [ 21.664969] ? __kasan_slab_free+0x14e/0x180 >> [ 21.664985] ? nfs4_xdr_dec_read+0x3c0/0x3c0 [nfsv4] >> [ 21.665003] rpcauth_unwrap_resp_decode+0xaa/0x100 [sunrpc] >> [ 21.665009] gss_unwrap_resp+0x99d/0x1570 [auth_rpcgss] >> [ 21.665014] ? gss_destroy_cred+0x460/0x460 [auth_rpcgss] >> [ 21.665016] ? finish_task_switch+0x163/0x670 >> [ 21.665019] ? __switch_to_asm+0x34/0x70 >> [ 21.665023] ? gss_wrap_req+0x1700/0x1700 [auth_rpcgss] >> [ 21.665026] ? prepare_to_wait+0xea/0x2b0 >> [ 21.665045] rpcauth_unwrap_resp+0xac/0x100 [sunrpc] >> [ 21.665061] call_decode+0x454/0x7e0 [sunrpc] >> [ 21.665077] ? rpc_decode_header+0x10a0/0x10a0 [sunrpc] >> [ 21.665079] ? var_wake_function+0x140/0x140 >> [ 21.665095] ? call_transmit_status+0x31e/0x5d0 [sunrpc] >> [ 21.665110] ? rpc_decode_header+0x10a0/0x10a0 [sunrpc] >> [ 21.665127] __rpc_execute+0x204/0xbd0 [sunrpc] >> [ 21.665143] ? xprt_wait_for_reply_request_def+0x170/0x170 [sunrpc] >> [ 21.665160] ? rpc_exit+0xc0/0xc0 [sunrpc] >> [ 21.665162] ? __kasan_check_read+0x11/0x20 >> [ 21.665164] ? wake_up_bit+0x42/0x50 >> [ 21.665181] rpc_execute+0x1a0/0x1f0 [sunrpc] >> [ 21.665197] rpc_run_task+0x454/0x5e0 [sunrpc] >> [ 21.665213] nfs4_call_sync_custom+0x12/0x70 [nfsv4] >> [ 21.665229] nfs4_call_sync_sequence+0x143/0x1f0 [nfsv4] >> [ 21.665244] ? nfs4_call_sync_custom+0x70/0x70 [nfsv4] >> [ 21.665247] ? get_page_from_freelist+0x24d0/0x45f0 >> [ 21.665263] _nfs4_proc_readlink+0x1a6/0x250 [nfsv4] >> [ 21.665280] ? _nfs4_proc_getdeviceinfo+0x350/0x350 [nfsv4] >> [ 21.665282] ? release_pages+0x44b/0xca0 >> [ 21.665284] ? __mod_lruvec_state+0x8f/0x320 >> [ 21.665286] ? pagevec_lru_move_fn+0x18d/0x230 >> [ 21.665303] nfs4_proc_readlink+0x101/0x2c0 [nfsv4] >> [ 21.665320] ? nfs4_proc_link+0x1c0/0x1c0 [nfsv4] >> [ 21.665322] ? add_to_page_cache_locked+0x20/0x20 >> [ 21.665339] nfs_symlink_filler+0xdc/0x190 [nfs] >> [ 21.665341] do_read_cache_page+0x60e/0x1490 >> [ 21.665353] ? nfs4_do_lookup_revalidate+0x1a1/0x2d0 [nfs] >> [ 21.665365] ? nfs_get_link+0x370/0x370 [nfs] >> [ 21.665367] ? xas_load+0x23/0x250 >> [ 21.665369] ? pagecache_get_page+0x760/0x760 >> [ 21.665372] ? lockref_get_not_dead+0xe3/0x1c0 >> [ 21.665374] ? __kasan_check_write+0x14/0x20 >> [ 21.665376] ? lockref_get_not_dead+0xe3/0x1c0 >> [ 21.665378] ? __kasan_check_write+0x14/0x20 >> [ 21.665380] ? _raw_spin_lock+0x7b/0xd0 >> [ 21.665382] ? _raw_write_trylock+0x110/0x110 >> [ 21.665384] read_cache_page+0x4c/0x80 >> [ 21.665396] nfs_get_link+0x75/0x370 [nfs] >> [ 21.665399] trailing_symlink+0x6fe/0x810 >> [ 21.665411] ? nfs_destroy_readpagecache+0x20/0x20 [nfs] >> [ 21.665413] path_lookupat.isra.0+0x188/0x7d0 >> [ 21.665416] ? do_syscall_64+0x9f/0x3a0 >> [ 21.665418] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 >> [ 21.665420] ? path_parentat.isra.0+0x110/0x110 >> [ 21.665423] ? stack_trace_save+0x94/0xc0 >> [ 21.665424] ? stack_trace_consume_entry+0x170/0x170 >> [ 21.665427] filename_lookup+0x185/0x3b0 >> [ 21.665429] ? nd_jump_link+0x1d0/0x1d0 >> [ 21.665431] ? kasan_slab_free+0xe/0x10 >> [ 21.665434] ? __kasan_check_read+0x11/0x20 >> [ 21.665436] ? __check_object_size+0x249/0x316 >> [ 21.665438] ? strncpy_from_user+0x80/0x290 >> [ 21.665440] ? kmem_cache_alloc+0x180/0x250 >> [ 21.665442] ? getname_flags+0x100/0x520 >> [ 21.665444] user_path_at_empty+0x3a/0x50 >> [ 21.665447] vfs_statx+0xca/0x150 >> [ 21.665449] ? vfs_statx_fd+0x90/0x90 >> [ 21.665451] ? __kasan_slab_free+0x14e/0x180 >> [ 21.665453] __do_sys_newstat+0x9a/0x100 >> [ 21.665455] ? cp_new_stat+0x5d0/0x5d0 >> [ 21.665457] ? __kasan_check_write+0x14/0x20 >> [ 21.665459] ? _raw_spin_lock_irq+0x82/0xe0 >> [ 21.665461] ? _raw_read_lock_irq+0x50/0x50 >> [ 21.665464] ? __blkcg_punt_bio_submit+0x1c0/0x1c0 >> [ 21.665466] ? __kasan_check_write+0x14/0x20 >> [ 21.665469] ? switch_fpu_return+0x13a/0x2d0 >> [ 21.665471] ? fpregs_mark_activate+0x150/0x150 >> [ 21.665474] __x64_sys_newstat+0x54/0x80 >> [ 21.665476] do_syscall_64+0x9f/0x3a0 >> [ 21.665478] ? prepare_exit_to_usermode+0xee/0x1a0 >> [ 21.665480] entry_SYSCALL_64_after_hwframe+0x44/0xa9 >> [ 21.665482] RIP: 0033:0x7f6e05f5c49a >> [ 21.665485] Code: 00 00 75 05 48 83 c4 18 c3 e8 f2 24 02 00 66 90 f3 0f 1e fa 41 89 f8 48 89 f7 48 89 d6 41 83 f8 01 77 2d b8 04 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 06 c3 0f 1f 44 00 00 48 8b 15 c1 a9 0d 00 f7 >> [ 21.665486] RSP: 002b:00007fff043e5f18 EFLAGS: 00000246 ORIG_RAX: 0000000000000004 >> [ 21.665488] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f6e05f5c49a >> [ 21.665489] RDX: 00007fff043e5f20 RSI: 00007fff043e5f20 RDI: 000055af4c4ea5f0 >> [ 21.665490] RBP: 000055af4c4ea5f0 R08: 0000000000000001 R09: 0000000000000001 >> [ 21.665491] R10: 0000000000000017 R11: 0000000000000246 R12: 000055af4c4eede3 >> [ 21.665492] R13: 000055af4c4e81a0 R14: 000055af4c4e6940 R15: 000055af4c4e6990 >> >> [ 21.665508] Allocated by task 1345: >> [ 21.665532] save_stack+0x23/0x90 >> [ 21.665534] __kasan_kmalloc.constprop.0+0xcf/0xe0 >> [ 21.665536] kasan_slab_alloc+0xe/0x10 >> [ 21.665538] kmem_cache_alloc+0xd7/0x250 >> [ 21.665539] mempool_alloc_slab+0x17/0x20 >> [ 21.665541] mempool_alloc+0x126/0x330 >> [ 21.665558] rpc_malloc+0x1f2/0x270 [sunrpc] >> [ 21.665574] call_allocate+0x3b9/0x9d0 [sunrpc] >> [ 21.665591] __rpc_execute+0x204/0xbd0 [sunrpc] >> [ 21.665607] rpc_execute+0x1a0/0x1f0 [sunrpc] >> [ 21.665623] rpc_run_task+0x454/0x5e0 [sunrpc] >> [ 21.665638] nfs4_call_sync_custom+0x12/0x70 [nfsv4] >> [ 21.665653] nfs4_call_sync_sequence+0x143/0x1f0 [nfsv4] >> [ 21.665668] _nfs4_proc_readlink+0x1a6/0x250 [nfsv4] >> [ 21.665684] nfs4_proc_readlink+0x101/0x2c0 [nfsv4] >> [ 21.665698] nfs_symlink_filler+0xdc/0x190 [nfs] >> [ 21.665699] do_read_cache_page+0x60e/0x1490 >> [ 21.665701] read_cache_page+0x4c/0x80 >> [ 21.665713] nfs_get_link+0x75/0x370 [nfs] >> [ 21.665714] trailing_symlink+0x6fe/0x810 >> [ 21.665716] path_lookupat.isra.0+0x188/0x7d0 >> [ 21.665718] filename_lookup+0x185/0x3b0 >> [ 21.665719] user_path_at_empty+0x3a/0x50 >> [ 21.665721] vfs_statx+0xca/0x150 >> [ 21.665723] __do_sys_newstat+0x9a/0x100 >> [ 21.665725] __x64_sys_newstat+0x54/0x80 >> [ 21.665727] do_syscall_64+0x9f/0x3a0 >> [ 21.665729] entry_SYSCALL_64_after_hwframe+0x44/0xa9 >> >> [ 21.665743] Freed by task 0: >> [ 21.665762] (stack is not available) >> >> [ 21.665798] The buggy address belongs to the object at ffff8883b6b7cc80 >> which belongs to the cache rpc_buffers of size 2048 >> [ 21.665871] The buggy address is located 1988 bytes inside of >> 2048-byte region [ffff8883b6b7cc80, ffff8883b6b7d480) >> [ 21.665939] The buggy address belongs to the page: >> [ 21.665970] page:ffffea000edade00 refcount:1 mapcount:0 mapping:ffff88840afecc00 index:0x0 compound_mapcount: 0 >> [ 21.666029] flags: 0x17ffffc0010200(slab|head) >> [ 21.666059] raw: 0017ffffc0010200 dead000000000100 dead000000000122 ffff88840afecc00 >> [ 21.666107] raw: 0000000000000000 00000000800f000f 00000001ffffffff 0000000000000000 >> [ 21.666152] page dumped because: kasan: bad access detected >> >> [ 21.666197] Memory state around the buggy address: >> [ 21.666228] ffff8883b6b7d380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> [ 21.666272] ffff8883b6b7d400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> [ 21.666315] >ffff8883b6b7d480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >> [ 21.666358] ^ >> [ 21.666379] ffff8883b6b7d500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >> [ 21.666423] ffff8883b6b7d580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >> [ 21.666465] ================================================================== >> [ 21.666509] Disabling lock debugging due to kernel taint >> >> >> > > -- > Chuck Lever -- Chuck Lever