Hi Matthew/Baolin, On 1/18/2024 8:13 AM, Baolin Wang wrote: > > > On 1/18/2024 9:38 AM, Al Viro wrote: >> On Tue, Jan 16, 2024 at 03:53:35PM +0800, Baolin Wang wrote: >> >>> With checking the 'dentry.parent' and 'dentry.d_name.name' used by >>> dentry_name(), I can see dump_mapping() will output the invalid dentry >>> instead of crashing the system when this issue is reproduced again. >> >>> dentry_ptr = container_of(dentry_first, struct dentry, >>> d_u.d_alias); >>> - if (get_kernel_nofault(dentry, dentry_ptr)) { >>> + if (get_kernel_nofault(dentry, dentry_ptr) || >>> + !dentry.d_parent || !dentry.d_name.name) { >>> pr_warn("aops:%ps ino:%lx invalid dentry:%px\n", >>> a_ops, ino, dentry_ptr); >>> return; >> >> That's nowhere near enough. Your ->d_name.name can bloody well be >> pointing >> to an external name that gets freed right under you. Legitimately so. >> >> Think what happens if dentry has a long name (longer than would fit into >> the embedded array) and gets renamed name just after you copy it into >> a local variable. Old name will get freed. Yes, freeing is RCU-delayed, >> but I don't see anything that would prevent your thread losing CPU >> and not getting it back until after the sucker's been freed. > > Yes, that's possible. And this appears to be a use-after-free issue in > the existing code, which is different from the issue that my patch > addressed. > > So how about adding a rcu_read_lock() before copying the dentry to a > local variable in case the old name is freed? > We too seen the below crash while printing the dentry name. aops:shmem_aops ino:5e029 dentry name:"dev/zero" flags: 0x8000000000080006(referenced|uptodate|swapbacked|zone=2|kasantag=0x0) raw: 8000000000080006 ffffffc033b1bb60 ffffffc033b1bb60 ffffff8862537600 raw: 0000000000000001 0000000000000000 00000003ffffffff ffffff807fe64000 page dumped because: migration failure migrating pfn aef223 failed ret:1 page:000000009e72a120 refcount:3 mapcount:0 mapping:000000003325dda1 index:0x1 pfn:0xaef223 memcg:ffffff807fe64000 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Mem abort info: ESR = 0x0000000096000005 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x05: level 1 translation fault Data abort info: ISV = 0, ISS = 0x00000005 CM = 0, WnR = 0 user pgtable: 4k pages, 39-bit VAs, pgdp=000000090c12d000 [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP dentry_name+0x1f8/0x3a8 pointer+0x3b0/0x6b8 vsnprintf+0x4a4/0x65c vprintk_store+0x168/0x4a8 vprintk_emit+0x98/0x218 vprintk_default+0x44/0x70 vprintk+0xf0/0x138 _printk+0x54/0x80 dump_mapping+0x17c/0x188 dump_page+0x1d0/0x2e8 offline_pages+0x67c/0x898 Not much comfortable with block layer internals, TMK, the below is what happening in the my case: memoffline dput() (offline_pages) (as part of closing of the shmem file) ------------ -------------------------------------- ....... 1) dentry_unlink_inode() hlist_del_init(&dentry->d_u.d_alias); 2) iput(): a) inode->i_state |= I_FREEING ..... b) evict_inode()->..->shmem_undo_range 1) get the folios with elevated refcount 3) do_migrate_range(): a) Because of the elevated refcount in 2.b.1, the migration of this page will be failed. 2) truncate_inode_folio() -> filemap_remove_folio(): (deletes from the page cache, set page->mapping=NULL, decrement the refcount on folio) b) Call dump_page(): 1) mapping = page_mapping(page); 2) dump_mapping(mapping) a) We unlinked the dentry in 1) thus dentry_ptr from host->i_dentry.first is not a proper one. b) dentry name print with %pd is resulting into the mentioned crash. At least in this case, I think __this patchset in its current form can help us__. Thanks, Charan