On Sun, Apr 14, 2024 at 04:08:11PM +0200, Björn Töpel wrote:
> Andreas Dilger <adilger@xxxxxxxxx> writes:
>
> > On Apr 13, 2024, at 8:15 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> On Sat, Apr 13, 2024 at 07:46:03PM -0600, Andreas Dilger wrote:
> >>
> >>> As to whether the 0xfffff000 address itself is valid for riscv32 is
> >>> outside my realm, but given that RAM is cheap it doesn't seem unlikely
> >>> to have 4GB+ of RAM and want to use it all. The riscv32 might consider
> >>> reserving this page address from allocation to avoid similar issues in
> >>> other parts of the code, as is done with the NULL/0 page address.
> >>
> >> Not a chance. *Any* page mapped there is a serious bug on any 32bit
> >> box. Recall what ERR_PTR() is...
> >>
> >> On any architecture the virtual addresses in range (unsigned long)-512..
> >> (unsigned long)-1 must never resolve to valid kernel objects.
> >> In other words, any kind of wraparound here is asking for an oops on
> >> attempts to access the elements of buffer - kernel dereference of
> >> (char *)0xfffff000 on a 32bit box is already a bug.
> >>
> >> It might be getting an invalid pointer, but arithmetical overflows
> >> are irrelevant.
> >
> > The original bug report stated that search_buf = 0xfffff000 on entry,
> > and I'd quoted that at the start of my email:
> >
> > On Apr 12, 2024, at 8:57 AM, Björn Töpel <bjorn@xxxxxxxxxx> wrote:
> >> What I see in ext4_search_dir() is that search_buf is 0xfffff000, and at
> >> some point the address wraps to zero, and boom. I doubt that 0xfffff000
> >> is a sane address.
> >
> > Now that you mention ERR_PTR() it definitely makes sense that this last
> > page HAS to be excluded.
> >
> > So some other bug is passing the bad pointer to this code before this
> > error, or the arch is not correctly excluding this page from allocation.
>
> Yeah, something is off for sure.
>
> (FWIW, I manage to hit this for Linus' master as well.)
>
> I added a print (close to trace_mm_filemap_add_to_page_cache()), and for
> this BT:
>
> [<c01e8b34>] __filemap_add_folio+0x322/0x508
> [<c01e8d6e>] filemap_add_folio+0x54/0xce
> [<c01ea076>] __filemap_get_folio+0x156/0x2aa
> [<c02df346>] __getblk_slow+0xcc/0x302
> [<c02df5f2>] bdev_getblk+0x76/0x7a
> [<c03519da>] ext4_getblk+0xbc/0x2c4
> [<c0351cc2>] ext4_bread_batch+0x56/0x186
> [<c036bcaa>] __ext4_find_entry+0x156/0x578
> [<c036c152>] ext4_lookup+0x86/0x1f4
> [<c02a3252>] __lookup_slow+0x8e/0x142
> [<c02a6d70>] walk_component+0x104/0x174
> [<c02a793c>] path_lookupat+0x78/0x182
> [<c02a8c7c>] filename_lookup+0x96/0x158
> [<c02a8d76>] kern_path+0x38/0x56
> [<c0c1cb7a>] init_mount+0x5c/0xac
> [<c0c2ba4c>] devtmpfs_mount+0x44/0x7a
> [<c0c01cce>] prepare_namespace+0x226/0x27c
> [<c0c011c6>] kernel_init_freeable+0x286/0x2a8
> [<c0b97ab8>] kernel_init+0x2a/0x156
> [<c0ba22ca>] ret_from_fork+0xe/0x20
>
> I get a folio where folio_address(folio) == 0xfffff000 (which is
> broken).
>
> Need to go into the weeds here...

I don't see anything obvious that could explain this right away. Did you
manage to reproduce this on any other architecture and/or filesystem?

Fwiw, iirc there were a bunch of fs/buffer.c changes that came in
through the mm/ layer between v6.7 and v6.8 that might also be
interesting. But really I'm poking in the dark currently.
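
For reference, a rough user-space sketch of the 32-bit arithmetic behind
the ERR_PTR() argument quoted above. It is not kernel code: the sim32_*
names are made up for illustration, 4K pages are assumed, and uint32_t
stands in for a 32-bit unsigned long. The only things taken from the
kernel are the MAX_ERRNO value (4095 in include/linux/err.h, a wider
window than the -512 mentioned above) and the shape of the
IS_ERR_VALUE() comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM32_MAX_ERRNO 4095u	/* same value as the kernel's MAX_ERRNO */
#define SIM32_PAGE_SIZE 4096u	/* assumption: 4K pages */

/* Models ERR_PTR(err) on a 32-bit box: a negative errno seen as an address. */
static uint32_t sim32_err_ptr(int32_t err)
{
	return (uint32_t)err;
}

/* Models IS_ERR_VALUE(): true for the top MAX_ERRNO addresses. */
static bool sim32_is_err(uint32_t addr)
{
	return addr >= (uint32_t)0 - SIM32_MAX_ERRNO;
}

/*
 * True if the last byte of the page starting at page_addr is an ERR_PTR
 * value. page_addr is assumed to be page-aligned, so the addition below
 * cannot wrap.
 */
static bool sim32_page_overlaps_err_range(uint32_t page_addr)
{
	uint32_t last = page_addr + (SIM32_PAGE_SIZE - 1);

	return sim32_is_err(last);
}

/* One-past-the-end address of a buffer, computed modulo 2^32. */
static uint32_t sim32_buf_end(uint32_t buf, uint32_t len)
{
	return buf + len;
}

/* Walks through the cases discussed in the thread. */
void sim32_demo(void)
{
	/* -2 is -ENOENT; it lands inside the error window. */
	assert(sim32_is_err(sim32_err_ptr(-2)));

	/* 0xfffff000 itself is one byte below the window... */
	assert(!sim32_is_err(0xfffff000u));

	/* ...but a page mapped there covers every ERR_PTR value. */
	assert(sim32_page_overlaps_err_range(0xfffff000u));

	/* The page below it does not. */
	assert(!sim32_page_overlaps_err_range(0xffffe000u));

	/* A page-sized buffer at 0xfffff000 ends at address 0. */
	assert(sim32_buf_end(0xfffff000u, SIM32_PAGE_SIZE) == 0u);
}
```

The last assertion is the "wraps to zero" from the original report; the
third one is why a folio at that address should not exist in the first
place.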