On Fri, Mar 22, 2024 at 11:01:25AM +1100, Alistair Popple wrote: > > Dan Williams <dan.j.williams@xxxxxxxxx> writes: > > > Alistair Popple wrote: > >> > >> Alistair Popple <apopple@xxxxxxxxxx> writes: > >> > >> > Dan Williams <dan.j.williams@xxxxxxxxx> writes: > >> > > >> >> Alistair Popple wrote: > >> > > >> > I also noticed folio_anon() is not safe to call on a FS DAX page due to > >> > sharing PAGE_MAPPING_DAX_SHARED. > >> > >> Also it feels like I could be missing something here. AFAICT the > >> page->mapping and page->index fields can't actually be used outside of > >> fs/dax because they are overloaded for the shared case. Therefore > >> setting/clearing them could be skipped and the only reason for doing so > >> is so dax_associate_entry()/dax_disassociate_entry() can generate > >> warnings which should never occur anyway. So all that code is > >> functionally unnecessary. > > > > What do you mean outside of fs/dax, do you literally mean outside of > > fs/dax.c, or the devdax case (i.e. dax without fs-entanglements)? > > Only the cases fs dax pages might need it. ie. Not devdax which I > haven't looked at closely yet. > > > Memory > > failure needs ->mapping and ->index to rmap dax pages. See > > mm/memory-failure.c::__add_to_kill() and > > mm/memory-failure.c::__add_to_kill_fsdax() where that latter one is for > > cases where the fs needs has signed up to react to dax page failure. > > How does that work for reflink/shared pages which overwrite > page->mapping and page->index? Via reverse mapping in the *filesystem*, not the mm rmap stuff. pmem_pagemap_memory_failure() dax_holder_notify_failure() .notify_failure() xfs_dax_notify_failure() xfs_dax_notify_ddev_failure() xfs_rmap_query_range(xfs_dax_failure_fn) xfs_dax_failure_fn(rmap record) <grabs inode from cache> <converts range to file offset> mf_dax_kill_procs(inode->mapping, pgoff) collect_procs_fsdax(mapping, page) add_to_kill_fsdax(task) __add_to_kill(task) unmap_and_kill_tasks() Remember: in FSDAX, the pages are the storage media physically owned by the filesystem, not the mm subsystem. Hence answering questions like "who owns this page" can only be answered correctly by asking the filesystem. We shortcut that for pages that only have one owner - we just store the owner information in the page as a {mapping, offset} tuple. But when we have multiple owners, the only way to find all the {mapping, offset} tuples is to ask the filesystem to find all the owners of that page. Hence the special case values for page->mapping/page->index for pages over shared filesystem extents. These shared extents are communicated to the fsdax layer via the IOMAP_F_SHARED flag in the iomaps returned by the filesystem. This flag is the trigger for the special mapping share count behaviour to be used. e.g. see dax_insert_entry(iomap_iter) -> dax_associate_entry(shared) -> dax_page_share_get().... > Eg. in __add_to_kill() if *p is a shared fs > dax page then p->mapping == PAGE_MAPPING_DAX_SHARED and > page_address_in_vma(vma, p) will probably crash. As per above, we don't get the mapping from the page in those cases. If you haven't got access to the page though a filesystem method and guaranteed that truncate() can't free it from under you, then you're probably doing the wrong thing with fsdax... -Dave. -- Dave Chinner david@xxxxxxxxxxxxx