On Thu, May 11, 2023 at 10:07:06AM -0600, Alex Williamson wrote:
> On Wed, 10 May 2023 17:41:06 -0300
> Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
>
> > On Mon, May 08, 2023 at 02:57:15PM -0600, Alex Williamson wrote:
> >
> > > We already try to set the flags in advance, but there are some
> > > architectural flags like VM_PAT that make that tricky. Cedric has been
> > > looking at inserting individual pages with vmf_insert_pfn(), but that
> > > incurs a lot more faults and therefore latency vs remapping the entire
> > > vma on fault. I'm not convinced that we shouldn't just attempt to
> > > remove the fault handler entirely, but I haven't tried it yet to know
> > > what gotchas are down that path. Thanks,
> >
> > I thought we did it like this because there were races otherwise with
> > PTE insertion and zapping? I don't remember well anymore.
>
> TBH, I don't recall if we tried a synchronous approach previously. The
> benefit of the faulting approach was that we could track the minimum
> set of vmas which are actually making use of the mapping and throw that
> tracking list away when zapping. Without that, we need to add vmas
> both on mmap and in vm_ops.open, removing only in vm_ops.close, and
> acquire all the proper mm locking for each vma to re-insert the
> mappings.
>
> > I vaugely remember the address_space conversion might help remove the
> > fault handler?
>
> Yes, this did remove the fault handler entirely, it's (obviously)
> dropped off my radar, but perhaps in the interim we could switch to
> vmf_insert_pfn() and revive the address space series to eventually
> remove the fault handling and vma list altogether.
>
> For reference, I think this was the last posting of the address space
> series:
>
> https://lore.kernel.org/all/162818167535.1511194.6614962507750594786.stgit@omen/

Just took a quick look at this series.
One question: it looks like the series still needs to call
io_remap_pfn_range() in places like vfio_basic_config_write() for
PCI_COMMAND, and on device reset, so the mmap write lock would still
be required around vdev->memory_lock.