On Mon, Sep 26, 2022 at 12:35:34PM +0200, David Hildenbrand wrote:
> On 23.09.22 02:58, Kirill A . Shutemov wrote:
> > On Mon, Sep 19, 2022 at 11:12:46AM +0200, David Hildenbrand wrote:
> > > > diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
> > > > index 6325d1d0e90f..9d066be3d7e8 100644
> > > > --- a/include/uapi/linux/magic.h
> > > > +++ b/include/uapi/linux/magic.h
> > > > @@ -101,5 +101,6 @@
> > > >  #define DMA_BUF_MAGIC		0x444d4142	/* "DMAB" */
> > > >  #define DEVMEM_MAGIC		0x454d444d	/* "DMEM" */
> > > >  #define SECRETMEM_MAGIC		0x5345434d	/* "SECM" */
> > > > +#define INACCESSIBLE_MAGIC	0x494e4143	/* "INAC" */
> > >
> > >
> > > [...]
> > >
> > > > +
> > > > +int inaccessible_get_pfn(struct file *file, pgoff_t offset, pfn_t *pfn,
> > > > +			 int *order)
> > > > +{
> > > > +	struct inaccessible_data *data = file->f_mapping->private_data;
> > > > +	struct file *memfd = data->memfd;
> > > > +	struct page *page;
> > > > +	int ret;
> > > > +
> > > > +	ret = shmem_getpage(file_inode(memfd), offset, &page, SGP_WRITE);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	*pfn = page_to_pfn_t(page);
> > > > +	*order = thp_order(compound_head(page));
> > > > +	SetPageUptodate(page);
> > > > +	unlock_page(page);
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(inaccessible_get_pfn);
> > > > +
> > > > +void inaccessible_put_pfn(struct file *file, pfn_t pfn)
> > > > +{
> > > > +	struct page *page = pfn_t_to_page(pfn);
> > > > +
> > > > +	if (WARN_ON_ONCE(!page))
> > > > +		return;
> > > > +
> > > > +	put_page(page);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(inaccessible_put_pfn);
> > >
> > > Sorry, I missed your reply regarding get/put interface.
> > >
> > > https://lore.kernel.org/linux-mm/20220810092532.GD862421@xxxxxxxxxxxxxxxxxx/
> > >
> > > "We have a design assumption that somedays this can even support non-page
> > > based backing stores."
> > >
> > > As long as there is no such user in sight (especially how to get the memfd
> > > from even allocating such memory which will require bigger changes), I
> > > prefer to keep it simple here and work on pages/folios. No need to
> > > over-complicate it for now.
> >
> > Sean, Paolo, what is your take on this? Do you have a concrete use case of
> > a pageless backend for the mechanism in sight? Maybe DAX?
>
> The problem I'm having with this is how to actually get such memory into the
> memory backend (that triggers notifiers) and what the semantics are at all
> with memory that is not managed by the buddy.
>
> memfd with fixed PFNs doesn't make too much sense.

What do you mean by "fixed PFN"? It is as fixed as struct page/folio, no?
PFN covers more possible backends.

> When using DAX, what happens with the shared <-> private conversion? Which
> "type" is supposed to use dax, which not?
>
> In other words, I'm missing too many details on the bigger picture of how
> this would work at all to see why it makes sense right now to prepare for
> that.

IIUC, KVM doesn't really care about pages or folios. It needs the PFN to
populate the SEPT. Returning a page/folio would make KVM do additional steps
to extract the PFN and add one more place to have a bug.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov