On Tue, May 26, 2020 at 10:32:48AM -0700, Ralph Campbell wrote:
>
> On 5/25/20 6:41 AM, Jason Gunthorpe wrote:
> > On Fri, May 08, 2020 at 12:20:03PM -0700, Ralph Campbell wrote:
> > > hmm_range_fault() returns an array of page frame numbers and flags for
> > > how the pages are mapped in the requested process' page tables. The PFN
> > > can be used to get the struct page with hmm_pfn_to_page() and the page size
> > > order can be determined with compound_order(page) but if the page is larger
> > > than order 0 (PAGE_SIZE), there is no indication that the page is mapped
> > > using a larger page size. To be fully general, hmm_range_fault() would need
> > > to return the mapping size to handle cases like a 1GB compound page being
> > > mapped with 2MB PMD entries. However, the most common case is the mapping
> > > size is the same as the underlying compound page size.
> > > This series adds a new output flag to indicate this so that callers know it
> > > is safe to use a large device page table mapping if one is available.
> > > Nouveau and the HMM tests are updated to use the new flag.
> > >
> > > Note that this series depends on a patch queued in Ben Skeggs' nouveau
> > > tree ("nouveau/hmm: map pages after migration") and the patches queued
> > > in Jason's HMM tree.
> > > There is also a patch outstanding ("nouveau/hmm: fix nouveau_dmem_chunk
> > > allocations") that is independent of the above and could be applied
> > > before or after.
> >
> > Did Christoph and Matt's remarks get addressed here?
>
> Both questioned the need to add the HMM_PFN_COMPOUND flag to the
> hmm_range_fault() output array saying that the PFN can be used to get the
> struct page pointer and the page can be examined to determine the page size.
> My response is that while that is true, it is also important that the device only
> access the same parts of a large page that the process/cpu has access to.
> There are places where a large page is mapped with smaller page table entries
> when a page is shared by multiple processes.
> After I explained this, I haven't seen any further comments from Christoph
> and Matt. I'm still looking for reviews, acks, or suggested changes.

Okay, well, we reached the merge window, so since there may be some
conflicts, repost again in three weeks.

It would be more compelling if there was some performance data showing
whether it is much of a win vs the 'compute large page' algorithm that
something like ODP uses.

Jason
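
For readers unfamiliar with the alternative mentioned above, here is a minimal user-space sketch of one way a 'compute large page' pass could look. It is not taken from ODP, nouveau, or any kernel code. The struct toy_entry layout, the flag values, and toy_max_order() are all hypothetical names invented for illustration, and the real hmm_range_fault() output encoding is different. The sketch scans per-page entries and reports the largest power-of-two run that is physically contiguous, aligned, and carries identical flags. Comparing flags is only a stand-in for the concern raised in the thread that the device should not get wider access than the CPU mapping grants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-page entry: a toy stand-in, not the kernel's encoding. */
struct toy_entry {
    uint64_t pfn;    /* physical page frame number */
    unsigned flags;  /* toy permission bits (values are arbitrary here) */
};

/*
 * Return the largest order (log2 of the page count), capped at max_order,
 * such that the entries starting at index `start`:
 *   - fit inside the array of n entries,
 *   - start at a virtual page index and PFN aligned to that order,
 *   - are physically contiguous and share the same flags.
 * `vpage` is the virtual page index that corresponds to e[0].
 */
static unsigned toy_max_order(const struct toy_entry *e, size_t n,
                              uint64_t vpage, size_t start,
                              unsigned max_order)
{
    unsigned order = 0;

    assert(start < n);

    while (order < max_order) {
        unsigned next = order + 1;
        size_t count = (size_t)1 << next;
        uint64_t mask = (uint64_t)count - 1;
        size_t i;

        if (start + count > n)
            break;  /* run would extend past the array */
        if (((vpage + start) & mask) || (e[start].pfn & mask))
            break;  /* virtual or physical start is misaligned */

        /* The first half was verified in earlier iterations. */
        for (i = count / 2; i < count; i++) {
            if (e[start + i].pfn != e[start].pfn + i ||
                e[start + i].flags != e[start].flags)
                break;
        }
        if (i < count)
            break;  /* not contiguous, or flags differ */

        order = next;
    }
    return order;
}
```

A caller would walk the array, call toy_max_order() at each position, and advance by 1 << order entries. The cost is a linear scan over every entry, which is the overhead a single output flag would avoid; whether that difference matters in practice is exactly what the requested performance data would show.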