On Wed, 19 Feb 2025 14:32:35 -0600 Mitchell Augustin <mitchell.augustin@xxxxxxxxxxxxx> wrote: > > Thanks for the review and testing! > > Sure thing, thanks for the patch set! > > If you happen to have a few minutes, I'm struggling to understand the > epfn computation and would appreciate some insight. Sorry, this slipped off my todo list for a few days. > My current understanding (very possibly incorrect): > - epfn is intended to be the last page frame number that can be > represented at the mapping level corresponding to addr_mask. (so, if > addr_mask == PUD_MASK, epfn would be the highest pfn still in PUD > level). Actually epfn is the first pfn of the next addr_mask level page. The value in the parens (*pfn | (~addr_mask >> PAGE_SHIFT)) is the last pfn within the same level page. We could do it either way, it's just a matter of where the +1 gets added. > - ret should be == npages if all pfns in the requested vma are within > the memory hierarchy level denoted by addr_mask. If npages is more > than can be represented at that level, ret == the max number of page > frames representable at addr_mask level. Yes. > - - (if the second case is true, that means we were not able to obtain > all requested pages due to running out of PFNs at the current mapping > level) vaddr_get_pfns() is called again if we haven't reached npage. Specifically, from vfio_pin_pages_remote() we hit the added continue under the !batch->size branch. If the pfnmaps are fully PUD aligned, we'll call vaddr_get_pfns() once per PUD_SIZE, vfio_pin_pages_remote() will only return with the full requested npage value, and we'll only call vfio_iommu_map() once. The latter has always been true, the difference is the number of times we iterate calling vaddr_get_pfns(). > If the above is all correct, what is confusing me is where the "(*pfn) > | " comes into this equation. If epfn is meant to be the last pfn > representable at addr_mask level of the hierarchy, wouldn't that be > represented by (~pgmask >> PAGE_SHIFT) alone? (~addr_mask >> PAGE_SHIFT) gives us the last pfn relative to zero. We want the last pfn relative to *pfn, therefore we OR in *pfn. The OR handles any offset that *pfn might have within the addr_mask page, so this operation always provides the last pfn of the addr_mask page relative to *pfn. +1 because we want to calculate the number of pfns until the next page. Thanks, Alex