Hi Gerd,

> On Mon, May 15, 2023 at 10:04:42AM -0700, Mike Kravetz wrote:
> > On 05/12/23 16:29, Mike Kravetz wrote:
> > > On 05/12/23 14:26, James Houghton wrote:
> > > > On Fri, May 12, 2023 at 12:20 AM Junxiao Chang
> > > > <junxiao.chang@xxxxxxxxx> wrote:
> > > >
> > > > This alone doesn't fix mapcounting for PTE-mapped HugeTLB pages.
> > > > You need something like [1]. I can resend it if that's what we
> > > > should be doing, but this mapcounting scheme doesn't work when
> > > > the page structs have been freed.
> > > >
> > > > It seems like it was a mistake to include support for hugetlb
> > > > memfds in udmabuf.
> > >
> > > IIUC, it was added with commit 16c243e99d33 udmabuf: Add support
> > > for mapping hugepages (v4). Looks like it was never sent to
> > > linux-mm? That is unfortunate as hugetlb vmemmap freeing went in
> > > at about the same time. And, as you have noted udmabuf will not
> > > work if hugetlb vmemmap freeing is enabled.
> > >
> > > Sigh!
> > >
> > > Trying to think of a way forward.
> > > --
> > > Mike Kravetz
> > >
> > > >
> > > > [1]: https://lore.kernel.org/linux-mm/20230306230004.1387007-2-jthoughton@xxxxxxxxxx/
> > > >
> > > > - James
> >
> > Adding people and list on Cc: involved with commit 16c243e99d33.
> >
> > There are several issues with trying to map tail pages of hugetlb pages
> > not taken into account with udmabuf. James spent quite a bit of time
> > trying to understand and address all the issues with the HGM code.
> > While using the scheme proposed by James, may be an approach to the
> > mapcount issue there are also other issues that need attention. For
> > example, I do not see how the fault code checks the state of the
> > hugetlb page (such as poison) as none of that state is carried in
> > tail pages.
> >
> > The more I think about it, the more I think udmabuf should treat hugetlb
> > pages as hugetlb pages. They should be mapped at the appropriate level
> > in the page table. Of course, this would impose new restrictions on the
> > API (mmap and ioctl) that may break existing users. I have no idea how
> > extensively udmabuf is being used with hugetlb mappings.
>
> User of this is qemu. It can use the udmabuf driver to create host
> dma-bufs for guest resources (virtio-gpu buffers), to avoid copying
> data when showing the guest display in a host window.
>
> hugetlb support is needed in case qemu guest memory is backed by
> hugetlbfs. That does not imply the virtio-gpu buffers are hugepage
> aligned though, udmabuf would still need to operate on smaller chunks
> of memory. So with additional restrictions this will not work any
> more for qemu. I'd suggest to just revert hugetlb support instead
> and go back to the drawing board.
>
> Also not sure why hugetlbfs is used for guest memory in the first place.
> It used to be a thing years ago, but with the arrival of transparent
> hugepages there is as far I know little reason to still use hugetlbfs.

The main reason we are interested in using hugetlbfs for guest memory is
that we observed a non-trivial performance improvement while running
certain 3D-heavy workloads in the guest. We noticed this after only
switching the guest memory backend to include hugepages (i.e., hugetlb=on),
with no other changes.

To address the current situation, I am readying a patch for the udmabuf
driver that would add back support for mapping hugepages but without
making use of the subpages directly.

Thanks,
Vivek

>
> Vivek? Dongwon?
>
> take care,
>   Gerd