On Fri 07-09-18 16:45:09, Aneesh Kumar K.V wrote:
> On 09/07/2018 02:33 PM, Michal Hocko wrote:
> > On Thu 06-09-18 19:00:43, Aneesh Kumar K.V wrote:
> > > On 09/06/2018 06:23 PM, Michal Hocko wrote:
> > > > On Thu 06-09-18 11:13:42, Aneesh Kumar K.V wrote:
> > > > > Current code doesn't do page migration if the page allocated is a compound page.
> > > > > With HugeTLB migration support, we can end up allocating hugetlb pages from
> > > > > CMA region. Also THP pages can be allocated from CMA region. This patch updates
> > > > > the code to handle compound pages correctly.
> > > > >
> > > > > This uses the new helper get_user_pages_cma_migrate. It does one get_user_pages
> > > > > with right count, instead of doing one get_user_pages per page. That avoids
> > > > > reading page table multiple times.
> > > > >
> > > > > The patch also converts the hpas member of mm_iommu_table_group_mem_t to a union.
> > > > > We use the same storage location to store pointers to struct page. We cannot
> > > > > update all the code paths to use struct page *, because we access hpas in real mode
> > > > > and we can't do that struct page * to pfn conversion in real mode.
> > > >
> > > > I am not familiar with this code so bear with me. I am completely
> > > > missing the purpose of this patch. The changelog doesn't really explain
> > > > that AFAICS. I can only guess that you do not want to establish long
> > > > pins on CMA pages, right? So whenever you are about to pin a page that
> > > > is in CMA you migrate it away to a different !__GFP_MOVABLE page, right?
> > >
> > > That is right.
> > >
> > > > If that is the case then how do you handle pins which are already in
> > > > zone_movable? I do not see any specific check for those.
> > > >
> > > > Btw. why is this a proper thing to do? Problems with longterm pins are
> > > > not only for CMA/ZONE_MOVABLE pages. Pinned pages are not reclaimable as
> > > > well so there is a risk of OOMs if there are too many of them. We have
> > > > discussed approaches that would allow to force pin invalidation/revocation
> > > > at LSF/MM. Isn't that a more appropriate solution to the problem you are
> > > > seeing?
> > > >
> > >
> > > The CMA area is used on powerpc platforms to allocate guest specific page
> > > table (hash page table). If we don't have sufficient free pages we fail to
> > > allocate hash page table, which results in failure to start guest.
> > >
> > > Now with vfio, we end up pinning the entire guest RAM. There is a
> > > possibility that these guest RAM pages got allocated from CMA region. We
> > > already support migrating those pages out except for compound pages.
> > > What this patch does is to start supporting compound page migration that got
> > > allocated out of CMA region (ie, THP pages and hugetlb pages if platform
> > > supported hugetlb migration).
> >
> > This definitely belongs to the changelog.
> >
> > > Now to do that I added a helper get_user_pages_cma_migrate().
> > >
> > > I agree that long term pinned pages do have other issues. The patchset is
> > > not solving that issue.
> >
> > It would be great to note why a generic approach is not viable. I assume
> > the main reason is that those pins are pretty much permanent for the
> > guest lifetime so the situation has to be handled in advance. In other
> > words, more information please.
> >
>
> That is correct. I will add these details to commit message. And will also
> do a cover letter for the patch series.

OK, then the early migration makes some sense. Although I suspect this
will lead to other issues (OOM in kernel zones) but revocation approach
is clearly not usable. An excessive pinning simply sucks.

Thanks a lot for the updated information though!
-- 
Michal Hocko
SUSE Labs