On Thu, Jul 19, 2018 at 07:13:39AM -0700, Dave Hansen wrote: > On 07/19/2018 01:59 AM, Kirill A. Shutemov wrote: > > On Wed, Jul 18, 2018 at 04:11:57PM -0700, Dave Hansen wrote: > >> On 07/17/2018 04:20 AM, Kirill A. Shutemov wrote: > >>> khugepaged allocates page in advance, before we found a VMA for > >>> collapse. We don't yet know which KeyID to use for the allocation. > >> > >> That's not really true. We have the VMA and the address in the caller > >> (khugepaged_scan_pmd()), but we drop the lock and have to revalidate the > >> VMA. > > > > For !NUMA we allocate the page in khugepaged_do_scan(), well before we > > know VMA. > > Ahh, thanks for clarifying. That's some more very good information > about the design and progression of your patch that belongs in the > changelog. Okay. > >>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c > >>> index 5ae34097aed1..d116f4ebb622 100644 > >>> --- a/mm/khugepaged.c > >>> +++ b/mm/khugepaged.c > >>> @@ -1056,6 +1056,16 @@ static void collapse_huge_page(struct mm_struct *mm, > >>> */ > >>> anon_vma_unlock_write(vma->anon_vma); > >>> > >>> + /* > >>> + * At this point new_page is allocated as non-encrypted. > >>> + * If VMA's KeyID is non-zero, we need to prepare it to be encrypted > >>> + * before coping data. > >>> + */ > >>> + if (vma_keyid(vma)) { > >>> + prep_encrypted_page(new_page, HPAGE_PMD_ORDER, > >>> + vma_keyid(vma), false); > >>> + } > >> > >> I guess this isn't horribly problematic now, but if we ever keep pools > >> of preassigned-keyids, this won't work any more. > > > > I don't get this. What pools of preassigned-keyids are you talking about? > > My point was that if we ever teach the allocator or something _near_ the > allocator to keep pools of pre-zeroed and/or pre-cache-cleared pages, > this approach will need to get changed otherwise we will double-prep pages. It shouldn't be a problem here. It's pretty slow path. We often wait memory to be compacted before page for khugepaged gets allocated. Double-prep shouldn't have visible impact. > My overall concern with prep_encrypted_page() in this patch set is that > it's inserted pretty ad-hoc. It seems easy to miss spots where it > should be. I'm also unsure of the failure mode and anything we've done > to ensure that if we get this wrong, we scream clearly and loudly about > what happened. Do we do something like that? I have debugging patch that puts BUG_ONs around set_pte_at() to check if the page's keyid matches VMA's keyid. But that's not very systematic. We would need something better than this. -- Kirill A. Shutemov