On 7/15/21 12:51 PM, Sean Christopherson wrote:
On Thu, Jul 15, 2021, Brijesh Singh wrote:
On 7/14/21 5:25 PM, Sean Christopherson wrote:
@@ -2375,6 +2375,12 @@ int rmpupdate(struct page *page, struct rmpupdate *val)
if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
return -ENXIO;
+ ret = set_memory_4k((unsigned long)page_to_virt(page), 1);
IIUC, this shatters the direct map for page that's assigned to an SNP guest, and
the large pages are never recovered?
I believe a better approach would be to do something similar to memfd_secret[*],
which encountered a similar problem with the direct map. Instead of forcing the
direct map to be forever 4k, unmap the direct map when making a page guest private,
and restore the direct map when it's made shared (or freed).
I thought memfd_secret had also solved the problem of restoring large pages in
the direct map, but at a glance I can't tell if that's actually implemented
anywhere. But, even if it's not currently implemented, I think it makes sense
to mimic the memfd_secret approach so that both features can benefit if large
page preservation/restoration is ever added.
thanks for the memfd_secrets pointer. At the lowest level it shares the
same logic to split the physmap. We both end up calling to
change_page_attrs_set_clr() which split the page and updates the page
table attributes.
Given this, I believe in future if the change_page_attrs_set_clr() is
enhanced to track the splitting of the pages and restore it later then it
should work transparently.
But something actually needs to initiate the restore. If the RMPUDATE path just
force 4k pages then there will never be a restore. And zapping the direct map
for private pages is a good thing, e.g. prevents the kernel from reading garbage,
which IIUC isn't enforced by the RMP?
Yes, something need to initiated the restore. Since the restore support
is not present today so its difficult to say how it will be look. I am
thinking that restore thread may use some kind of notifier to check with
the caller whether its safe to restore the page ranges. In case of the
SEV-SNP, the SNP registered notifier will reject if the guest is running.
The memfd_secrets uses the set_direct_map_{invalid,default}_noflush()
and it is designed to remove/add the present bit in the direct map. We
can't use them, because in our case the page may get accessed by the KVM
(e.g kvm_guest_write, kvm_guest_map etc).
thanks