Hello Boris,
On 10/13/2022 10:15 AM, Borislav Petkov wrote:
On Mon, Jun 20, 2022 at 11:05:01PM +0000, Ashish Kalra wrote:
+static void snp_leak_pages(unsigned long pfn, unsigned int npages)
That function name looks wrong.
+{
+ WARN(1, "psc failed, pfn 0x%lx pages %d (leaking)\n", pfn, npages);
+ while (npages--) {
+ memory_failure(pfn, 0);
^^^^^^^^^^^^^^^^^^^^^^
Why?
The page was in FW state and we couldn't transition it back to HV/Shared
state, any access to this page can cause RMP #PF.
* This function is called by the low level machine check code
* of an architecture when it detects hardware memory corruption
* of a page. It tries its best to recover, which includes
* dropping pages, killing processes etc.
I don't think you wanna do that.
It looks like you want to prevent the page from being used again but not
mark it as PG_hwpoison and whatnot. PG_reserved perhaps?
* PG_reserved is set for special pages. The "struct page" of such a
* page should in general not be touched (e.g. set dirty) except by its
* owner.
If it is "still" accessed/touched then it can cause RMP #PF.
On the other hand,
* PG_hwpoison... Accessing is
* not safe since it may cause another machine check. Don't touch!
That sounds exactly the state we want these page(s) to be in ?
Another possibility is PG_error.
+ dump_rmpentry(pfn);
+ pfn++;
+ }
+}
+
+static int snp_reclaim_pages(unsigned long pfn, unsigned int npages, bool locked)
+{
+ struct sev_data_snp_page_reclaim data;
+ int ret, err, i, n = 0;
+
+ for (i = 0; i < npages; i++) {
+ memset(&data, 0, sizeof(data));
+ data.paddr = pfn << PAGE_SHIFT;
Oh wow, this is just silly. A struct for a single u64. Just use a
u64 paddr;
Ok.
directly. But we had this topic already...
+
+ if (locked)
Ew, that's never a good design - conditional locking.
There is a previous suggestion to change `sev_cmd_mutex` to some sort of
nesting lock type to clean up this if (locked) code, though AFAIK, there
is no support for nesting lock types.
+ ret = __sev_do_cmd_locked(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
+ else
+ ret = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
<---- newline here.
+ if (ret)
+ goto cleanup;
+
+ ret = rmp_make_shared(pfn, PG_LEVEL_4K);
+ if (ret)
+ goto cleanup;
+
+ pfn++;
+ n++;
+ }
+
+ return 0;
+
+cleanup:
+ /*
+ * If failed to reclaim the page then page is no longer safe to
+ * be released, leak it.
+ */
+ snp_leak_pages(pfn, npages - n);
So this looks real weird: we go and reclaim pages, we hit an error
during reclaiming a page X somewhere in-between and then we go and mark
the *remaining* pages as not to be used?!
Why?
Why not only that *one* page which failed and then we continue with the
rest?!
I agree and will change to this approach.
+ return ret;
+}
+
+static inline int rmp_make_firmware(unsigned long pfn, int level)
+{
+ return rmp_make_private(pfn, 0, level, 0, true);
+}
That's a silly wrapper used only once. Just do at the callsite:
/* Mark this page as belonging to firmware */
rc = rmp_make_private(pfn, 0, level, 0, true);
Ok.
+
+static int snp_set_rmp_state(unsigned long paddr, unsigned int npages, bool to_fw, bool locked,
+ bool need_reclaim)
Tangential to the above, this is just nuts with those bool arguments.
Just look at the callsites: do you understand what they do?
snp_set_rmp_state(paddr, npages, true, locked, false);
what does that do? You need to go up to the definition of the function,
count the arguments and see what that "true" arg stands for.
I totally agree, this is simply unreadable.
And this has been mentioned previously too ...
This function can do a lot and when I read the call sites its hard to
see what its doing since we have a combination of arguments which tell
us what behavior is happening ...
What you should do instead is, have separate helpers which do only one
thing:
rmp_mark_pages_firmware();
rmp_mark_pages_shared();
rmp_mark_pages_...
and then have the *callers* issue snp_reclaim_pages() when needed. So you'd have
rmp_mark_pages_firmware();
rmp_mark_pages_shared()
and __snp_free_firmware_pages() would do
rmp_mark_pages_shared();
snp_reclaim_pages();
Actually, this only needs to call snp_reclaim_pages().
and so on.
And then if you need locking, the callers can decide which sev_do_cmd
variant to issue.
And then if you have common code fragments which you can unify into a
bigger helper function, *then* you can do that.
Instead of multiplexing it this way. Which makes it really hard to
follow what the code does.
Sure i will do this cleanup.
+ unsigned long pfn = __sme_clr(paddr) >> PAGE_SHIFT; /* Cbit maybe set in the paddr */
No side comments pls.
+ int rc, n = 0, i;
+
+ for (i = 0; i < npages; i++) {
+ if (to_fw)
+ rc = rmp_make_firmware(pfn, PG_LEVEL_4K);
+ else
+ rc = need_reclaim ? snp_reclaim_pages(pfn, 1, locked) :
+ rmp_make_shared(pfn, PG_LEVEL_4K);
+ if (rc)
+ goto cleanup;
+
+ pfn++;
+ n++;
+ }
+
+ return 0;
+
+cleanup:
+ /* Try unrolling the firmware state changes */
+ if (to_fw) {
+ /*
+ * Reclaim the pages which were already changed to the
+ * firmware state.
+ */
+ snp_reclaim_pages(paddr >> PAGE_SHIFT, n, locked);
+
+ return rc;
+ }
+
+ /*
+ * If failed to change the page state to shared, then its not safe
+ * to release the page back to the system, leak it.
+ */
+ snp_leak_pages(pfn, npages - n);
+
+ return rc;
+}
...
+void snp_free_firmware_page(void *addr)
+{
+ if (!addr)
+ return;
+
+ __snp_free_firmware_pages(virt_to_page(addr), 0, false);
+}
+EXPORT_SYMBOL(snp_free_firmware_page);
EXPORT_SYMBOL_GPL() ofc.
Thanks,
Ashish