[Public]

Hello Peter,

>> +static int snp_reclaim_pages(unsigned long pfn, unsigned int npages, bool locked)
>> +{
>> +	struct sev_data_snp_page_reclaim data;
>> +	int ret, err, i, n = 0;
>> +
>> +	for (i = 0; i < npages; i++) {

>What about setting |n| here too, also the other increments.
>for (i = 0, n = 0; i < npages; i++, n++, pfn++)

Yes, that is simpler.

>> +		memset(&data, 0, sizeof(data));
>> +		data.paddr = pfn << PAGE_SHIFT;
>> +
>> +		if (locked)
>> +			ret = __sev_do_cmd_locked(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
>> +		else
>> +			ret = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);

> Can we change `sev_cmd_mutex` to some sort of nesting lock type? That
> could clean up this if (locked) code.

> +static inline int rmp_make_firmware(unsigned long pfn, int level)
> +{
> +	return rmp_make_private(pfn, 0, level, 0, true);
> +}
> +
> +static int snp_set_rmp_state(unsigned long paddr, unsigned int npages, bool to_fw, bool locked,
> +			     bool need_reclaim)

>This function can do a lot and when I read the call sites its hard to
>see what its doing since we have a combination of arguments which tell
>us what behavior is happening, some of which are not valid (ex: to_fw
>== true and need_reclaim == true is an invalid argument combination).

to_fw is used to make a firmware page and need_reclaim is for freeing
the firmware page, so they are mutually exclusive.

I find this maps quite logically onto the callers:
snp_alloc_firmware_pages() calls with to_fw = true and need_reclaim =
false, and snp_free_firmware_pages() does the opposite, to_fw = false
and need_reclaim = true.

That seems straightforward to read.

>Also this for loop over |npages| is duplicated from
>snp_reclaim_pages(). One improvement here is that on the current
>snp_reclaim_pages() if we fail to reclaim a page we assume we cannot
>reclaim the next pages, this may cause us to snp_leak_pages() more
>pages than we actually need too.

Yes, that is true.

>What about something like this?
>
>static snp_leak_page(u64 pfn, enum pg_level level) {
>	memory_failure(pfn, 0);
>	dump_rmpentry(pfn);
>}
>
>static int snp_reclaim_page(u64 pfn, enum pg_level level) {
>	int ret;
>	struct sev_data_snp_page_reclaim data;
>
>	ret = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
>	if (ret)
>		goto cleanup;
>
>	ret = rmp_make_shared(pfn, level);
>	if (ret)
>		goto cleanup;
>
>	return 0;
>
>cleanup:
>	snp_leak_page(pfn, level)
>}
>
>typedef int (*rmp_state_change_func) (u64 pfn, enum pg_level level);
>
>static int snp_set_rmp_state(unsigned long paddr, unsigned int npages, rmp_state_change_func state_change, rmp_state_change_func cleanup) {
>	struct sev_data_snp_page_reclaim data;
>	int ret, err, i, n = 0;
>
>	for (i = 0, n = 0; i < npages; i++, n++, pfn++) {
>		ret = state_change(pfn, PG_LEVEL_4K)
>		if (ret)
>			goto cleanup;
>	}
>
>	return 0;
>
> cleanup:
>	for (; i>= 0; i--, n--, pfn--) {
>		cleanup(pfn, PG_LEVEL_4K);
>	}
>
>	return ret;
>}
>
>Then inside of __snp_alloc_firmware_pages():
>
>snp_set_rmp_state(paddr, npages, rmp_make_firmware, snp_reclaim_page);
>
>And inside of __snp_free_firmware_pages():
>
>snp_set_rmp_state(paddr, npages, snp_reclaim_page, snp_leak_page);
>
>Just a suggestion feel free to ignore. The readability comment could
>be addressed much less invasively by just making separate functions
>for each valid combination of arguments here. Like
>snp_set_rmp_fw_state(), snp_set_rmp_shared_state(),
>snp_set_rmp_release_state() or something.

>> +static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order, bool locked)
>> +{
>> +	unsigned long npages = 1ul << order, paddr;
>> +	struct sev_device *sev;
>> +	struct page *page;
>> +
>> +	if (!psp_master || !psp_master->sev_data)
>> +		return NULL;
>> +
>> +	page = alloc_pages(gfp_mask, order);
>> +	if (!page)
>> +		return NULL;
>> +
>> +	/* If SEV-SNP is initialized then add the page in RMP table.
>> +	 */
>> +	sev = psp_master->sev_data;
>> +	if (!sev->snp_inited)
>> +		return page;
>> +
>> +	paddr = __pa((unsigned long)page_address(page));
>> +	if (snp_set_rmp_state(paddr, npages, true, locked, false))
>> +		return NULL;

>So what about the case where snp_set_rmp_state() fails but we were
>able to reclaim all the pages? Should we be able to signal that to
>callers so that we could free |page| here? But given this is an error
>path already maybe we can optimize this in a follow up series.

Yes, we should tie this in to the success or failure of
snp_reclaim_pages() here, for the case where we were able to unroll
some or all of the firmware state change.

> +
> +	return page;
> +}
> +
> +void *snp_alloc_firmware_page(gfp_t gfp_mask)
> +{
> +	struct page *page;
> +
> +	page = __snp_alloc_firmware_pages(gfp_mask, 0, false);
> +
> +	return page ? page_address(page) : NULL;
> +}
> +EXPORT_SYMBOL_GPL(snp_alloc_firmware_page);
> +
> +static void __snp_free_firmware_pages(struct page *page, int order, bool locked)
> +{
> +	unsigned long paddr, npages = 1ul << order;
> +
> +	if (!page)
> +		return;
> +
> +	paddr = __pa((unsigned long)page_address(page));
> +	if (snp_set_rmp_state(paddr, npages, false, locked, true))
> +		return;

> Here we may be able to free some of |page| depending how where inside
> of snp_set_rmp_state() we failed. But again given this is an error
> path already maybe we can optimize this in a follow up series.

Yes, we probably should be able to free some of the page(s), depending
on how many were reclaimed in snp_set_rmp_state().

But reclamation failures should be uncommon, so any failure indicates
a bigger issue. If reclaiming a single page fails, reclaiming the
subsequent pages will likely fail too, so it seems better to follow a
simple recovery procedure than to handle the more complex case where
one chunk of pages was reclaimed and another was not.

Thanks,
Ashish
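
P.S. For reference, the callback-based unwind discussed above can be
sketched in plain user-space C. This is only an illustration under
stated assumptions, not the driver code: the page array, fail_pfn,
undo_calls, make_firmware(), reclaim_page() and set_state() are all
hypothetical stand-ins, and no RMP or firmware call is made. The point
it shows is that when page i fails, only pages 0..i-1 are undone, in
reverse order, and the original error is returned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the page states discussed in this thread. */
enum page_state { PAGE_SHARED, PAGE_FIRMWARE };

#define NR_PAGES 8
static enum page_state pages[NR_PAGES];
static long fail_pfn = -1;		/* pfn whose transition fails; -1 = none */
static unsigned int undo_calls;		/* number of pages rolled back */

typedef int (*state_change_fn)(unsigned long pfn);

/* Simulated "make firmware page": fails for fail_pfn only. */
static int make_firmware(unsigned long pfn)
{
	if ((long)pfn == fail_pfn)
		return -1;
	pages[pfn] = PAGE_FIRMWARE;
	return 0;
}

/* Simulated "reclaim": returns the page to the shared state. */
static int reclaim_page(unsigned long pfn)
{
	pages[pfn] = PAGE_SHARED;
	undo_calls++;
	return 0;
}

/*
 * Apply change() to npages pages starting at pfn. On failure, call
 * undo() only on the pages that were already changed, newest first,
 * and return the error from the failing change().
 */
static int set_state(unsigned long pfn, unsigned int npages,
		     state_change_fn change, state_change_fn undo)
{
	unsigned int i;
	int ret;

	for (i = 0; i < npages; i++) {
		ret = change(pfn + i);
		if (ret)
			goto unwind;
	}
	return 0;

unwind:
	/* Page i itself was never changed, so start at i - 1. */
	while (i-- > 0)
		undo(pfn + i);
	return ret;
}
```

With fail_pfn set to 3, set_state(0, 5, make_firmware, reclaim_page)
returns the error and rolls back exactly pages 0 to 2, so the caller
knows that none of the range is left in the firmware state.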