On Thu, Oct 21, 2021 at 05:26:26PM +0200, Paolo Bonzini wrote:
> On 06/10/21 01:44, Michael Roth wrote:
> > SEV guests rely on an encyption bit which resides within the range that
> > current code treats as address bits. Guest code will expect these bits
> > to be set appropriately in their page tables, whereas helpers like
> > addr_gpa2hva() will expect these bits to be masked away prior to
> > translation. Add proper handling for these cases.
>
> This is not what you're doing below in addr_gpa2hva, though---or did I
> misunderstand?

The confusion is warranted: addr_gpa2hva() *doesn't* expect the C-bit to
be masked in advance, so the wording is misleading. I think I was
referring to the fact that internally it doesn't need/want the C-bit, so
it just masks it away as a convenience to callers, as opposed to the
other functions modified in the patch, which actually make use of it.

It's convenient because page table walkers/mappers rely on
addr_gpa2hva() to silently mask away C-bits when translating PTEs to
host addresses. We could easily convert those callers from:

  addr_gpa2hva(paddr)

to this:

  addr_gpa2hva(addr_raw2gpa(paddr))

but then all new code needs to consider whether it might be dealing with
C-bits before deciding to pass an address to addr_gpa2hva() (or not
really think about it, and add addr_raw2gpa() "just in case").

So since it's always harmless to mask it away silently in
addr_gpa2hva(), the logic/code seems to benefit a good deal if we
indicate clearly that addr_gpa2hva() can accept a 'raw' GPA and will
ignore the C-bit completely. But it's not a big deal either way if you
prefer to keep that explicit. The commit message still needs to be
clarified in any case.

>
> I may be wrong due to not actually having written the code, but I'd prefer
> if most of these APIs worked only if the C bit has already been stripped.
> In general it's quite unlikely for host code to deal with C=1 pages, so it's
> worth pointing out explicitly the cases where it does.

I've tried to indicate functions that expect the C-bit by adding the
'raw_' prefix to the gpa/paddr parameters, but as you pointed out with
addr_gpa2hva() it's already a bit inconsistent in that regard, and
there are a couple of cases like virt_map() where I should have used the
'raw_' prefix as well but missed it. So that should be addressed, and
some additional comments/assertions might be warranted to guard against
cases where the C-bit is passed in unexpectedly.

But I should probably re-assess why the C-bit is being passed around in
the first place:

 - vm_phy_page[s]_alloc() is the main 'source' for 'raw' GPAs with the
   C-bit set. It determines this based on the vm_memcrypt encryption
   policy, and updates the encryption bitmap as well.
 - vm_phy_page[s]_alloc() is callable both in the kvm_util lib and in
   individual tests.
 - in theory, encoding the C-bit in the returned vm_paddr_t means that
   vm_phy_page[s]_alloc() callers can pass that directly into
   virt_map()/virt_pg_map() and this will "just work" for both
   encrypted and non-encrypted guests.
 - by masking it away in addr_gpa2hva(), existing tests/code flow mostly
   "just works" as well.

But taking a closer look, in cases where vm_phy_page[s]_alloc() is
called directly by tests, like set_memory_region_test,
emulator_error_test, and smm_test, that raw GPA is compared to hardcoded
non-raw GPAs, so they'd still end up needing fixups to work with the
proposed transparent-SEV-mode stuff. And future code would need to be
written to account for this, so it doesn't really "just work" after all.

So it's worth considering the alternative approach of *not* encoding the
C-bit into GPAs returned by vm_phy_page[s]_alloc(). That would likely
involve introducing something like addr_gpa2raw(), which adds in the
C-bit according to the encryption bitmap as needed.

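A minimal sketch of how such a pair of helpers might look, for
illustration only. The names addr_gpa2raw() and addr_raw2gpa() come from
the discussion above; everything else (the sketch_memcrypt struct, the
C-bit position of 51, the 64-page bitmap, the 4K page size) is a
hypothetical stand-in and not taken from the patch or from kvm_util:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for illustration only, not values from the patch. */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_C_BIT		51
#define SKETCH_C_MASK		(1ULL << SKETCH_C_BIT)
#define SKETCH_MAX_PAGES	64

typedef uint64_t vm_paddr_t;

/* Hypothetical stand-in for the per-VM encryption state. */
struct sketch_memcrypt {
	bool enabled;
	uint64_t enc_bitmap;	/* bit N set => guest page N is encrypted */
};

/* raw GPA -> plain GPA: strip the C-bit if encryption is enabled. */
static inline vm_paddr_t addr_raw2gpa(const struct sketch_memcrypt *mc,
				      vm_paddr_t gpa_raw)
{
	return mc->enabled ? (gpa_raw & ~SKETCH_C_MASK) : gpa_raw;
}

/* plain GPA -> raw GPA: add the C-bit if the bitmap marks the page. */
static inline vm_paddr_t addr_gpa2raw(const struct sketch_memcrypt *mc,
				      vm_paddr_t gpa)
{
	uint64_t pg = gpa >> SKETCH_PAGE_SHIFT;

	if (!mc->enabled)
		return gpa;

	/* Guard against a C-bit being passed in unexpectedly. */
	assert(!(gpa & SKETCH_C_MASK));
	assert(pg < SKETCH_MAX_PAGES);

	if ((mc->enc_bitmap >> pg) & 1)
		return gpa | SKETCH_C_MASK;

	return gpa;
}
```

With helpers shaped like this, a caller that allocated a page through
the allocator would wrap the GPA in addr_gpa2raw() only at the point
where it is handed to a page table mapper, and would compare plain GPAs
everywhere else.
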
If we do that:

 - virt_map()/virt_pg_map() still need to accept 'raw' GPAs, since they
   need to deal with cases where pages are being mapped that weren't
   allocated by vm_phy_page[s]_alloc(), and so aren't recorded in the
   bitmap. In those cases it is up to test code to provide the C-bit
   when needed (e.g. things like separate linear mappings for pa()-like
   stuff in guest code).
 - for cases where vm_phy_page[s]_alloc() determines whether the page is
   encrypted, addr_gpa2raw() needs to be used to add back the C-bit
   prior to passing it to virt_map()/virt_pg_map(), both in the library
   and the test code. vm_vaddr_* allocations would handle all this under
   the covers as they do now.

So test code would need to consider cases where addr_gpa2raw() needs to
be used to set the C-bit (which is basically only when they want to mix
usage of vm_phy_page[s]_alloc() with their own mapping of the guest page
tables, which doesn't seem to be done in any existing tests anyway).

The library code would need these addr_gpa2raw() hooks in places where
it calls virt_*map() internally. Probably just a handful of places
though.

Assuming there are no issues with this alternative approach that I may
be missing, I'll look at doing it this way for the next spin.

Even with this alternative approach, having addr_gpa2hva() silently mask
away the C-bit still seems useful for the reasons above, but again, no
strong feelings one way or the other on that.

>
> Paolo
>
> > @@ -1460,9 +1480,10 @@ void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
> >   * address providing the memory to the vm physical address is returned.
> >   * A TEST_ASSERT failure occurs if no region containing gpa exists.
> >   */
> > -void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa)
> > +void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa_raw)
> >  {
> >  	struct userspace_mem_region *region;
>
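
For reference, the "silently mask away the C-bit" behaviour discussed
for addr_gpa2hva() can be sketched as below. This is a simplified,
hypothetical model and not the selftest code: sketch_region and
sketch_gpa2hva() are invented names, the C-bit position of 51 is an
assumption, and the sketch returns NULL where the real helper raises a
TEST_ASSERT failure when no region contains the GPA:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration only: hypothetical C-bit position. */
#define SKETCH_C_BIT	51
#define SKETCH_C_MASK	(1ULL << SKETCH_C_BIT)

typedef uint64_t vm_paddr_t;

/* Hypothetical, simplified stand-in for a userspace memory region. */
struct sketch_region {
	vm_paddr_t guest_phys_addr;	/* first GPA covered by the region */
	uint64_t size;			/* region size in bytes */
	uint8_t *host_mem;		/* host backing for the region */
};

/*
 * Accept a 'raw' GPA: drop the C-bit first, then translate. A plain GPA
 * passes through the mask unchanged, so both forms give the same HVA.
 */
static inline void *sketch_gpa2hva(const struct sketch_region *r,
				   vm_paddr_t gpa_raw)
{
	vm_paddr_t gpa = gpa_raw & ~SKETCH_C_MASK;

	if (gpa < r->guest_phys_addr ||
	    gpa - r->guest_phys_addr >= r->size)
		return NULL;

	return r->host_mem + (gpa - r->guest_phys_addr);
}
```

The point of the sketch is only that masking inside the translation
helper makes raw and plain GPAs interchangeable for host-side lookups.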