On Fri, Apr 12, 2024 at 09:42:08AM +0100, Steven Price wrote:
>  static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
> @@ -41,6 +45,7 @@ static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
>  	pte = clear_pte_bit(pte, cdata->clear_mask);
>  	pte = set_pte_bit(pte, cdata->set_mask);
>  
> +	/* TODO: Break before make for PROT_NS_SHARED updates */
>  	__set_pte(ptep, pte);
>  	return 0;

Oh, this TODO is problematic, not sure we can do it safely. There are
some patches on the list to trap faults from other CPUs if they happen
to access the page when broken but so far we pushed back as complex and
at risk of getting the logic wrong.

From an architecture perspective, you are changing the output address
and D8.16.1 requires a break-before-make sequence (FEAT_BBM doesn't
help). So we either come up with a way to do BBM safely (stop_machine()
maybe if it's not too expensive or some way to guarantee no accesses to
this page while being changed) or we get the architecture clarified on
the possible side-effects here ("unpredictable" doesn't help).

>  }
> @@ -192,6 +197,43 @@ int set_direct_map_default_noflush(struct page *page)
>  				   PAGE_SIZE, change_page_range, &data);
>  }
>  
> +static int __set_memory_encrypted(unsigned long addr,
> +				  int numpages,
> +				  bool encrypt)
> +{
> +	unsigned long set_prot = 0, clear_prot = 0;
> +	phys_addr_t start, end;
> +
> +	if (!is_realm_world())
> +		return 0;
> +
> +	WARN_ON(!__is_lm_address(addr));

Just return from this function if it's not a linear map address. No
point in corrupting other areas since __virt_to_phys() will get it
wrong.

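To illustrate the early return, here is a rough userspace mock rather
than the actual patch. In the kernel this would presumably reduce to
something like "if (WARN_ON(!__is_lm_address(addr))) return -EINVAL;"
(the error code is an assumption). LM_START, LM_END, MOCK_PAGE_SIZE,
is_lm_address_mock(), virt_to_phys_mock() and
set_memory_encrypted_sketch() are all hypothetical stand-ins:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical linear-map window; real bounds come from the kernel layout. */
#define LM_START	0xffff000000000000ULL
#define LM_END		0xffff800000000000ULL
#define MOCK_PAGE_SIZE	4096ULL

/* Stand-in for __is_lm_address(): true only inside the mock window. */
static bool is_lm_address_mock(uint64_t addr)
{
	return addr >= LM_START && addr < LM_END;
}

/* Stand-in for __virt_to_phys(): only meaningful for linear-map addresses. */
static uint64_t virt_to_phys_mock(uint64_t addr)
{
	return addr - LM_START;
}

/*
 * Compute the physical range to convert, but bail out before touching
 * anything if addr is outside the (mock) linear map.
 */
int set_memory_encrypted_sketch(uint64_t addr, int numpages,
				uint64_t *start, uint64_t *end)
{
	assert(start && end);

	if (!is_lm_address_mock(addr)) {
		fprintf(stderr, "WARN: %#llx is not a linear map address\n",
			(unsigned long long)addr);
		return -EINVAL;	/* assumed error code */
	}

	*start = virt_to_phys_mock(addr);
	*end = *start + (uint64_t)numpages * MOCK_PAGE_SIZE;
	return 0;
}
```

With this shape, a non-linear-map address never reaches the
address translation, so no bogus physical range is handed on.
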
> +	start = __virt_to_phys(addr);
> +	end = start + numpages * PAGE_SIZE;
> +
> +	if (encrypt) {
> +		clear_prot = PROT_NS_SHARED;
> +		set_memory_range_protected(start, end);
> +	} else {
> +		set_prot = PROT_NS_SHARED;
> +		set_memory_range_shared(start, end);
> +	}
> +
> +	return __change_memory_common(addr, PAGE_SIZE * numpages,
> +				      __pgprot(set_prot),
> +				      __pgprot(clear_prot));
> +}

Can someone summarise what the point of this protection bit is? The IPA
memory is marked as protected/unprotected already via the RSI call and
presumably the RMM disables/permits sharing with a non-secure hypervisor
accordingly irrespective of which alias the realm guest has the linear
mapping mapped to. What does it do with the top bit of the IPA? Is it
that the RMM will prevent (via Stage 2) access if the IPA does not match
the requested protection? IOW, it unmaps one or the other at Stage 2?

Also, the linear map is not the only one that points to this IPA. What
if this is a buffer mapped in user-space or remapped as non-cacheable
(upgraded to cacheable via FWB) in the kernel? The code above does not
(and cannot) change the user mappings.

It needs some digging into dma_direct_alloc() as well, it uses a
pgprot_decrypted() but that's not implemented by your patches. Not sure
it helps, it looks like the remap path in this function does not have a
dma_set_decrypted() call (or maybe I missed it).

-- 
Catalin