On Thu, 2023-07-06 at 09:41 -0700, Michael Kelley wrote:
> To avoid these complexities of the CoCo exception handlers, change
> the core transition code in __set_memory_enc_pgtable() to do the
> following:
>
> 1. Remove aliasing mappings
> 2. Remove the PRESENT bit from the PTEs of all transitioning pages

This is a bit of an existing problem, but the failure cases of these
set_memory_en/decrypted() operations do not look to be in great shape.
It could fail halfway through if it needs to split the direct map under
memory pressure, in which case some of the callers will see the error
and free the unmapped pages to the direct map. (I was looking at
dma_direct_alloc().) Others just leak the pages.

But the situation before the patch is not much better, since the direct
map change or enc_status_change_prepare/finish() could fail and leave
the pages in an inconsistent state, like this patch is trying to
address.

This lack of rollback on failure for CPA calls needs particular, odd
handling in all the set_memory() callers. The way to handle it is to
make a CPA call to restore the previous permission, regardless of the
error code returned by the initial call that failed. The callers depend
on any PTE change that was successfully made having any needed splits
already done for those PTEs, so the restore can succeed at least as far
as the failed CPA call got.

In this CoCo case, apparently enc_status_change_prepare/finish() could
fail too (and maybe not have the same forward progress behavior?). So
I'm not sure what you can do in that case. I'm also not sure how bad it
is to free encryption-mismatched pages. Is it the same as freeing
unmapped pages? (likely oops or panic)

> 3. Flush the TLB globally
> 4. Flush the data cache if needed
> 5. Set/clear the encryption attribute as appropriate
> 6. Notify the hypervisor of the page status change
> 7. Add back the PRESENT bit
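
To make the restore-on-failure pattern above concrete, here is a rough
user-space model of it. This is only a sketch, not kernel code: every
name in it (sim_map, sim_set_state(), sim_alloc_shared(), the
split_budget counter) is made up for illustration, and the "no split
needed if the page is already in the target state" rule is a
simplifying assumption of the model. It only simulates split failures.
The BUF_LEAKED branch stands in for the case I'm unsure about, where
the restore itself fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum enc_state { ENC_PRIVATE, ENC_SHARED };

struct sim_page {
	enum enc_state state;
	bool split;		/* mapping already split to 4K granularity */
};

struct sim_map {
	struct sim_page *pages;
	size_t npages;
	size_t split_budget;	/* splits that can still succeed */
};

/*
 * Model of a CPA-style call with no rollback: walk the pages in order and
 * stop at the first failure, leaving earlier pages already changed.
 * Returns 0 on success, -1 on failure.
 */
static int sim_set_state(struct sim_map *m, size_t start, size_t n,
			 enum enc_state target)
{
	for (size_t i = start; i < start + n; i++) {
		struct sim_page *p = &m->pages[i];

		if (p->state == target)
			continue;	/* nothing to change, no split needed */
		if (!p->split) {
			if (m->split_budget == 0)
				return -1;	/* split failed part-way */
			m->split_budget--;
			p->split = true;
		}
		p->state = target;
	}
	return 0;
}

enum buf_result { BUF_OK, BUF_FREED, BUF_LEAKED };

/*
 * Caller pattern: on failure, always try to restore the previous state,
 * and free the pages only if that restore succeeded.
 */
static enum buf_result sim_alloc_shared(struct sim_map *m, size_t start,
					size_t n)
{
	assert(start + n <= m->npages);

	if (sim_set_state(m, start, n, ENC_SHARED) == 0)
		return BUF_OK;

	/* Restore regardless of which error the first call returned. */
	if (sim_set_state(m, start, n, ENC_PRIVATE) != 0)
		return BUF_LEAKED;	/* state unknown: leak, don't free */

	return BUF_FREED;		/* consistent again, safe to free */
}
```

In this model the restore cannot fail, because every page the first
call touched has already been split. That is the forward progress
property the callers rely on, and it is exactly what may not hold once
enc_status_change_prepare/finish() can fail as well.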