On Wed, 2023-08-30 at 16:40 -0700, Rick Edgecombe wrote: > This is a bit of an existing problem, but the failure cases of these > set_memory_en/decrypted() operations does not look to be in great > shape. It could fail halfway through if it needs to split the direct > map under memory pressure, in which case some of the callers will see > the error and free the unmapped pages to the direct map. (I was > looking > at dma_direct_alloc()) Other's just leak the pages. > > But the situation before the patch is not much better, since the > direct > map change or enc_status_change_prepare/finish() could fail and leave > the pages in an inconsistent state, like this patch is trying to > address. > > This lack of rollback on failure for CPA calls needs particular odd > handling in all the set_memory() callers. The way is to make a CPA > call > to restore it to the previous permission, regardless of the error > code > returned in the initial call that failed. The callers depend on any > PTE > change successfully made having any needed splits already done for > those PTEs, so the restore can succeed at least as far as the failed > CPA call got. Wait, since this does set_memory_np() as the first step for both set_memory_encrypted() and set_memory_decrypted(), that pattern in the callers wouldn't work. I wonder if it should try to rollback itself if set_memory_np() fails (call set_memory_p() before returning the error). At least that will handle failures that happen on the guest side. > > In this COCO case apparently the enc_status_change_prepare/finish() > could fail too (and maybe not have the same forward progress > behavior?). So I'm not sure what you can do in that case. > > I'm also not sure how bad it is to free encryption mismatched pages. > Is > it the same as freeing unmapped pages? (likely oops or panic)