From: Edgecombe, Rick P <rick.p.edgecombe@xxxxxxxxx> Sent: Thursday, August 31, 2023 10:30 AM > > On Wed, 2023-08-30 at 16:40 -0700, Rick Edgecombe wrote: > > This is a bit of an existing problem, but the failure cases of these > > set_memory_en/decrypted() operations does not look to be in great > > shape. It could fail halfway through if it needs to split the direct > > map under memory pressure, in which case some of the callers will see > > the error and free the unmapped pages to the direct map. (I was > > looking at dma_direct_alloc()) Other's just leak the pages. See further comments below. > > > > But the situation before the patch is not much better, since the direct > > map change or enc_status_change_prepare/finish() could fail and leave > > the pages in an inconsistent state, like this patch is trying to address. > > > > This lack of rollback on failure for CPA calls needs particular odd > > handling in all the set_memory() callers. The way is to make a CPA call > > to restore it to the previous permission, regardless of the error code > > returned in the initial call that failed. The callers depend on any PTE > > change successfully made having any needed splits already done for > > those PTEs, so the restore can succeed at least as far as the failed > > CPA call got. > > Wait, since this does set_memory_np() as the first step for both > set_memory_encrypted() and set_memory_decrypted(), that pattern in the > callers wouldn't work. I wonder if it should try to rollback itself if > set_memory_np() fails (call set_memory_p() before returning the error). > At least that will handle failures that happen on the guest side. Yes, I agree the error handling is very limited. I'll try to make my patch cleanup properly if set_memory_np() fails as step 1. In general, complete error cleanup on private <-> shared transitions looks to be pretty hard, and the original implementation obviously didn't deal with it. For most of the steps in the sequence, a failure indicates something is pretty seriously broken with the CoCo aspects of the VM, and it's not clear that trying to clean up is likely to succeed or will make things any better. > > > > > In this COCO case apparently the enc_status_change_prepare/finish() > > could fail too (and maybe not have the same forward progress > > behavior?). So I'm not sure what you can do in that case. Exactly. > > > > I'm also not sure how bad it is to free encryption mismatched pages. Is > > it the same as freeing unmapped pages? (likely oops or panic) It's bad to free mismatched pages. You're just setting a time bomb for some other code to encounter later. It will either cause an oops/panic, or will allow the VM to unknowingly put data in a page that is visible to the hypervisor and violate the tenets of being a CoCo VM. In the code I've worked with, we don't free any memory where set_memory_encrypted() or set_memory_decrypted() has failed. We assume there hasn't been a cleanup and hence the state and consistency of the memory is unknown. I think there's a decent argument for not investing too much effort in the cleanup paths for private <-> shared transitions. A failure indicates that something is seriously wrong, from which full recovery is unlikely. Callers of set_memory_encrypted()/decrypted() should assume that a failure leaves the memory in an inconsistent state, and the memory should just be leaked until the VM can be shutdown. Some strong comments along these lines could be added to set_memory_encrypted()/decrypted(). I'm going to be out on vacation until mid-September, so it will be a couple of weeks before I get back to doing a full patch set that responds to these and other comments. Michael