On 16/10/21 00:29, Sean Christopherson wrote:
On Tue, Oct 12, 2021, Paolo Bonzini wrote:
diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
index 59cdf3f742ac..81a0a0f22007 100644
--- a/arch/x86/kernel/cpu/sgx/virt.c
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -150,6 +150,46 @@ static int sgx_vepc_free_page(struct sgx_epc_page *epc_page)
return 0;
}
+static long sgx_vepc_remove_all(struct sgx_vepc *vepc)
+{
+ struct sgx_epc_page *entry;
+ unsigned long index;
+ long failures = 0;
+
+ xa_for_each(&vepc->page_array, index, entry) {
Might be worth a comment that xa_for_each() is safe to use concurrently with
xa_load/xa_store, i.e. this doesn't need to take vepc->lock.
I considered that to be part of the xarray contract (xa_store uses
rcu_assign_pointer so it has release semantics, and vepc->page_array is
essentially "store once").
It does raise the
question of whether or not the kernel is responsible for providing deterministic
results if userspace/guest is accessing previously-unallocated pages.
Garbage in, garbage out -- but you're right below that garbage in, WARN
out is not acceptable. I'm sending a v3 with documentation changes too.
Paolo
+ int ret = sgx_vepc_remove_page(entry);
I don't see anything that prevents userspace from doing SGX_IOC_VEPC_REMOVE_ALL
on multiple threads with the same vEPC. That means userspace can induce a #GP
due to concurrent access. Taking vepc->lock would solve that particular problem,
but I think that's a moot point because the EREMOVE locking rules are relative to
the SECS, not the individual page (because of refcounting). SGX_IOC_VEPC_REMOVE_ALL
on any two arbitrary vEPCs could induce a fault if they have children belonging to
the same enclave, i.e. share an SECS.
Sadly, I think this needs to be:
if (ret == SGX_CHILD_PRESENT)
failures++;
else if (ret)
return -EBUSY;
+ switch (ret) {
+ case 0:
+ break;
+
+ case SGX_CHILD_PRESENT:
+ failures++;
+ break;
+
+ case SGX_ENCLAVE_ACT:
+ /*
+ * Unlike in sgx_vepc_free_page, userspace could be calling
+ * the ioctl while logical processors are running in the
+ * enclave; do not warn.
+ */
+ return -EBUSY;
+
+ default:
+ WARN_ONCE(1, EREMOVE_ERROR_MESSAGE, ret, ret);
+ failures++;
+ break;
+ }
+ cond_resched();
+ }
+
+ /*
+ * Return the number of pages that failed to be removed, so
+ * userspace knows that there are still SECS pages lying
+ * around.
+ */
+ return failures;
+}