Re: [PATCH v9 0/3] x86/sgx: fine grained SGX MCA behavior

On 2022/9/20 14:39, Zhiquan Li wrote:
> V8: https://lore.kernel.org/linux-sgx/20220913145330.2998212-1-zhiquan1.li@xxxxxxxxx/T/#t
> 
> Changes since V8:
> - Remove excess Acked-by from patch 02 and 03.
> 
> V7: https://lore.kernel.org/linux-sgx/YxEyRT2SbfBdYNfm@xxxxxxxxxx/T/#t
> 
> Changes since V7:
> - Enrich the motivation for renaming in commit message of patch 01 with
>   the explanation from Kai.
> - Add Acked-by from Jarkko.
>   Link: https://lore.kernel.org/linux-sgx/YxEyRT2SbfBdYNfm@xxxxxxxxxx/T/#mc1c93e7d9643588b27cefa9540f988a070469b5b
> - Add Acked-by from Kai Huang at patch 01.
> 
> V6: https://lore.kernel.org/linux-sgx/20220826160503.1576966-1-zhiquan1.li@xxxxxxxxx/T/#t
> 
> Changes since V6:
> - Revise the commit message of patch 01 suggested by Jarkko.
> - Fix build warning due to type changes.
> 
> V5: https://lore.kernel.org/linux-sgx/Yrf27fugD7lkyaek@xxxxxxxxxx/T/#t
> 
> Changes since V5:
> - Rename the 'owner' field as 'encl_owner' and update the references
>   as a separate patch.
> - To prevent casting the 'encl_owner' field, introduce a union with
>   another field - "vepc_vaddr", suggested by Dave Hansen.
> - Clean up the commit message of patch 02 suggested by Dave Hansen.
> - Remove patch 03 unless we have better reason to keep it.
> - Add Reviewed-by from Jarkko.
>   Link: https://lore.kernel.org/linux-sgx/Yrf27fugD7lkyaek@xxxxxxxxxx/T/#m379d00fc7f1d43726a42b3884637532061a8c0d1
> 
> V4: https://lore.kernel.org/linux-sgx/20220608032654.1764936-1-zhiquan1.li@xxxxxxxxx/T/#t
> 
> Changes since V4:
> - Switch the order of the two variables at patch 02 so all variables
>   are in reverse Christmas tree order.
> - Do not initialize 'ret' because it will be overridden by the return
>   value of force_sig_mceerr() unconditionally.
> - Add Co-developed-by and Signed-off-by from Cathy Zhang at patch 01.
> - Add Acked-by from Kai Huang at patch 01.
> 
> V3: https://lore.kernel.org/linux-sgx/41704e5d4c03b49fcda12e695595211d950cfb08.camel@xxxxxxxxxx/T/#t
> 
> Changes since V3:
> - Take the definition of EPC page flag SGX_EPC_PAGE_KVM_GUEST from
>   Cathy Zhang's third patch of SGX rebootless recovery patch set but
>   discard irrelevant portion, since it might need some time to re-forge
>   and these are two different features.
>   Link: https://lore.kernel.org/linux-sgx/41704e5d4c03b49fcda12e695595211d950cfb08.camel@xxxxxxxxxx/T/#m9782d23496cacecb7da07a67daa79f4b322ae170
> 
> V2: https://lore.kernel.org/linux-sgx/694234d7-6a0d-e85f-f2f9-e52b4a61e1ec@xxxxxxxxx/T/#t
> 
> Changes since V2:
> - Repurpose the owner field as the virtual address of virtual EPC page
> - Remove struct sgx_vepc_page and relevant code.
> - Remove patch 01 as the changes are not necessary in new design.
> - Rework patch 02 suggested by Jarkko.
> - Adapt patch 03 and 04 since struct sgx_vepc_page was discarded.
> - Replace EPC page flag SGX_EPC_PAGE_IS_VEPC with
>   SGX_EPC_PAGE_KVM_GUEST as they are duplicated.
>   Link: https://lore.kernel.org/linux-sgx/eb95b32ecf3d44a695610cf7f2816785@xxxxxxxxx/T/#u
> 
> V1: https://lore.kernel.org/linux-sgx/443cb425-009c-2784-56f4-5e707122de76@xxxxxxxxx/T/#t
> 
> Changes since V1:
> - Updated cover letter and commit messages, added valuable
>   information from Jarkko, Tony and Kai's comments.
> - Added documentation for struct sgx_vepc and
>   struct sgx_vepc_page.
> 
> Hi everyone,
> 
> This series contains a few patches to make SGX MCA behavior fine-grained.
> Today, if a guest accesses an SGX EPC page with a memory failure,
> the kernel kills the entire guest.  This blast radius is
> too large.  It would be ideal to kill only the SGX application inside
> the guest.
> 
> To fix this, send a SIGBUS to host userspace (like QEMU) which can
> follow up by injecting a #MC to the guest.
> However, when a page triggers a machine check, only the PFN is
> reported.  But in order to inject #MC into the hypervisor, the virtual
> address is required.  The 'encl_owner' field is unused in the
> virtualization case, so repurpose it as 'vepc_vaddr', the virtual
> address of the virtual EPC page, so that arch_memory_failure() can
> easily retrieve it.
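
For readers following the thread, here is a minimal userspace sketch of the
idea described above. It is not the patch itself: every name prefixed with
sketch_/SKETCH_ is hypothetical, the struct layout and the flag bit value are
assumptions, and the real definitions live in arch/x86/kernel/cpu/sgx/sgx.h.
It only illustrates how one slot can hold either the enclave owner or the
virtual EPC address, with a flag selecting which reading is valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit value for this sketch; the real flag is defined by the series. */
#define SKETCH_EPC_PAGE_KVM_GUEST (1u << 2)

struct sketch_encl_page;	/* opaque stand-in for the enclave page owner */

/* Userspace stand-in for struct sgx_epc_page; the layout is an assumption. */
struct sketch_epc_page {
	unsigned int section;
	uint16_t flags;
	uint16_t poison;
	union {
		struct sketch_encl_page *encl_owner;	/* host enclave case */
		void *vepc_vaddr;			/* virtual EPC case */
	};
};

/*
 * Return the address to report for a poisoned page: the stored virtual
 * address for a KVM guest page, or NULL for a host enclave page, where
 * the union slot holds an owner pointer instead of an address.
 */
static void *sketch_mce_vaddr(const struct sketch_epc_page *page)
{
	assert(page != NULL);
	if (page->flags & SKETCH_EPC_PAGE_KVM_GUEST)
		return page->vepc_vaddr;
	return NULL;
}
```

The union avoids casting 'encl_owner' at every use in the virtualization
path, which is the motivation given in the V5 changelog above.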
> 
> Suppose an enclave is shared by multiple processes.  When an enclave
> page triggers a machine check, the enclave is disabled so that
> it cannot be entered again.  Killing other processes with the same
> enclave mapped would perhaps be overkill, but they are going to find
> that the enclave is "dead" next time they try to use it.  Thanks to
> Jarkko for the heads-up and to Tony for the clarification on this point.
> Unlike host enclaves, a virtual EPC instance cannot be shared by multiple
> VMs, because how enclaves are created is totally up to the guest.
> Sharing a virtual EPC instance would very likely break enclaves in
> all VMs unexpectedly.
> 
> The SGX virtual EPC driver doesn't explicitly prevent a virtual EPC
> instance from being shared by multiple VMs via fork().  However, KVM
> doesn't support running a VM across multiple mm structures, and the de
> facto userspace hypervisor (QEMU) doesn't use fork() to create a new
> VM, so in practice this should not happen.
> 
> This series is based on tip/x86/sgx.
> 
> Tests:
> 1. MCE injection test for SGX in VM.
>    As expected, the application was killed and the VM stayed alive.
> 2. Kernel selftest/sgx: PASS
> 3. Internal SGX stress test: PASS
> 4. kmemleak test: No memory leakage detected.
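
As a rough illustration of what the host side of test 1 observes, the sketch
below classifies a received SIGBUS by its si_code. It assumes the kernel
delivers the signal through force_sig_mceerr() with BUS_MCEERR_AR or
BUS_MCEERR_AO, as on other Linux memory-failure paths. The enum and function
names are hypothetical, and how a VMM such as QEMU actually turns the address
into a guest #MC is not shown here.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical result type for this sketch. */
enum sketch_mce_action {
	SKETCH_NOT_MCE,			/* not a memory-failure SIGBUS */
	SKETCH_MCE_ACTION_REQUIRED,	/* BUS_MCEERR_AR: poison was consumed */
	SKETCH_MCE_ACTION_OPTIONAL,	/* BUS_MCEERR_AO: poison found, not consumed */
};

/*
 * Inspect a siginfo_t and, for a machine-check SIGBUS, store the faulting
 * virtual address in *addr. For anything else, *addr is set to NULL.
 */
static enum sketch_mce_action
sketch_classify_sigbus(const siginfo_t *si, void **addr)
{
	assert(addr != NULL);
	*addr = NULL;

	if (si == NULL || si->si_signo != SIGBUS)
		return SKETCH_NOT_MCE;

	switch (si->si_code) {
	case BUS_MCEERR_AR:
		*addr = si->si_addr;
		return SKETCH_MCE_ACTION_REQUIRED;
	case BUS_MCEERR_AO:
		*addr = si->si_addr;
		return SKETCH_MCE_ACTION_OPTIONAL;
	default:
		return SKETCH_NOT_MCE;
	}
}
```

A real handler would call such a helper from a SA_SIGINFO signal handler and
map the returned host virtual address back to a guest physical address.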
> 
> Your feedback is much appreciated.
> 
> Best Regards,
> Zhiquan
> 
> Zhiquan Li (3):
>   x86/sgx: Rename the owner field of struct sgx_epc_page as encl_owner
>   x86/sgx: Introduce union with vepc_vaddr field for virtualization case
>   x86/sgx: Fine grained SGX MCA behavior for virtualization
> 
>  arch/x86/kernel/cpu/sgx/main.c | 48 +++++++++++++++++++++++++---------
>  arch/x86/kernel/cpu/sgx/sgx.h  |  8 +++++-
>  arch/x86/kernel/cpu/sgx/virt.c |  4 ++-
>  3 files changed, 46 insertions(+), 14 deletions(-)
> 
Hello Boris,

Could you please take a look at this common feature, which was requested
by a CSP customer who has already deployed SGX instances?

The patch set has Acked-by/Reviewed-by tags from Jarkko and Kai, and the
design has been stable for quite a while, with thorough validation running
in the background.

Could you shepherd the patch set into the SGX tip tree for the
v6.1 merge window?

Much appreciated.

Best Regards,
Zhiquan


