On 10/11/22 22:09, Zhiquan Li wrote: >>> application inside the guest. >>> >>> To fix this, send a SIGBUS to host userspace (like QEMU) which can >>> follow up by injecting a #MC to the guest. >> >> This doesn't make any sense to me. It's *ALREADY* sending a SIGBUS. >> So, whatever is making this better, it's not "send a SIGBUS" that's >> doing it. >> >> What does this patch actually do to reduce the blast radius? > > Thanks for your attention, Dave. > > This part comes from your comments previously: > > https://lore.kernel.org/linux-sgx/Yrf27fugD7lkyaek@xxxxxxxxxx/T/#m6d62670eb530fab178eefaaaf31a22c4475e818d > > The key is the SIGBUS should with code BUS_MCEERR_AR and virtual address > of virtual EPC page. Hypervisor can identify it with the specific code > and inject #MC to the guest. > > Can we change the statement like this? > > To fix this, send a SIGBUS with code BUS_MCEERR_AR and virtual > address of virtual EPC page to host userspace (like QEMU) which can > follow up by injecting a #MC to the guest. This is really just mechanically restating what the patch does. It doesn't help me understand how it achieves the goal. I guess I'll just go and write the changelog for you. Here's what I was missing: There is already a signal-based ABI to tell userspace about machine checks. But, SGX does not use that ABI. Today, the kernel delivers a generic SIGBUS if a machine check occurs when accessing SGX memory. Userspace can not differentiate that SIGBUS from <add example here>, so it is very unlikely to be able to recover from the signal and the app will die. To fix this, have the SGX machine check code generate a SIGBUS which leverages the existing BUS_MCEERR_AR ABI. This enlightens userspace about why the SIGBUS was generated and gives it a chance of being able to handle the signal. QEMU, for instance, has code to handle these BUS_MCEERR_AR signals today. Without this patch, QEMU will just die in the face of a generic SIGBUS, and take the whole VM with it. With this patch <explain what QEMU actually does here>. In short, BUS_MCEERR_AR enables QEMU to reduce the blast radius down from the whole QEMU process to a single page. This patch doesn't *actually* reduce the blast radius. It enables QEMU to do that. >>> SGX virtual EPC driver doesn't explicitly prevent virtual EPC instance >>> being shared by multiple VMs via fork(). However KVM doesn't support >>> running a VM across multiple mm structures, and the de facto userspace >>> hypervisor (Qemu) doesn't use fork() to create a new VM, so in practice >>> this should not happen. >> >> This is out of the blue. Why is this here? >> >> What happens if a hypervisor *DOES* fork()? What's the fallout? > > This part originates from below discussion: > > https://lore.kernel.org/linux-sgx/52dc7f50b68c99cecb9e1c3383d9c6d88734cd67.camel@xxxxxxxxx/#t > > It intents to answer the question: > > Do you think the processes sharing the same enclave need to be > killed, even they had not touched the EPC page with hardware error? > > Dave, do you mean it's not appropriate to be put here? It's actually a pretty important point, but it's still out of the blue. You also didn't answer my question.