On Mon, Mar 22, 2021 at 11:56:37AM -0700, Sean Christopherson wrote:
> Not necessarily. This can only trigger in the host, and thus require a host
> reboot, if the host is also running enclaves. If the CSP is not running
> enclaves, or is running its enclaves in a separate VM, then this path cannot be
> reached.

That's what I meant. Rebooting guests is a lot easier, ofc.

Or are you saying, this can trigger *only* when they're running enclaves
on the *host* too?

> EREMOVE can only fail if there's a kernel or hardware bug (or a VMM bug if
> running as a guest).

We get those on a daily basis.

> IME, nearly every kernel/KVM bug that I introduced that led to EREMOVE
> failure was also quite fatal to SGX, i.e. this is just the canary in
> the coal mine.
>
> It's certainly possible to add more sophisticated error handling, e.g. through
> the pages onto a list and periodically try to recover them. But, since the vast
> majority of bugs that cause EREMOVE failure are fatal to SGX, implementing
> sophisticated handling is quite low on the list of priorities.
>
> Dave wanted the "page leaked" error message so that it's abundantly clear that
> the kernel is leaking pages on EREMOVE failure and that the WARN isn't "benign".

So this sounds to me like this should BUG too eventually.

Or is this one of those "this should never happen" things so no one
should worry?

Whatever it is, if an admin sees this message in dmesg and doesn't get a
lengthy explanation what she/he is supposed to do, I don't think she/he
will be as relaxed. Hell, people open bugs for correctable ECCs and are
asking whether they need to replace their hardware.

So let's play this out: put yourself in an admin's shoes and tell me how
should an admin react when she/he sees that?

Should the kernel probably also say: "Don't worry, you have enough
memory and what's a 4K, who cares? You'll reboot eventually."

Or should the kernel say

"You need to reboot ASAP."

And so on...

So what is the scenario here and what kind of reaction is that message
supposed to cause, recovery action, blabla, the whole spiel?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette