On Thu, Aug 29, 2024 at 07:44:13PM +0300, Jarkko Sakkinen wrote:
> On Thu Aug 29, 2024 at 5:38 AM EEST, Aaron Lu wrote:
> > When current node doesn't have a EPC section configured by firmware and
> > all other EPC sections memory are used up, CPU can stuck inside the
> > while loop in __sgx_alloc_epc_page() forever and soft lockup will happen.
> > Note how nid_of_current will never equal to nid in that while loop because
>                                                      ~~~~
>
> Oh *that* while loop ;-) Please be more specific.

What about:

Note how nid_of_current will never be equal to nid in the while loop that
searches for an available EPC page on remote nodes, because nid_of_current
is not set in sgx_numa_mask.

> > nid_of_current is not set in sgx_numa_mask.
> >
> > Also worth mentioning is that it's perfectly fine for firmware to not
> > seup an EPC section on a node. Setting an EPC section on each node can
> > be good for performance but that's not a requirement functionality wise.
>
> This lacks any description of what is done to __sgx_alloc_epc_page().

Will add what Dave suggested on how the problem is fixed to the changelog.

> >
> > Fixes: 901ddbb9ecf5 ("x86/sgx: Add a basic NUMA allocation scheme to sgx_alloc_epc_page()")
> > Reported-by: Zhimin Luo <zhimin.luo@xxxxxxxxx>
> > Tested-by: Zhimin Luo <zhimin.luo@xxxxxxxxx>
> > Signed-off-by: Aaron Lu <aaron.lu@xxxxxxxxx>

Thanks,
Aaron
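
For anyone following the thread, the lockup condition described above can be modelled in user space. This is only a rough sketch, not the kernel code: MAX_NODES, the local next_node_in() stand-in and walk_remote_nodes() are hypothetical names made up for illustration, the node mask is a plain unsigned int rather than a nodemask_t, and every remote node is assumed to be out of EPC pages. A step cap is added so the model returns instead of spinning.

```c
#define MAX_NODES 4	/* hypothetical node count for this model */

/*
 * Simplified stand-in for the kernel helper of the same name:
 * return the next node set in @mask after @nid, wrapping around.
 * Returns MAX_NODES if the mask is empty.
 */
static int next_node_in(int nid, unsigned int mask)
{
	for (int i = 1; i <= MAX_NODES; i++) {
		int cand = (nid + i) % MAX_NODES;

		if (mask & (1u << cand))
			return cand;
	}
	return MAX_NODES;
}

/*
 * Model of the remote-node search: walk the mask starting from
 * @nid_of_current and stop once the walk comes back to it.
 * Allocation is assumed to fail on every node visited.
 *
 * Returns the number of steps taken before the exit condition
 * is hit, or -1 if @max_steps is exhausted first (the case that
 * would be an endless loop without the cap).
 */
static int walk_remote_nodes(int nid_of_current, unsigned int mask,
			     int max_steps)
{
	int nid = nid_of_current;

	for (int steps = 1; steps <= max_steps; steps++) {
		nid = next_node_in(nid, mask);
		if (nid == nid_of_current)
			return steps;
		/* allocation from @nid would be attempted here and fail */
	}
	return -1;
}
```

With EPC on nodes 0 and 2 only (mask 0x5), walk_remote_nodes(0, 0x5, 1000) gets back to node 0 after two steps, while walk_remote_nodes(1, 0x5, 1000) hits the cap and returns -1: next_node_in() only ever hands back nodes that are set in the mask, so nid can never become 1 again.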