When the current node doesn't have an EPC section configured by firmware
and the memory of all other EPC sections is used up, the CPU can get
stuck inside the while loop in __sgx_alloc_epc_page() forever and a soft
lockup will happen. Note how nid_of_current will never equal nid in that
while loop because nid_of_current is not set in sgx_numa_mask.

Also worth mentioning is that it's perfectly fine for firmware to not
set up an EPC section on a node. Setting up an EPC section on each node
can be good for performance but it is not a functional requirement.

Fixes: 901ddbb9ecf5 ("x86/sgx: Add a basic NUMA allocation scheme to sgx_alloc_epc_page()")
Reported-by: Zhimin Luo <zhimin.luo@xxxxxxxxx>
Tested-by: Zhimin Luo <zhimin.luo@xxxxxxxxx>
Signed-off-by: Aaron Lu <aaron.lu@xxxxxxxxx>
---
This issue was found by Zhimin during internal testing and no external
bug report has been sent out, so there is no Closes: tag.

 arch/x86/kernel/cpu/sgx/main.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 1a000acd933a..694fcf7a5e3a 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -475,24 +475,25 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void)
 {
 	struct sgx_epc_page *page;
 	int nid_of_current = numa_node_id();
-	int nid = nid_of_current;
+	int nid_start, nid;
 
-	if (node_isset(nid_of_current, sgx_numa_mask)) {
-		page = __sgx_alloc_epc_page_from_node(nid_of_current);
-		if (page)
-			return page;
-	}
-
-	/* Fall back to the non-local NUMA nodes: */
-	while (true) {
-		nid = next_node_in(nid, sgx_numa_mask);
-		if (nid == nid_of_current)
-			break;
+	/*
+	 * Try local node first. If it doesn't have an EPC section,
+	 * fall back to the non-local NUMA nodes.
+	 */
+	if (node_isset(nid_of_current, sgx_numa_mask))
+		nid_start = nid_of_current;
+	else
+		nid_start = next_node_in(nid_of_current, sgx_numa_mask);
 
+	nid = nid_start;
+	do {
 		page = __sgx_alloc_epc_page_from_node(nid);
 		if (page)
 			return page;
-	}
+
+		nid = next_node_in(nid, sgx_numa_mask);
+	} while (nid != nid_start);
 
 	return ERR_PTR(-ENOMEM);
 }

base-commit: a85536e1bce722cb184abbac98068217874bdd6e
-- 
2.45.2
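
For reviewers who want to see the termination argument outside the
kernel, here is a minimal user-space sketch of the two loop shapes. It
is not kernel code: MODEL_MAX_NODES, MODEL_ITERATION_CAP,
model_next_node_in(), old_walk() and new_walk() are hypothetical names
made up for this illustration, the node mask is modelled as a plain
bitmask, and every allocation attempt is assumed to fail (all EPC
memory used up). The iteration cap only stands in for "forever".

```c
#include <assert.h>

#define MODEL_MAX_NODES     8     /* hypothetical node count for this sketch */
#define MODEL_ITERATION_CAP 1000  /* stand-in for "loops forever" */

/*
 * Rough model of next_node_in(): the next set bit after nid, wrapping
 * around to the first set bit. Returns MODEL_MAX_NODES for an empty mask.
 */
static int model_next_node_in(int nid, unsigned int mask)
{
	for (int i = 1; i <= MODEL_MAX_NODES; i++) {
		int candidate = (nid + i) % MODEL_MAX_NODES;

		if (mask & (1u << candidate))
			return candidate;
	}
	return MODEL_MAX_NODES;
}

/*
 * Old loop shape. Returns the number of nodes tried before giving up,
 * or -1 if the walk is still running when the cap is reached.
 */
static int old_walk(int nid_of_current, unsigned int mask)
{
	int nid = nid_of_current;
	int tried = 0;

	if (mask & (1u << nid_of_current))
		tried++;	/* local attempt, assumed to fail */

	for (int i = 0; i < MODEL_ITERATION_CAP; i++) {
		nid = model_next_node_in(nid, mask);
		if (nid == nid_of_current)
			return tried;
		tried++;	/* remote attempt, assumed to fail */
	}
	return -1;
}

/* New loop shape. Returns the number of nodes tried before giving up. */
static int new_walk(int nid_of_current, unsigned int mask)
{
	int nid_start, nid;
	int tried = 0;

	assert(mask != 0);	/* sketch assumes at least one node has EPC */

	if (mask & (1u << nid_of_current))
		nid_start = nid_of_current;
	else
		nid_start = model_next_node_in(nid_of_current, mask);

	nid = nid_start;
	do {
		tried++;	/* attempt on nid, assumed to fail */
		nid = model_next_node_in(nid, mask);
	} while (nid != nid_start);

	return tried;
}
```

With a mask of 0x6 (only nodes 1 and 2 have EPC) and the task running
on node 0, old_walk() keeps cycling between nodes 1 and 2 until it hits
the cap, while new_walk() tries each of the two nodes once and stops.
When the task runs on node 1, both walks try two nodes and terminate.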