On 5/4/22 16:36, Reinette Chatre wrote: > The page fault handler cannot check for SGX_ENCL_PAGE_BEING_RECLAIMED > until the reclaimer releases the mutex. Ahh, got it now. I thought the mutex wasn't held on the reclaimer side. Thanks for the refresher. >> One other thing. The role of encl->lock here is very important. >> Without it, two concurrent page faults could do their individual >> memset(), each see !pcmd_page_empty, then decline to truncate the page. > > Your argument is not clear to me. Yes, this would be an issue (and > that will not be the only issue) if the enclave mutex is not held > by the page fault handler ... but the enclave mutex _is_ held by the > page fault handler. It would just be nice to clearly state the locking logic in the eventual changelog. >> Also, given the challenges here, I do think we should check the >> pcmd_page after truncate to ensure it is still all zero's. > > Could you please elaborate the challenges? I do not think it would > help the issue at hand at all to add a check since this is done with > the mutex held while the reclaimer has a reference to the page but no > mutex ... as long as the page fault handler has that mutex it can write > and check the PCMD page all it want, the new changes would occur to the > PCMD page when it (the page fault handler) releases the mutex and the > reclaimer attempts to use the page it has a reference to. I'm just asking for a debugging check that looks at the page *after* is has been truncated to ensure it is zero, in case another bug creeps in here.