On Mon, Oct 11, 2021, Tony Luck wrote:
> SGX EPC pages go through the following life cycle:
>
>         DIRTY ---> FREE ---> IN-USE --\
>                     ^                 |
>                     \-----------------/
>
> Recovery action for poison for a DIRTY or FREE page is simple. Just
> make sure never to allocate the page. IN-USE pages need some extra
> handling.
>
> Add a new flag bit SGX_EPC_PAGE_IN_USE that is set when a page
> is allocated and cleared when the page is freed.
>
> Notes:
>
> 1) These transitions are made while holding the node->lock so that
>    future code that checks the flags while holding the node->lock
>    can be sure that if the SGX_EPC_PAGE_IN_USE bit is set, then the
>    page is on the free list.
>
> 2) Initially while the pages are on the dirty list the
>    SGX_EPC_PAGE_IN_USE bit is set.

This needs to state _why_ pages are marked as IN_USE from the get-go.  Ignoring
the "Notes", the whole changelog clearly states that the DIRTY state does _not_
require special handling, but then "Add SGX infrastructure to recover from
poison" goes and relies on IN_USE being set for pages on the dirty list.

Alternatively, why not invert it and have SGX_EPC_PAGE_FREE?  That would have
clear semantics, the poison recovery code wouldn't have to assume that !flags
means "free", and the whole changelog becomes:

  Add a flag to explicitly track whether or not an EPC page is on a free list;
  memory failure recovery code needs to be able to detect if a poisoned page is
  free so that recovery can know if it's safe to "steal" the page.
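
To make the inverted flag concrete, here is a rough userspace model of what I
have in mind.  It is untested against the real driver and is not a patch:
struct epc_page, struct epc_node, and the model_*() helpers are made-up
stand-ins, SGX_EPC_PAGE_FREE is only the name proposed above, and the
node->lock that would protect these transitions is elided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag from the suggestion above, not an existing define. */
#define SGX_EPC_PAGE_FREE	(1u << 0)
#define MODEL_MAX_FREE		8

/* Simplified stand-in for an EPC page descriptor. */
struct epc_page {
	unsigned int flags;
	bool poison;
};

/* Toy stand-in for a node's free list; locking is elided in this model. */
struct epc_node {
	struct epc_page *free[MODEL_MAX_FREE];
	size_t nr_free;
};

/* DIRTY or IN-USE -> FREE: set the flag as the page goes on the free list. */
static void model_free_page(struct epc_node *node, struct epc_page *page)
{
	assert(!(page->flags & SGX_EPC_PAGE_FREE));
	assert(node->nr_free < MODEL_MAX_FREE);

	page->flags |= SGX_EPC_PAGE_FREE;
	node->free[node->nr_free++] = page;
}

/* FREE -> IN-USE: clear the flag as the page leaves the free list. */
static struct epc_page *model_alloc_page(struct epc_node *node)
{
	struct epc_page *page;

	if (!node->nr_free)
		return NULL;

	page = node->free[--node->nr_free];
	page->flags &= ~SGX_EPC_PAGE_FREE;
	return page;
}

/*
 * Poison recovery: returns true if the page was free and has been "stolen"
 * from the free list.  Returns false if the page is not on a free list, in
 * which case the caller has to handle the dirty/in-use cases separately.
 */
static bool model_memory_failure(struct epc_node *node, struct epc_page *page)
{
	size_t i;

	page->poison = true;

	if (!(page->flags & SGX_EPC_PAGE_FREE))
		return false;

	/* Unlink by moving the last free-list entry into the vacated slot. */
	for (i = 0; i < node->nr_free; i++) {
		if (node->free[i] == page) {
			node->free[i] = node->free[--node->nr_free];
			break;
		}
	}
	page->flags &= ~SGX_EPC_PAGE_FREE;
	return true;
}
```

With this shape, pages on the dirty list simply start with flags == 0, and the
recovery path tests one explicit bit instead of inferring "free" from !flags.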