>> So it should be safe to grab and hold a mutex. See patch below.
>
> The mutex approach looks simpler and safer, so I'm fine with it.

Thanks. Is that an "Acked-by:"?

>>  /**
>>   * memory_failure - Handle memory failure of a page.
>>   * @pfn: Page Number of the corrupted page
>> @@ -1424,12 +1426,18 @@ int memory_failure(unsigned long pfn, int flags)
>>  		return -ENXIO;
>>  	}
>>
>> +	mutex_lock(&mf_mutex);
>
> Is it better to take mutex before memory_failure_dev_pagemap() block?
> Or we don't have to protect against race for device memory?

No races (recovery is only attempted for errors in normal memory).

-Tony