On Wed, Sep 22, 2021 at 6:42 PM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
[..]
> Hence this discussion leads me to conclude that fallocate() simply
> isn't the right interface to clear storage hardware poison state and
> it's much simpler for everyone - kernel and userspace - to provide a
> pwritev2(RWF_CLEAR_HWERROR) flag to directly instruct the IO path to
> clear hardware error state before issuing this user write to the
> hardware.

That flag would slot in nicely in dax_iomap_iter() as the gate for
whether dax_direct_access() should allow mapping over error ranges,
and then as a flag to dax_copy_from_iter() to indicate that it should
compare the incoming write to known poison and clear it before
proceeding.

I like the distinction, because there's a chance the application did
not know that the page had experienced data loss and might want the
error behavior. The other service the driver could offer with this
flag is to do a precise check of the incoming write to make sure it
overlaps known poison and then repair the entire page. Repairing whole
pages makes for a cleaner implementation of the code that tries to
keep poison out of the CPU speculation path, {set,clear}_mce_nospec().
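
To make the intended semantics concrete, here is a rough userspace
model of the behavior described above. It is not kernel code and not a
proposed patch. MODEL_RWF_CLEAR_HWERROR, struct model_dev,
model_repair_page() and model_write() are names made up for this
sketch, and the 512-byte poison granularity and 4K page size are
assumptions. "Repair the entire page" is read here as: clear every
known poison range in the page, with the lost data replaced by zeroes.

```c
/* Userspace model only; all names and sizes here are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_SECTOR_SIZE	512u	/* assumed poison tracking unit */
#define MODEL_PAGE_SIZE		4096u	/* assumed page size */
#define MODEL_NR_PAGES		4u
#define MODEL_SECTORS_PER_PAGE	(MODEL_PAGE_SIZE / MODEL_SECTOR_SIZE)
#define MODEL_NR_SECTORS	(MODEL_NR_PAGES * MODEL_SECTORS_PER_PAGE)

/* Stand-in for the proposed flag; the value is arbitrary. */
#define MODEL_RWF_CLEAR_HWERROR	0x1u

struct model_dev {
	unsigned char data[MODEL_NR_PAGES * MODEL_PAGE_SIZE];
	bool poisoned[MODEL_NR_SECTORS];	/* "known poison" list */
};

/* Clear all known poison in one page; the lost sectors read as zero. */
static void model_repair_page(struct model_dev *dev, size_t page)
{
	size_t first = page * MODEL_SECTORS_PER_PAGE;

	for (size_t s = first; s < first + MODEL_SECTORS_PER_PAGE; s++) {
		if (!dev->poisoned[s])
			continue;
		memset(dev->data + s * MODEL_SECTOR_SIZE, 0, MODEL_SECTOR_SIZE);
		dev->poisoned[s] = false;
	}
}

/*
 * Returns 0 on success, -EINVAL for a bad range, and -EIO when the write
 * overlaps known poison but the caller did not pass the clear flag.
 */
static int model_write(struct model_dev *dev, const void *buf,
		       size_t off, size_t len, unsigned int flags)
{
	assert(dev != NULL && buf != NULL);

	if (len == 0 || off >= sizeof(dev->data) ||
	    len > sizeof(dev->data) - off)
		return -EINVAL;

	size_t first = off / MODEL_SECTOR_SIZE;
	size_t last = (off + len - 1) / MODEL_SECTOR_SIZE;

	/* Precise check: only sectors the write really touches count. */
	for (size_t s = first; s <= last; s++) {
		if (!dev->poisoned[s])
			continue;
		if (!(flags & MODEL_RWF_CLEAR_HWERROR))
			return -EIO;	/* caller keeps the error behavior */
		model_repair_page(dev, s / MODEL_SECTORS_PER_PAGE);
	}

	memcpy(dev->data + off, buf, len);
	return 0;
}
```

In this model a plain write over a poisoned sector still fails with
-EIO, so an application that did not know about the data loss gets to
find out. Only a write that passes the flag and actually overlaps
known poison triggers the repair, and the repair covers the whole page
rather than just the bytes written.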