Long story short, this should be good enough for the cases we can actually
handle. What am I missing?
I am not sure I follow. My point is that I fail to see any added value
in the check: it doesn't prevent the race (it fundamentally cannot, as
the page can be poisoned at any time), and the failure path doesn't
call put_page(), which is incorrect even for hwpoison pages.
Oh, I think you are right. If we have a page and return NULL we would
leak a reference.
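
To make the leak concrete, here is a minimal user-space sketch in C. It is
not the kernel code: struct toy_page, toy_get_page(), toy_put_page() and the
two lookup functions are hypothetical stand-ins, and the only assumption
carried over from the discussion is that a reference is already held when
the hwpoison check fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: only a refcount and a poison flag. */
struct toy_page {
	int refcount;
	bool hwpoison;
};

static void toy_get_page(struct toy_page *p)
{
	p->refcount++;
}

static void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/* Leaky variant: takes a reference, then returns NULL without dropping it. */
static struct toy_page *lookup_leaky(struct toy_page *p)
{
	toy_get_page(p);
	if (p->hwpoison)
		return NULL;	/* the reference taken above is never released */
	return p;
}

/* Balanced variant: drops the reference before reporting failure. */
static struct toy_page *lookup_balanced(struct toy_page *p)
{
	toy_get_page(p);
	if (p->hwpoison) {
		toy_put_page(p);
		return NULL;
	}
	return p;
}
```

With a poisoned page whose refcount starts at 1, lookup_leaky() returns NULL
and leaves the count at 2, whereas lookup_balanced() returns NULL and leaves
it at 1. The caller cannot undo the leak because it never gets a pointer back.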
Actually, in that thread we discussed handling this entirely
differently, which resulted in a v7 [1]. However, Andrew moved forward
with this (outdated?) patch. Maybe that was just a mistake?
Yes, I agree we should revert that patch for now.
Regarding the race comment: AFAIU (see e.g. [2]), it's not really a
problem with a race, but rather a corner-case issue that can happen if
we fail in memory_failure().
[1] https://lkml.kernel.org/r/20210406104123.451ee3c3@alex-virtual-machine
[2] https://lkml.kernel.org/r/20210331015258.GB22060@xxxxxxxxxxxxxxxxxxxxxxxxxxx
--
Thanks,
David / dhildenb