On Mon, Jan 11, 2021 at 1:45 PM Tony Luck <tony.luck@xxxxxxxxx> wrote: > > Linux can now recover from machine checks where kernel code is > doing get_user() to access application memory. But there isn't > a way to distinguish whether get_user() failed because of a page > fault or a machine check. > > Thus there is a problem if any kernel code thinks it can retry > an access after doing something that would fix the page fault. > > One such example (I'm sure there are more) is in futex_wait_setup() > where an attempt to read the futex with page faults disabled. Then > a retry (after dropping a lock so page faults are safe): > > > ret = get_futex_value_locked(&uval, uaddr); > > if (ret) { > queue_unlock(*hb); > > ret = get_user(uval, uaddr); > > It would be good to avoid deliberately taking a second machine > check (especially as the recovery code does really bad things > and ends up in an infinite loop!). > > V2 (thanks to feedback from PeterZ) fixes this by changing get_user() to > return -ENXIO ("No such device or address") for the case where a machine > check occurred. Peter left it open which error code to use (suggesting > "-EMEMERR or whatever name we come up with"). I think the existing ENXIO > error code seems appropriate (the address being accessed has effectively > gone away). But I don't have a strong attachment if anyone thinks we > need a new code. > > Callers can check for ENXIO in paths where the access would be > retried so they can avoid a second machine check. > I don't love this -- I'm concerned that there will be some code path that expects a failing get_user() to return -EFAULT, not -ENXIO. Also, get_user() *can* return -EFAULT when it hits bad memory even with your patch if the recovery code manages to yank the PTE before get_user(). So I tend to think that the machine check code should arrange to survive some reasonable number of duplicate machine checks.