On Tue, 14 May 2019 17:29:55 +0300 Mike Rapoport <rppt@xxxxxxxxxxxxx> wrote:

> When get_user_pages*() is called with pages = NULL, the processing of
> VM_FAULT_RETRY terminates early without actually retrying to fault-in all
> the pages.
>
> If the pages in the requested range belong to a VMA that has userfaultfd
> registered, handle_userfault() returns VM_FAULT_RETRY *after* user space
> has populated the page, but for the gup pre-fault case there's no actual
> retry and the caller will get no pages although they are present.
>
> This issue was uncovered when running post-copy memory restore in CRIU
> after commit d9c9ce34ed5c ("x86/fpu: Fault-in user stack if
> copy_fpstate_to_sigframe() fails").
>
> After this change, the copying of FPU state to the sigframe switched from
> copy_to_user() variants which caused a real page fault to get_user_pages()
> with pages parameter set to NULL.

You're saying that argument buf_fx in copy_fpstate_to_sigframe() is NULL?

If so was that expected by the (now cc'ed) developers of
d9c9ce34ed5c8923 ("x86/fpu: Fault-in user stack if
copy_fpstate_to_sigframe() fails")?

It seems rather odd.  copy_fpregs_to_sigframe() doesn't look like it's
expecting a NULL argument.

Also, I wonder if copy_fpstate_to_sigframe() would be better using
fault_in_pages_writeable() rather than get_user_pages_unlocked().  That
seems like it operates at a more suitable level and I guess it will fix
this issue also.

> In post-copy mode of CRIU, the destination memory is managed with
> userfaultfd and lack of the retry for pre-fault case in get_user_pages()
> causes a crash of the restored process.
>
> Making the pre-fault behavior of get_user_pages() the same as the "normal"
> one fixes the issue.

Should this be backported into -stable trees?

> Fixes: d9c9ce34ed5c ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails")
> Signed-off-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>
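
For readers following along, the early termination described in the quoted
changelog can be modeled in user space roughly as below. This is only a toy
sketch under stated assumptions, not kernel code: toy_mm, toy_fault_in(),
toy_gup_pass() and toy_gup() are hypothetical names invented for illustration,
the "retry" return stands in for VM_FAULT_RETRY, and the retry_on_prefault
flag merely switches between the behavior the changelog complains about and
the behavior it asks for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES 4

/* Toy address space: present[i] records whether page i is populated. */
struct toy_mm {
	bool present[TOY_NR_PAGES];
};

/*
 * Toy userfaultfd-style fault: a missing page is populated, but the
 * handler still asks the caller to retry (stand-in for VM_FAULT_RETRY).
 * Returns true when a retry is required.
 */
static bool toy_fault_in(struct toy_mm *mm, size_t idx)
{
	if (mm->present[idx])
		return false;
	mm->present[idx] = true;
	return true;
}

/* One walk over [start, start + nr); stops at the first retry request. */
static size_t toy_gup_pass(struct toy_mm *mm, size_t start, size_t nr,
			   bool *retry)
{
	size_t done = 0;

	*retry = false;
	while (done < nr) {
		if (toy_fault_in(mm, start + done)) {
			*retry = true;
			break;
		}
		done++;
	}
	return done;
}

/*
 * Toy outer loop. With want_pages == false (the pages = NULL pre-fault
 * case) and retry_on_prefault == false, the loop gives up after the
 * first retry request; otherwise it keeps retrying until all pages
 * in the range have been walked.
 */
static size_t toy_gup(struct toy_mm *mm, size_t nr, bool want_pages,
		      bool retry_on_prefault)
{
	size_t total = 0;

	assert(nr <= TOY_NR_PAGES);
	while (total < nr) {
		bool retry;

		total += toy_gup_pass(mm, total, nr - total, &retry);
		if (!retry)
			break;
		if (!want_pages && !retry_on_prefault)
			return total;	/* early termination, no retry */
	}
	return total;
}
```

In this model, a pre-fault call without retry on an empty toy_mm reports 0
pages even though the first page has been populated by then, while the same
call with retry_on_prefault set walks the whole range and reports all
TOY_NR_PAGES pages.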