On Fri 2019-05-24 15:22:51, Hugh Dickins wrote:
> On Wed, 22 May 2019, Sebastian Andrzej Siewior wrote:
> > On 2019-05-22 12:21:13 [-0700], Andrew Morton wrote:
> > > On Tue, 14 May 2019 17:29:55 +0300 Mike Rapoport <rppt@xxxxxxxxxxxxx> wrote:
> > >
> > > > When get_user_pages*() is called with pages = NULL, the processing of
> > > > VM_FAULT_RETRY terminates early without actually retrying to fault-in all
> > > > the pages.
> > > >
> > > > If the pages in the requested range belong to a VMA that has userfaultfd
> > > > registered, handle_userfault() returns VM_FAULT_RETRY *after* user space
> > > > has populated the page, but for the gup pre-fault case there's no actual
> > > > retry and the caller will get no pages although they are present.
> > > >
> > > > This issue was uncovered when running post-copy memory restore in CRIU
> > > > after commit d9c9ce34ed5c ("x86/fpu: Fault-in user stack if
> > > > copy_fpstate_to_sigframe() fails").
>
> I've been getting unexplained segmentation violations, and "make" giving
> up early, when running kernel builds under swapping memory pressure: no
> CRIU involved.
>
> Bisected last night to that same x86/fpu commit, not itself guilty, but
> suffering from the odd behavior of get_user_pages_unlocked() giving up
> too early.
>
> (I wondered at first if copy_fpstate_to_sigframe() ought to retry if
> non-negative ret < nr_pages, but no, that would be wrong: a present page
> followed by an invalid area would repeatedly return 1 for nr_pages 2.)
>
> Cc'ing Pavel, who's been having segfault trouble in emacs: maybe same?

The emacs segfault was always during process exit. This sounds
different... I don't see problems with make. But it's true that at
least one of the affected machines uses swap heavily.

Best regards,
									Pavel
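
Hugh's parenthetical (why the caller cannot simply retry on a short, non-negative return) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: model_gup() and naive_retry() are hypothetical names, not kernel functions, the model ignores error codes and locking entirely, and max_attempts exists only so the demonstration terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model, NOT the kernel API: count the consecutive valid
 * pages from the start of the range, stopping at the first invalid one.
 * Real gup error returns (e.g. -EFAULT) are deliberately not modelled.
 */
static long model_gup(const bool *valid, size_t nr_pages)
{
	size_t i;

	assert(nr_pages > 0);
	for (i = 0; i < nr_pages; i++)
		if (!valid[i])
			break;
	return (long)i;
}

/*
 * Naive caller: retry for as long as the result is non-negative but
 * short of nr_pages. Returns the number of attempts made; the cap
 * max_attempts stands in for "forever".
 */
static int naive_retry(const bool *valid, size_t nr_pages, int max_attempts)
{
	int attempts = 0;
	long ret;

	do {
		ret = model_gup(valid, nr_pages);
		attempts++;
	} while (ret >= 0 && (size_t)ret < nr_pages && attempts < max_attempts);

	return attempts;
}
```

With a present page followed by an invalid one, model_gup() keeps returning 1 for nr_pages 2, so naive_retry() only stops when it hits the cap; with both pages valid it returns after a single attempt.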