Re: [PATCH v3 7/7] mm/gup: Allow VM_FAULT_RETRY for multiple times

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 11, 2019 at 8:11 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> This is the gup counterpart of the change that allows the
> VM_FAULT_RETRY to happen for more than once.  One thing to mention is
> that we must check the fatal signal here before retry because the GUP
> can be interrupted by that, otherwise we can loop forever.

I still get nervous about the signal handling here.

I'm not entirely sure we get it right even before your patch series.

Right now, __get_user_pages() can return -ERESTARTSYS when it's killed:

        /*
         * If we have a pending SIGKILL, don't keep faulting pages and
         * potentially allocating memory.
         */
        if (fatal_signal_pending(current)) {
                ret = -ERESTARTSYS;
                goto out;
        }

and I don't think your series changes that.  And note how this is true
_regardless_ of any FOLL_xyz flags (and we don't pass the
FAULT_FLAG_xyz flags at all, they get generated deeper down if we
actually end up faulting things in).

So this part of the patch:

+               if (fatal_signal_pending(current))
+                       goto out;
+
                *locked = 1;
-               lock_dropped = true;
                down_read(&mm->mmap_sem);
                ret = __get_user_pages(tsk, mm, start, 1, flags | FOLL_TRIED,
-                                      pages, NULL, NULL);
+                                      pages, NULL, locked);
+               if (!*locked) {
+                       /* Continue to retry until we succeeded */
+                       BUG_ON(ret != 0);
+                       goto retry;

just makes me go "that can't be right". The fatal_signal_pending() is
pointless and would probably better be something like

        if (down_read_killable(&mm->mmap_sem) < 0)
                goto out;

and then _after_ calling __get_user_pages(), the whole "negative error
handling" should be more obvious.

The BUG_ON(ret != 0) makes me nervous, but it might be fine (I guess
the fatal signal handling has always been done before the lock is
released?).

But exactly *because* __get_user_pages() can already return on fatal
signals, I think it should also set FAULT_FLAG_KILLABLE when faulting
things in. I don't think it does right now, so it doesn't actually
necessarily check fatal signals in a timely manner (not _during_ the
fault, only _before_ it).

See what I'm reacting to?

And maybe I'm wrong. Maybe I misread the code, and your changes. And
at least some of these logic error predate your changes, I just was
hoping that since this whole "react to signals" is very much what your
patch series is working on, you'd look at this.

Ok?

            Linus




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux