Re: [PATCH] filemap: Handle error return from __filemap_get_folio()

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Wed, 10 May 2023 16:33:26 -0500

On Wed, May 10, 2023 at 3:27 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> The change looks all right to me.

I hate how this path has turned very architecture-specific again, and
this patch makes it worse.

For a short while we shared a lot of the code between architectures
thanks to the new fault helpers (ie fault_signal_pending() and
friends). Now CONFIG_PER_VMA_LOCK has made a mockery of that, and this
patch makes it worse by just fixing x86.

As is historically always the case.

And I think that patch is actually buggy. Because it does that

     ret = VM_FAULT_SIGBUS;
     if (fpin)
             goto out_retry;

in the fault path, and now returns VM_FAULT_SIGBUS | VM_FAULT_RETRY as a result.

And there are a lot of users that will be *very* unhappy about that combination.

Look at mm/gup.c, for example. It will do

        if (ret & VM_FAULT_ERROR) {
                int err = vm_fault_to_errno(ret, *flags);

                if (err)
                        return err;
       ...

and not react to VM_FAULT_RETRY being set at all - so it won't clear
the "locked" variable, for example.

And that all just makes it very clear how much of a painpoint that
conditional lock release is. It's horrendous.

That's always a huge pain in general, and it really complicates how
things get handled. It has very real and obvious historical reasons
("we never released the lock" being converted one case at a time to
"we release the lock in this situation and set the bit to tell you"),
but I wonder if we should just say "handle_mm_fault() _always_
releases the lock".

We'd still keep the RETRY bit as a "this did not complete, you need to
retry", but at least the whole secondary meaning of "oh, and if it
isn't set, you need to release the lock you took" would go away.
Because RETRY has really annoying semantics today.

Again - I know exactly _why_ it has those horrendous semantics, and
I'm not blaming anybody, but it makes it really painful to deal with
15 different architectures that then have to deal with those things.

How painful would it be to try to take baby steps and remove those
kinds of things and slowly move towards a situation where RETRY isn't
such a magically special bit?

Peter Xu, you did a lot of the helper function cleanups, and moved
some of the accounting code into generic code. It's a few years ago,
but maybe you still remember this area.. And Luto, I'm adding you to
the participants because you've touched that code more than most.

Could we make the rule be that handle_mm_fault() just *always*
releases the lock? How painful would that be?

For many of the architectures, I *think* it would mean just removing the

        mmap_read_unlock(mm);

in the fault handling path (after the RETRY test).

But there's a couple of other uses of handle_mm_fault() too, notably GUP.

Am I just dreaming? Or could we try to simplify this area first?
Because I think Johannes' patch right now is quite buggy, and only
converted one caller, when a lot of other callers will currently find
it very problematic to see both VM_FAULT_RETRY and one of the error
bits..

It's also possible that I'm misreading things. But I really think the
lock handling would be a lot easier if the rule was "it is locked on
entry, it's always unlocked on exit".

Of course, with the vma locking, the "it" above is nontrivial too.
That sure didn't simplify thinking about this all....

              Linus