在 2024/6/28 7:43, Peter Xu 写道:
On Thu, Jun 27, 2024 at 04:32:42PM -0700, Andrew Morton wrote:
On Thu, 27 Jun 2024 19:19:40 -0400 Peter Xu <peterx@xxxxxxxxxx> wrote:
Yang,
On Thu, Jun 27, 2024 at 03:14:13PM -0700, Yang Shi wrote:
The try_grab_folio() is supposed to be used in fast path and it elevates
folio refcount by using add ref unless zero. We are guaranteed to have
at least one stable reference in slow path, so the simple atomic add
could be used. The performance difference should be trivial, but the
misuse may be confusing and misleading.
This first paragraph is IMHO misleading itself..
I think we should mention upfront the important bit, on the user impact.
Here IMO the user impact should be: Linux may fail longterm pin in some
releavnt paths when applied over CMA reserved blocks. And if to extend a
bit, that include not only slow-gup but also the new memfd pinning, because
both of them used try_grab_folio() which used to be only for fast-gup.
It's still unclear how users will be affected. What do the *users*
see? If it's a slight slowdown, do we need to backport this at all?
The user will see the pin fails, for gpu-slow it further triggers the WARN
right below that failure (as in the original report):
folio = try_grab_folio(page, page_increm - 1,
foll_flags);
if (WARN_ON_ONCE(!folio)) { <------------------------ here
/*
* Release the 1st page ref if the
* folio is problematic, fail hard.
*/
gup_put_folio(page_folio(page), 1,
foll_flags);
ret = -EFAULT;
goto out;
}
For memfd pin and hugepd paths, they should just observe GUP failure on
those longterm pins, and it'll be the caller context to decide what user
can see, I think.
The patch itself looks mostly ok to me.
There's still some "cleanup" part mangled together, e.g., the real meat
should be avoiding the folio_is_longterm_pinnable() check in relevant
paths. The rest (e.g. switch slow-gup / memfd pin to use folio_ref_add()
not try_get_folio(), and renames) could be good cleanups.
So a smaller fix might be doable, but again I don't have a strong opinion
here.
The smaller the better for backporting, of course.
I think a smaller version might be yangge's patch, plus Yang's hugepd
"fast" parameter for the hugepd stack, then hugepd can also use
try_grab_page(). memfd-pin change can be a separate small patch perhaps
squashed.
Thanks, new version:
https://lore.kernel.org/all/1719554518-11006-1-git-send-email-yangge1116@xxxxxxx/
I'll leave how to move on to Yang.
Thanks,