On 28.04.23 19:13, Lorenzo Stoakes wrote:
On Fri, Apr 28, 2023 at 07:05:38PM +0200, David Hildenbrand wrote:
On 28.04.23 19:01, Lorenzo Stoakes wrote:
On Fri, Apr 28, 2023 at 06:51:46PM +0200, David Hildenbrand wrote:
On 28.04.23 18:39, Peter Xu wrote:
On Fri, Apr 28, 2023 at 07:22:07PM +0300, Kirill A . Shutemov wrote:
On Fri, Apr 28, 2023 at 06:13:03PM +0200, David Hildenbrand wrote:
On 28.04.23 18:09, Kirill A . Shutemov wrote:
On Fri, Apr 28, 2023 at 05:43:52PM +0200, David Hildenbrand wrote:
On 28.04.23 17:34, David Hildenbrand wrote:
On 28.04.23 17:33, Lorenzo Stoakes wrote:
On Fri, Apr 28, 2023 at 05:23:29PM +0200, David Hildenbrand wrote:
Security is the primary case where we have historically closed uAPI
items.
As this patch
1) Does not tackle GUP-fast
2) Does not take care of !FOLL_LONGTERM
I am not convinced by the security argument in regard to this patch.
If we want to sell this as a security thing, we have to block it
*completely* and then CC stable.
Regarding GUP-fast, to fix the issue there as well, I guess we could do
something similar to what I did in gup_must_unshare():
If we're in GUP-fast (no VMA), and want to pin a !anon page writable,
fallback to ordinary GUP. IOW, if we don't know, better be safe.
How do we determine it's non-anon in the first place? The check is on the
VMA. We could do it by following page tables down to folio and checking
folio->mapping for PAGE_MAPPING_ANON I suppose?
PageAnon(page) can be called from GUP-fast after grabbing a reference.
See gup_must_unshare().
IIRC, PageHuge() can also be called from GUP-fast and could special-case
hugetlb eventually, as it's stable while we hold a (temporary) reference.
Shmem might not be so easy ...
page->mapping->a_ops should be enough to whitelist whatever fs you want.
The issue is how to stabilize that from GUP-fast, such that we can safely
dereference the mapping. Any idea?
At least for anon pages I know that page->mapping only gets cleared when
freeing the page, and we don't dereference the mapping but only check a
single flag stored alongside the mapping. Therefore, PageAnon() is fine in
GUP-fast context.
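
To illustrate why checking a flag stored alongside the mapping pointer is safe, here is a minimal user-space sketch. It is a model only, not kernel code: `struct model_page`, `MODEL_MAPPING_ANON`, and both helper functions are hypothetical names invented for this example. It assumes the "anon" marker lives in the low bit of the mapping word, and it encodes the fallback policy proposed above as an assumption, not as what the patch does.

```c
#include <stdbool.h>
#include <stdint.h>

/* User-space model only; these are NOT the kernel's definitions. */
#define MODEL_MAPPING_ANON 0x1UL

struct model_page {
	uintptr_t mapping;	/* pointer value with a flag in the low bit */
};

/* Test the flag bit without ever dereferencing the pointer value. */
static bool model_page_anon(const struct model_page *page)
{
	return (page->mapping & MODEL_MAPPING_ANON) != 0;
}

/*
 * Hypothetical policy from the discussion: in the fast path (no VMA),
 * a writable pin of a non-anon page falls back to the slow path.
 */
static bool model_fast_must_fallback(const struct model_page *page,
				     bool want_write_pin)
{
	return want_write_pin && !model_page_anon(page);
}
```

The point of the model is that `model_page_anon()` only inspects bits of the word itself, so it does not matter whether the object the pointer once referred to still exists.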
Which codepath are you worried about that clears ->mapping on pages with
non-zero refcount?
I can only think of truncate (and punch hole). READ_ONCE(page->mapping)
and fail GUP_fast if it is NULL should be fine, no?
I guess we should consider if the inode can be freed from under us and the
mapping pointer becomes dangling. But I think we should be fine here too:
VMA pins inode and VMA cannot go away from under GUP.
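
The "read once, fail if NULL" idea above can be sketched in user space as follows. This is an illustrative model under stated assumptions, not kernel code: `struct model_file_page`, `model_read_once_ptr()`, and `model_fast_check_mapping()` are hypothetical names, and the volatile access merely stands in for what READ_ONCE() is meant to achieve (a single load that the compiler may not repeat). The sketch covers only the snapshot and the NULL check; it says nothing about whether the snapshot may then be safely dereferenced.

```c
#include <stdbool.h>
#include <stddef.h>

/* User-space model only; NOT the kernel's struct page. */
struct model_file_page {
	void *mapping;	/* may be cleared concurrently, e.g. by truncate */
};

/* Load the pointer exactly once through a volatile access. */
static void *model_read_once_ptr(void *const *p)
{
	return *(void *const volatile *)p;
}

/*
 * Take one snapshot of ->mapping and fail the fast path if it is NULL.
 * On success, callers must use the snapshot and never re-read the field.
 */
static bool model_fast_check_mapping(const struct model_file_page *page,
				     void **snapshot)
{
	void *mapping = model_read_once_ptr(&page->mapping);

	if (!mapping)
		return false;	/* caller falls back to the slow path */

	*snapshot = mapping;
	return true;
}
```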
Can the VMA still go away during a fast-gup?
So, after we grabbed the page and made sure the PTE didn't change (IOW,
the PTE was stable while we processed it), the page can get unmapped (but
not freed, because we hold a reference) and the VMA can theoretically go
away (and as far as I understand, nothing stops the file from getting
deleted, truncated etc).
So we might be looking at folio->mapping and the VMA is no longer there.
Maybe even the file is no longer there.
This shouldn't be an issue though right? Because after a pup call unlocks the
mmap_lock we're in the same situation anyway. GUP doesn't generally guarantee
the mapping remains valid, only pinning the underlying folio.
Yes. But the issue here is rather dereferencing something that has already
been freed, eventually leading to undefined behavior.
Is that an issue with interrupts disabled though? That will block page tables
from being removed, and as Kirill says (sorry, I may have misinterpreted you)
we should be OK.
Let's rule out page table freeing. If our VMA only spans a single page
and falls into the same PMD as another VMA, an munmap() would not even
free a single page table.
However, if unmapping a page (flushing the TLB) would imply an IPI as
Kirill said, we'd be fine. I recall that that's not the case for all
architectures, but I might just be wrong.
... and now I'll stop reading mails until Tuesday :)
--
Thanks,
David / dhildenb