On 12.07.23 01:45, Suren Baghdasaryan wrote:
On Tue, Jul 11, 2023 at 3:21 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
On Tue, Jul 11, 2023 at 02:55:20PM -0700, Suren Baghdasaryan wrote:
On Tue, Jul 11, 2023 at 12:21 AM Dan Carpenter <dan.carpenter@xxxxxxxxxx> wrote:
Hello Suren Baghdasaryan,
The patch 1c71222e5f23: "mm: replace vma->vm_flags direct
modifications with modifier calls" from Jan 26, 2023, leads to the
following Smatch static checker warning:
./include/linux/mm.h:729 vma_start_write()
warn: sleeping in atomic context
include/linux/mm.h
722 static inline void vma_start_write(struct vm_area_struct *vma)
723 {
724 int mm_lock_seq;
725
726 if (__is_vma_write_locked(vma, &mm_lock_seq))
727 return;
728
--> 729 down_write(&vma->vm_lock->lock);
730 vma->vm_lock_seq = mm_lock_seq;
731 up_write(&vma->vm_lock->lock);
732 }
The call tree is:
gru_fault() <- disables preempt
-> remap_pfn_range()
-> track_pfn_remap()
-> remap_pfn_range_notrack()
-> vm_flags_set()
-> vma_start_write()
Before track_pfn_remap() and remap_pfn_range_notrack() would just do |=
to set the flags but now they use vm_flags_set() so there is a potential
they could sleep.
Hi Dan,
Thanks for reporting! Looks like the page fault handler is modifying
the VMA flags, which has to be done under write-locked mmap_lock and I
don't see that being done here... I wonder if that should be allowed.
I'm CC'ing some MM folks to check if this is a valid VMA modification
and should be allowed. Matthew, this might be especially interesting
for you since gru_fault() handles file-backed page faults AFAIKT.
I don't run the ->fault handler under RCU, only the ->map_pages()
method. I don't intend to change that.
Back to the issue at hand. If such modification should be indeed
allowed then the simplest fix I think would be to add new
remap_pfn_range_locked() function to be called from gru_fault() which
would use __vm_flags_mod() instead of vm_flags_set(). __vm_flags_mod()
does not lock the VMA, so would not have this issue. If the conclusion
is that this is a valid scenario then I can post a fix I described.
I'm not certain, but calling remap_pfn_range() in the fault handler
is definitely weird. It's normally called _instead_ of having a fault
handler. The fault handler usually calls set_pte_at() directly.
Hmm. Is it weird enough to be considered invalid or weird but still ok?
Also, is it ok to modify VMA flags here without write-locking the
mmap_lock (and without write-locking the VMA)? The fault handler is
done under read-locked mmap_lock but I thought VMA modifications
require stronger locking...
The "easy" fix would be to have something like "remap_pfn_range_prepare"
that the remap_pfn_range() caller calls during mmap().
We can let that one set these flags, and then we can later let
remap_pfn_range() fail if the relevant flags are not set.
It's certainly one of these "always done like that and suboptimal but
somehow it worked" thingies.
I suspect, because only the first pagefault->remap_pfn_range() will
actually modify the VMA flags, that this is ok. Otherwise, there usually
wouldn't be any pagetables and nothing mapped, so who really cares about
these VMA flags (e.g., GUP cannot pin anything if nothing is mapped).
--
Cheers,
David / dhildenb