On Thu, Sep 26, 2019 at 03:20:42PM -0700, Linus Torvalds wrote:
> On Thu, Sep 26, 2019 at 1:55 PM Thomas Hellström (VMware)
> <thomas_os@xxxxxxxxxxxx> wrote:
> >
> > Well, we're working on supporting huge puds and pmds in the graphics
> > VMAs, although in the write-notify cases we're looking at here, we would
> > probably want to split them down to PTE level.
>
> Well, that's what the existing walker code does if you don't have that
> "pud_entry()" callback.
>
> That said, I assume you would *not* want to do that if the huge
> pud/pmd is already clean and read-only, but just continue.
>
> So you may want to have a special pud_entry() that handles that case.
> Eventually. Maybe. Although honestly, if you're doing dirty tracking,
> I doubt it makes much sense to use largepages.
>
> > Looking at zap_pud_range() which when called from unmap_mapping_pages()
> > uses identical locking (no mmap_sem), it seems we should be able to get
> > away with i_mmap_lock(), making sure the whole page table doesn't
> > disappear under us. So it's not clear to me why the mmap_sem is strictly
> > needed here. Better to sort those restrictions out now rather than when
> > huge entries start appearing.
>
> zap_pud_range() actually does have that
>
>         VM_BUG_ON_VMA(!rwsem_is_locked(&tlb->mm->mmap_sem), vma);

The VM_BUG_ON_VMA() is a blind copy from the PMD layer and it's bogus.
i_mmap_lock() works fine for file mappings.

The PMD check was intended for the THP case, at a time when there was
only anon-THP. The check was relaxed and later dropped for file-THP on
the PMD level. It has to be dropped on PUD too. We don't have anon-THP
on the PUD level at all, only DAX played with them.

-- 
 Kirill A. Shutemov