Re: Kernel Benchmarking

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Thu, 17 Sep 2020 19:50:49 +0100

On Thu, Sep 17, 2020 at 11:30:00AM -0700, Linus Torvalds wrote:
> On Thu, Sep 17, 2020 at 11:23 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> >             Something like taking
> > the i_mmap_lock_read(file->f_mapping) in filemap_fault, then adding a
> > new VM_FAULT_I_MMAP_LOCKED bit so that do_read_fault() and friends add:
> >
> >         if (ret & VM_FAULT_I_MMAP_LOCKED)
> >                 i_mmap_unlock_read(vmf->vma->vm_file->f_mapping);
> >         else
> >                 unlock_page(page);
> >
> > ... want me to turn that into a real patch?
> 
> I can't guarantee it's the right model - it does worry me how many
> places we might get that i_mmap_rwlock, and how long we migth hold it
> for writing, and what deadlocks it might cause when we take it for
> reading in the page fault path.
> 
> But I think it might be very interesting as a benchmark patch and a
> trial balloon. Maybe it "just works".

Ahh.  Here's a race this doesn't close:

int truncate_inode_page(struct address_space *mapping, struct page *page)
{
        VM_BUG_ON_PAGE(PageTail(page), page);

        if (page->mapping != mapping)
                return -EIO;

        truncate_cleanup_page(mapping, page);
        delete_from_page_cache(page);
        return 0;
}

truncate_cleanup_page() does
        if (page_mapped(page)) {
                pgoff_t nr = PageTransHuge(page) ? HPAGE_PMD_NR : 1;
                unmap_mapping_pages(mapping, page->index, nr, false);
        }

but ->mapping isn't cleared until delete_from_page_cache() many
instructions later.  So we can get the lock and have a page which appears
to be not-truncated, only for it to get truncated on us later.

> I would _love_ for the page lock itself to be only (or at least
> _mainly_) about the actual IO synchronization on the page.
> 
> That was the origin of it, the whole "protect all the complex state of
> a page" behavior kind of grew over time, since it was the only
> per-page lock we had.

Yes, I think that's a noble goal.