Re: [PATCH v7 06/11] filemap: cap PTE range to be created to allowed zero fill in folio_map_range()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 13.06.24 09:57, Luis Chamberlain wrote:
On Wed, Jun 12, 2024 at 08:08:15PM +0100, Matthew Wilcox wrote:
On Fri, Jun 07, 2024 at 02:58:57PM +0000, Pankaj Raghav (Samsung) wrote:
From: Pankaj Raghav <p.raghav@xxxxxxxxxxx>

Usually the page cache does not extend beyond the size of the inode,
therefore, no PTEs are created for folios that extend beyond the size.

But with LBS support, we might extend page cache beyond the size of the
inode as we need to guarantee folios of minimum order. Cap the PTE range
to be created for the page cache up to the max allowed zero-fill file
end, which is aligned to the PAGE_SIZE.

I think this is slightly misleading because we might well zero-fill
to the end of the folio.  The issue is that we're supposed to SIGBUS
if userspace accesses pages which lie entirely beyond the end of this
file.  Can you rephrase this?

(from mmap(2))
        SIGBUS Attempted access to a page of the buffer that lies beyond the end
               of the mapped file.  For an explanation of the treatment  of  the
               bytes  in  the  page that corresponds to the end of a mapped file
               that is not a multiple of the page size, see NOTES.


The code is good though.

Reviewed-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>

Since I've been curating the respective fstests test to test for this
POSIX corner case [0] I wanted to enable the test for tmpfs instead of
skipping it as I originally had it, and that meant also realizing mmap(2)
specifically says this now:

Huge page (Huge TLB) mappings

Confusion alert: this likely talks about hugetlb (MAP_HUGETLB), not THP and friends.

So it might not be required for below changes.

...
        For mmap(), offset must be a multiple of the underlying huge page
        size. The system automatically aligns length to be a multiple of
        the underlying huge page size.

So do we need to adjust this patch with this:

diff --git a/mm/filemap.c b/mm/filemap.c
index ea78963f0956..9c8897ba90ff 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3617,6 +3617,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
  	vm_fault_t ret = 0;
  	unsigned long rss = 0;
  	unsigned int nr_pages = 0, mmap_miss = 0, mmap_miss_saved, folio_type;
+	unsigned int align = PAGE_SIZE;
rcu_read_lock();
  	folio = next_uptodate_folio(&xas, mapping, end_pgoff);
@@ -3636,7 +3637,10 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
  		goto out;
  	}
- file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1;
+	if (folio_test_pmd_mappable(folio))
+		align = 1 << folio_order(folio);
+
+	file_end = DIV_ROUND_UP(i_size_read(mapping->host), align) - 1;
  	if (end_pgoff > file_end)
  		end_pgoff = file_end;

[0] https://lore.kernel.org/all/20240611030203.1719072-3-mcgrof@xxxxxxxxxx/

   Luis


--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux