Re: [PATCH 11/35] nilfs2: Convert nilfs_page_mkwrite() to use a folio

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 09, 2023 at 10:11:27PM +0900, Ryusuke Konishi wrote:
> On Tue, Nov 7, 2023 at 2:39 AM Matthew Wilcox (Oracle) wrote:
> >
> > Using the new folio APIs saves seven hidden calls to compound_head().
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> > ---
> >  fs/nilfs2/file.c | 28 +++++++++++++++-------------
> >  1 file changed, 15 insertions(+), 13 deletions(-)
> 
> I'm still in the middle of reviewing this series, but I had one
> question in a relevant part outside of this patch, so I'd like to ask
> you a question.
> 
> In block_page_mkwrite() that nilfs_page_mkwrite() calls,
> __block_write_begin_int() was called with the range using
> folio_size(), as shown below:
> 
>         end = folio_size(folio);
>         /* folio is wholly or partially inside EOF */
>         if (folio_pos(folio) + end > size)
>                 end = size - folio_pos(folio);
> 
>         ret = __block_write_begin_int(folio, 0, end, get_block, NULL);
>         ...
> 
> On the other hand, __block_write_begin_int() takes a folio as an
> argument, but uses a PAGE_SIZE-based remainder calculation and BUG_ON
> checks:
> 
> int __block_write_begin_int(struct folio *folio, loff_t pos, unsigned len,
>                 get_block_t *get_block, const struct iomap *iomap)
> {
>         unsigned from = pos & (PAGE_SIZE - 1);
>         unsigned to = from + len;
>         ...
>         BUG_ON(from > PAGE_SIZE);
>         BUG_ON(to > PAGE_SIZE);
>         ...
> 
> So, it looks like this function causes a kernel BUG if it's called
> from block_page_mkwrite() and folio_size() exceeds PAGE_SIZE.
> 
> Is this constraint intentional or temporary in folio conversions ?

Good catch!

At the time I converted __block_write_begin_int() to take a folio
(over two years ago), the plan was that filesystems would convert to
the iomap infrastructure in order to take advantage of large folios.

The needs of the Large Block Size project mean that may not happen;
filesystems want to add support for, eg, 16kB hardware block sizes
without converting to iomap.  So we shoud fix up
__block_write_begin_int().  I'll submit a patch along these lines:

diff --git a/fs/buffer.c b/fs/buffer.c
index cd114110b27f..24a5694f9b41 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2080,25 +2080,24 @@ iomap_to_bh(struct inode *inode, sector_t block, struct buffer_head *bh,
 int __block_write_begin_int(struct folio *folio, loff_t pos, unsigned len,
 		get_block_t *get_block, const struct iomap *iomap)
 {
-	unsigned from = pos & (PAGE_SIZE - 1);
-	unsigned to = from + len;
+	size_t from = offset_in_folio(folio, pos);
+	size_t to = from + len;
 	struct inode *inode = folio->mapping->host;
-	unsigned block_start, block_end;
+	size_t block_start, block_end;
 	sector_t block;
 	int err = 0;
 	unsigned blocksize, bbits;
 	struct buffer_head *bh, *head, *wait[2], **wait_bh=wait;
 
 	BUG_ON(!folio_test_locked(folio));
-	BUG_ON(from > PAGE_SIZE);
-	BUG_ON(to > PAGE_SIZE);
+	BUG_ON(to > folio_size(folio));
 	BUG_ON(from > to);
 
 	head = folio_create_buffers(folio, inode, 0);
 	blocksize = head->b_size;
 	bbits = block_size_bits(blocksize);
 
-	block = (sector_t)folio->index << (PAGE_SHIFT - bbits);
+	block = ((loff_t)folio->index * PAGE_SIZE) >> bbits;
 
 	for(bh = head, block_start = 0; bh != head || !block_start;
 	    block++, block_start=block_end, bh = bh->b_this_page) {




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux