Re: [PATCH] zonefs: move super block reading from page to folio

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Sat, 1 Jun 2024 18:51:45 +0100

On Fri, May 31, 2024 at 10:28:50AM +0900, Damien Le Moal wrote:
> >> This will stop working at some point.  It'll return NULL once we get
> >> to the memdesc future (because the memdesc will be a slab, not a folio).
> > 
> > Hmmm, xfs_buf.c plays a similar trick here for sub-page buffers.  I'm
> > assuming that will get ported to ... whatever the memdesc future holds?

I don't think it does, exactly?  Are you referring to kmem_to_page()?
That will continue to work, because it isn't trying to get a folio from
a slab allocation; it's the slab-to-folio conversion that will start to
fail.

> >> I think the right way to handle this is to call read_mapping_folio().
> >> That will allocate a folio in the page cache for you (obeying the
> >> minimum folio size).  Then you can examine the contents.  It should
> >> actually remove code from zonefs.  Don't forget to call folio_put()
> >> when you're done with it (either at unmount or at the end of mount if
> >> you copy what you need elsewhere).
> > 
> > The downside of using bd_mapping is that userspace can scribble all over
> > the folio contents.  For zonefs that's less of a big deal because it
> > only reads it once, but for everyone else (e.g. ext4) it's been a huge
> 
> Yes, and zonefs super block is read-only, we never update it after formatting.
> 
> > problem.  I guess you could always do max(ZONEFS_SUPER_SIZE,
> > block_size(sb->s_bdev)) if you don't want to use the pagecache.
> 
> Good point. ZONEFS_SUPER_SIZE is 4K and given that I only know of 512e and 4K
> zoned block devices, this is not an issue yet. But better safe than sorry, so
> doing the max() thing you propose is better. Will patch that.
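
For reference, the max() fallback quoted above comes down to the
following arithmetic.  This is only a userspace sketch, not zonefs code:
the helper name zonefs_sb_read_size() is made up for illustration, the
4K value for ZONEFS_SUPER_SIZE is taken from the statement above, and
the block size is passed in as a plain integer rather than read from
block_size(sb->s_bdev).

```c
#include <assert.h>

/* 4K, as stated in the thread; redefined here only for the sketch. */
#define ZONEFS_SUPER_SIZE 4096u

/*
 * Hypothetical helper: userspace model of
 * max(ZONEFS_SUPER_SIZE, block_size(sb->s_bdev)).
 * Returns the number of bytes to allocate and read for the super block
 * so that the read never covers less than one logical block.
 */
static inline unsigned int zonefs_sb_read_size(unsigned int bdev_block_size)
{
	/* A block device never reports a zero block size. */
	assert(bdev_block_size != 0);

	return bdev_block_size > ZONEFS_SUPER_SIZE ?
		bdev_block_size : ZONEFS_SUPER_SIZE;
}
```

With the 512e and 4K devices mentioned above the result stays at 4K; it
would only grow on a device whose block size exceeds ZONEFS_SUPER_SIZE.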

I think you should use read_mapping_folio() for now instead of
complicating zonefs.  Once there's a grand new buffer cache, switch to
that, but I don't think you're introducing a significant vulnerability
by using the block device's page cache.