On 6/2/24 02:51, Matthew Wilcox wrote:
> On Fri, May 31, 2024 at 10:28:50AM +0900, Damien Le Moal wrote:
>>>> This will stop working at some point. It'll return NULL once we get
>>>> to the memdesc future (because the memdesc will be a slab, not a folio).
>>>
>>> Hmmm, xfs_buf.c plays a similar trick here for sub-page buffers. I'm
>>> assuming that will get ported to ... whatever the memdesc future holds?
>
> I don't think it does, exactly? Are you referring to kmem_to_page()?
> That will continue to work. You're not trying to get a folio from a
> slab allocation; that will start to fail.
>
>>>> I think the right way to handle this is to call read_mapping_folio().
>>>> That will allocate a folio in the page cache for you (obeying the
>>>> minimum folio size). Then you can examine the contents. It should
>>>> actually remove code from zonefs. Don't forget to call folio_put()
>>>> when you're done with it (either at unmount or at the end of mount if
>>>> you copy what you need elsewhere).
>>>
>>> The downside of using bd_mapping is that userspace can scribble all over
>>> the folio contents. For zonefs that's less of a big deal because it
>>> only reads it once, but for everyone else (e.g. ext4) it's been a huge
>>
>> Yes, and zonefs super block is read-only, we never update it after formatting.
>>
>>> problem. I guess you could always do max(ZONEFS_SUPER_SIZE,
>>> block_size(sb->s_bdev)) if you don't want to use the pagecache.
>>
>> Good point. ZONEFS_SUPER_SIZE is 4K and given that I only know of 512e and 4K
>> zoned block devices, this is not an issue yet. But better safe than sorry, so
>> doing the max() thing you propose is better. Will patch that.
>
> I think you should use read_mapping_folio() for now instead of
> complicating zonefs. Once there's a grand new buffer cache, switch to
> that, but I don't think you're introducing a significant vulnerability
> by using the block device's page cache.

I was not really thinking about vulnerability here, but rather compatibility
with devices having a block size larger than 4K... But given that these are rare
(at best), a fix for a more intelligent ZONEFS_SUPER_SIZE is not urgent, and not
hard at all anyway.

-- 
Damien Le Moal
Western Digital Research
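
For illustration only, here is a minimal userspace sketch of the max() idea
quoted above: read at least ZONEFS_SUPER_SIZE bytes, or a full block when the
device block size is larger. The helper name super_read_size() is hypothetical
and is not taken from zonefs; in the kernel the block size would come from
block_size(sb->s_bdev), as in the quoted suggestion, and the 4K value for
ZONEFS_SUPER_SIZE is the one stated in the thread.

```c
#include <stddef.h>

/* Super block size as stated in the thread (4K). */
#define ZONEFS_SUPER_SIZE 4096u

/*
 * Hypothetical helper (assumption, not zonefs code): number of bytes to
 * read for the super block, i.e. max(ZONEFS_SUPER_SIZE, block size).
 */
static size_t super_read_size(size_t block_size)
{
	return block_size > ZONEFS_SUPER_SIZE ? block_size : ZONEFS_SUPER_SIZE;
}
```

With this sketch, 512-byte and 4K block sizes both yield a 4096-byte read,
while a hypothetical 16K block size would yield a 16384-byte read.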