On Thu, May 23, 2024 at 02:41:51AM +0100, Matthew Wilcox wrote: > On Tue, May 14, 2024 at 05:22:08PM +0200, Johannes Thumshirn wrote: > > From: Johannes Thumshirn <johannes.thumshirn@xxxxxxx> > > > > Move reading of the on-disk superblock from page to kmalloc()ed memory. > > No, this is wrong. > > > + super = kzalloc(ZONEFS_SUPER_SIZE, GFP_KERNEL); > > + if (!super) > > return -ENOMEM; > > > > + folio = virt_to_folio(super); > > This will stop working at some point. It'll return NULL once we get > to the memdesc future (because the memdesc will be a slab, not a folio). Hmmm, xfs_buf.c plays a similar trick here for sub-page buffers. I'm assuming that will get ported to ... whatever the memdesc future holds? > > bio_init(&bio, sb->s_bdev, &bio_vec, 1, REQ_OP_READ); > > bio.bi_iter.bi_sector = 0; > > - __bio_add_page(&bio, page, PAGE_SIZE, 0); > > + bio_add_folio_nofail(&bio, folio, ZONEFS_SUPER_SIZE, > > + offset_in_folio(folio, super)); > > It also doesn't solve the problem of trying to read 4KiB from a device > with 16KiB sectors. We'll have to fail the bio because there isn't > enough memory in the bio to store one block. > > I think the right way to handle this is to call read_mapping_folio(). > That will allocate a folio in the page cache for you (obeying the > minimum folio size). Then you can examine the contents. It should > actually remove code from zonefs. Don't forget to call folio_put() > when you're done with it (either at unmount or at the end of mount if > you copy what you need elsewhere). The downside of using bd_mapping is that userspace can scribble all over the folio contents. For zonefs that's less of a big deal because it only reads it once, but for everyone else (e.g. ext4) it's been a huge problem. I guess you could always do max(ZONEFS_SUPER_SIZE, block_size(sb->s_bdev)) if you don't want to use the pagecache. <more mumbling about buffer caches> --D