On 10:47 07/10, Darrick J. Wong wrote: > On Sat, Oct 05, 2024 at 03:30:53AM +0100, Matthew Wilcox wrote: > > On Fri, Oct 04, 2024 at 04:04:33PM -0400, Goldwyn Rodrigues wrote: > > > Introduce read_inline() function hook for reading inline extents. This > > > is performed for filesystems such as btrfs which may compress the data > > > in the inline extents. > > Why don't you set iomap->inline_data to the uncompressed buffer, let > readahead copy it to the pagecache, and free it in ->iomap_end? This will increase the number of copies. BTRFS uncompresses directly into pagecache. Yes, this is an option but at the cost of efficiency. > > > This feels like an attempt to work around "iomap doesn't support > > compressed extents" by keeping the decompression in the filesystem, > > instead of extending iomap to support compressed extents itself. > > I'd certainly prefer iomap to support compressed extents, but maybe I'm > > in a minority here. > > I'm not an expert on fs compression, but I get the impression that most > filesystems handle reads by allocating some folios, reading off the disk > into those folios, and decompressing into the pagecache folios. It > might be kind of amusing to try to hoist that into the vfs/iomap at some > point, but I think the next problem you'd run into is that fscrypt has > similar requirements, since it's also a data transformation step. > fsverity I think is less complicated because it only needs to read the > pagecache contents at the very end to check it against the merkle tree. > > That, I think, is why this btrfs iomap port want to override submit_bio, > right? So that it can allocate a separate set of folios, create a > second bio around that, submit the second bio, decode what it read off > the disk into the folios of the first bio, then "complete" the first bio > so that iomap only has to update the pagecache state and doesn't have to > know about the encoding magic? Yes, but that is not the only reason. BTRFS also calculates and checks block checksums for data read during bio completions. > > And then, having established that beachhead, porting btrfs fscrypt is > a simple matter of adding more transformation steps to the ioend > processing of the second bio (aka the one that actually calls the disk), > right? And I think all that processing stuff is more or less already in > place for the existing read path, so it should be trivial (ha!) to > call it in an iomap context instead of straight from btrfs. > iomap_folio_state notwithstanding, of course. > > Hmm. I'll have to give some thought to what would the ideal iomap data > transformation pipeline look like? The order of transformation would make all the difference, and I am not sure if everyone involved can come to a conclusion that all transformations should be done in a particular decided order. FWIW, checksums are performed on what is read/written on disk. So for writes, compression happens before checksums. > > Though I already have a sneaking suspicion that will morph into "If I > wanted to add {crypt,verity,compression} to xfs how would I do that?" -> > "How would I design a pipeine to handle all three to avoid bouncing > pages between workqueue threads like ext4 does?" -> "Oh gosh now I have > a totally different design than any of the existing implementations." -> > "Well, crumbs. :(" > > I'll start that by asking: Hey btrfs developers, what do you like and > hate about the current way that btrfs handles fscrypt, compression, and > fsverity? Assuming that you can set all three on a file, right? -- Goldwyn