On Sun, Jun 09, 2024 at 08:55:06AM -0700, Darrick J. Wong wrote: > HTML version here, text version below. That is so much nicer than all the RST stuff.. > iomap is a filesystem library for handling various filesystem > operations that involves mapping of file's logical offset ranges > to physical extents. This origins of this library is the file I/O > path that XFS once used; it has now been extended to cover several > other operations. The library provides various APIs for > implementing various file and pagecache operations, such as: Does anyone care about the origin? > > * Pagecache reads and writes > > * Folio write faults to the pagecache > > * Writeback of dirty folios > > * Direct I/O reads and writes > > * FIEMAP > > * lseek SEEK_DATA and SEEK_HOLE > > * swapfile activation One useful bit might be that there are two layer in iomap. 1) the very simply underlying layer in iter.c that just provides a nicer iteration over logical file offsets 2) anything built on top. That's the things mentioned above plus DAX. What is also kinda interesting as it keeps confusing people is that nothing in the iterator is block device specific. In fact the DAX code now has no block device dependencies, as does the lseek and FIEMAP code. Because of that it might make sense to split this document up a bit for the different layers and libraries. Or maybe not if too many documents are too confusing. > 2. For each sub-unit of work... > > 1. Revalidate the mapping and go back to (1) above, if > necessary That's something only really done in the buffered write path. > Each iomap operation will be covered in more detail below. This > library was covered previously by an LWN article and a > KernelNewbies page. Maybe these are links in other formats, but if not this information isn't very useful. Depending on how old that information is it probably isn't even with links. > The filesystem returns the mappings via the following structure. > For documentation purposes, the structure has been reordered to > group fields that go together logically. I don't think putting a different layout in here is a good idea. In fact duplicating the definition means it will be out of sync rather sooner than later. Given that we have to deal with RST anyway we might as well want to pull this in as kerneldoc comments. And maybe reorder the actual definition while we're at it, as the version below still packs nicely. > struct block_device *bdev; > struct dax_device *dax_dev; > void *inline_data; Note: The could become a union these days. I tried years ago before fully decoupling the DAX code and that didn't work, but we should be fine now. > * type describes the type of the space mapping: > > * IOMAP_HOLE: No storage has been allocated. This type > must never be returned in response to an IOMAP_WRITE > operation because writes must allocate and map space, > and return the mapping. The addr field must be set to > IOMAP_NULL_ADDR. iomap does not support writing > (whether via pagecache or direct I/O) to a hole. ... These should probably also be kerneldoc comments instead of being away from the definitions? > > * IOMAP_F_XATTR: The mapping is for extended attribute > data, not regular file data. This is only useful for > FIEMAP. .. and only used inside XFS. Maybe we should look into killing it. > These struct kiocb flags are significant for buffered I/O with > iomap: > > * IOCB_NOWAIT: Only proceed with the I/O if mapping data are > already in memory, we do not have to initiate other I/O, and > we acquire all filesystem locks without blocking. Neither > this flag nor its definition RWF_NOWAIT actually define what > this flag means, so this is the best the author could come > up with. I don't think that's true. But if it feels true to you submitting a patch to describe it better is probably more helpful than this. > iomap internally tracks two state bits per fsblock: > > * uptodate: iomap will try to keep folios fully up to date. If > there are read(ahead) errors, those fsblocks will not be > marked uptodate. The folio itself will be marked uptodate > when all fsblocks within the folio are uptodate. > > * dirty: iomap will set the per-block dirty state when > programs write to the file. The folio itself will be marked > dirty when any fsblock within the folio is dirty. > > iomap also tracks the amount of read and write disk IOs that are > in flight. This structure is much lighter weight than struct > buffer_head. Is this really something that should go into an API documentation? Note that the structure not only is lighter weight than a buffer_head, but more importantly there are a lot less of them as there is only one per folio and not one per FSB. > Why Convert to iomap? Make this a separate document?