On Thu, Oct 17, 2024 at 09:15:21AM -0700, Bart Van Assche wrote:
> On 10/17/24 8:44 AM, Keith Busch wrote:
> > On Thu, Oct 17, 2024 at 05:23:37PM +0200, Christoph Hellwig wrote:
> > > If you want to do useful stream separation you need to write data
> > > sequentially into the stream. Now with streams or FDP that does not
> > > actually imply sequentially in LBA space, but if you want the file
> > > system to not actually deal with fragmentation from hell, and be
> > > easily track what is grouped together you really want it sequentially
> > > in the LBA space as well. In other words, any kind of write placement
> > > needs to be intimately tied to the file system block allocator.
> >
> > I'm replying just to make sure I understand what you're saying:
> >
> > If we send per IO hints on a file, we could have interleaved hot and
> > cold pages at various offsets of that file, so the filesystem needs an
> > efficient way to allocate extents and track these so that it doesn't
> > interleave these in LBA space. I think that makes sense.
> >
> > We can add a fop_flags and block/fops.c can be the first one to turn it
> > on since that LBA access is entirely user driven.
>
> Does anyone care about buffered I/O to block devices? When using
> buffered I/O, the write_hint information from the inode is used and the per
> I/O write_hint information is ignored.

I'm pretty sure there are applications that use buffered IO on raw block
(ex: postgresql), but it's a moot point: the block file_operations that
provide the fop_flags also provide the callbacks for O_DIRECT, which is
where this matters.

We can't really use per-io write_hints on buffered-io. At least not yet,
and maybe never. I'm not sure if it makes sense for raw block because
the page writes won't necessarily match writes to storage.
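
To make the rule above concrete, here is a minimal userspace model of it.
This is a sketch, not kernel code: the flag name MODEL_FOP_PER_IO_HINTS,
struct model_file and effective_hint() are all hypothetical, invented for
illustration. The only assumption is the behaviour described in this thread:
buffered writes carry the inode's hint, and a per-IO hint is honoured only
for O_DIRECT on a file whose file_operations opt in via fop_flags.

```c
/* Userspace toy model of write-hint selection. NOT kernel code;
 * every identifier below is hypothetical. */
#include <assert.h>
#include <stdbool.h>

enum hint { HINT_NOT_SET = 0, HINT_SHORT, HINT_MEDIUM, HINT_LONG };

/* Hypothetical opt-in bit a file_operations could advertise. */
#define MODEL_FOP_PER_IO_HINTS (1u << 0)

struct model_file {
	unsigned int fop_flags;  /* capabilities advertised by the fops */
	enum hint inode_hint;    /* per-inode hint (the existing mechanism) */
	bool o_direct;           /* file opened with O_DIRECT? */
};

/* Pick the hint that would travel with a single write. */
enum hint effective_hint(const struct model_file *f, enum hint per_io)
{
	/* Buffered I/O: page writeback is decoupled from the submitting
	 * write, so only the inode hint is usable. */
	if (!f->o_direct)
		return f->inode_hint;

	/* Direct I/O: honour the per-IO hint only if the fops opted in
	 * and the caller actually supplied one. */
	if ((f->fop_flags & MODEL_FOP_PER_IO_HINTS) && per_io != HINT_NOT_SET)
		return per_io;

	/* Otherwise fall back to the inode hint. */
	return f->inode_hint;
}

/* Small self-check covering the three cases discussed in the thread. */
void model_selfcheck(void)
{
	struct model_file buffered = { MODEL_FOP_PER_IO_HINTS, HINT_LONG, false };
	struct model_file direct   = { MODEL_FOP_PER_IO_HINTS, HINT_LONG, true };
	struct model_file no_optin = { 0, HINT_LONG, true };

	assert(effective_hint(&buffered, HINT_SHORT) == HINT_LONG);
	assert(effective_hint(&direct, HINT_SHORT) == HINT_SHORT);
	assert(effective_hint(&no_optin, HINT_SHORT) == HINT_LONG);
}
```

In this model the buffered case ignores the per-IO hint entirely, which
matches the observation quoted above; whether a real filesystem should ever
opt in depends on its block allocator, as discussed earlier in the thread.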