On Wed, Nov 20, 2024 at 09:21:58AM -0800, Darrick J. Wong wrote:
>
> How do filesystem users pick a write stream? I get a pretty strong
> sense that you're aiming to provide the ability for application software
> to group together a bunch of (potentially arbitrary) files in a cohort.
> Then (maybe?) you can say "This cohort of files are all expected to have
> data blocks related to each other in some fashion, so put them together
> so that the storage doesn't have to work so hard".
>
> Part of my comprehension problem here (and probably why few fs people
> commented on this thread) is that I have no idea what FDP is, or what
> the write lifetime hints in scsi were/are, or what the current "hinting"
> scheme is.

FDP is just the "new" version of NVMe's streams. Support for its
predecessor was added in commit f5d118406247acf ("nvme: add support for
streams")

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f5d118406247acfc4fc481e441e01ea4d6318fdc

Various applications were written to that interface and showed initial
promise, but production quality hardware never materialized. Some of
these applications are still setting the write hints today, and the
filesystems are all passing them through the block stack, but there's
currently no nvme driver listening on the other side.

In contrast to the older nvme streams, capable hardware supporting the
newer FDP scheme has been developed, so people want to run those same
applications with those same hints, used in exactly the way they were
originally designed. Enabling them could just be a simple driver patch
like the above without bothering the filesystem people :)

> Is this what we're arguing about?
>
> enum rw_hint {
> 	WRITE_LIFE_NOT_SET	= RWH_WRITE_LIFE_NOT_SET,
> 	WRITE_LIFE_NONE		= RWH_WRITE_LIFE_NONE,
> 	WRITE_LIFE_SHORT	= RWH_WRITE_LIFE_SHORT,
> 	WRITE_LIFE_MEDIUM	= RWH_WRITE_LIFE_MEDIUM,
> 	WRITE_LIFE_LONG		= RWH_WRITE_LIFE_LONG,
> 	WRITE_LIFE_EXTREME	= RWH_WRITE_LIFE_EXTREME,
> } __packed;
>
> (What happens if you have two disjoint sets of files, both of which are
> MEDIUM, but they shouldn't be intertwined?)

It's not going to perform as well. You'd be advised against
oversubscribing a hint value among applications with different relative
expectations, but it generally (though not always) should be no worse
than if you hadn't given any hints at all.

> Or are these new fdp hint things an overload of the existing write hint
> fields in the iocb/inode/bio? With a totally different meaning from
> anticipated lifetime of the data blocks?

The meaning assigned to an FDP stream is whatever the user wants it to
mean. It's not strictly a lifetime hint, but that is certainly a valid
way to use them. The contract on the device's side is that writes to
one stream won't create media interference or contention with writes to
other streams. This is the same as nvme's original streams, which for
some reason did not carry any of this controversy.
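
For anyone unfamiliar with how applications set these hints today, here
is a minimal sketch, assuming the fcntl(2) F_SET_RW_HINT and
F_GET_RW_HINT commands that carry the RWH_WRITE_LIFE_* values quoted
above. The helper names set_write_hint() and get_write_hint() are
hypothetical and only for illustration, and the fallback defines are an
assumption for libc headers that predate these constants.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* glibc exposes the RW hint constants under this */
#endif
#include <fcntl.h>
#include <stdint.h>

/*
 * Assumed fallbacks for older libc headers; the values mirror
 * include/uapi/linux/fcntl.h (F_LINUX_SPECIFIC_BASE is 1024).
 */
#ifndef F_SET_RW_HINT
#define F_GET_RW_HINT	1035	/* F_LINUX_SPECIFIC_BASE + 11 */
#define F_SET_RW_HINT	1036	/* F_LINUX_SPECIFIC_BASE + 12 */
#endif
#ifndef RWH_WRITE_LIFE_NOT_SET
#define RWH_WRITE_LIFE_NOT_SET	0
#define RWH_WRITE_LIFE_NONE	1
#define RWH_WRITE_LIFE_SHORT	2
#define RWH_WRITE_LIFE_MEDIUM	3
#define RWH_WRITE_LIFE_LONG	4
#define RWH_WRITE_LIFE_EXTREME	5
#endif

/*
 * Hypothetical helper: tag the inode behind fd with a write hint.
 * The kernel expects a pointer to a 64-bit value.
 * Returns 0 on success, -1 with errno set on failure.
 */
int set_write_hint(int fd, uint64_t hint)
{
	return fcntl(fd, F_SET_RW_HINT, &hint);
}

/*
 * Hypothetical helper: read back the hint currently set on the inode.
 * Returns 0 on success and stores the value in *hint, -1 on failure.
 */
int get_write_hint(int fd, uint64_t *hint)
{
	return fcntl(fd, F_GET_RW_HINT, hint);
}
```

An application would call something like set_write_hint(fd,
RWH_WRITE_LIFE_SHORT) on a file it expects to overwrite soon. Whether
anything below the filesystem acts on the value depends on the driver,
which is the point made above.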