On Mar 4, 2016, at 9:10 AM, Jens Axboe <axboe@xxxxxx> wrote:
>
> It's been a while since I last posted the write stream ID patchset, but
> here is an updated version.
>
> The original patchset was centered around the current NVMe streams
> proposal, but there were a number of issues with that. It's now in a
> much better state, and hopefully will make it into 1.3 of the spec
> soon.
>
> To quickly re-summarize the intent behind write stream IDs, it's to
> be able to provide a hint to the underlying storage device on what
> writes could feasibly be grouped together. If the device is able to
> group writes of similar life times on media, then we can greatly reduce
> the amount of data that needs to be copied around at garbage collection
> time. This gives us a better write amplification factor, which leads
> to better device life times and better (and more predictable)
> performance at steady state.

What are your thoughts on reserving a small number of the stream ID values
for filesystem metadata (e.g. the first 31 since 0 == unused)? One of the
requests by several people at FAST was to be able to identify filesystem
metadata at the block layer for a variety of reasons (e.g. blktrace, etc).
I believe Ted said that Google is doing something similar for IO analysis.

For example, something like the following:

enum {
	SID_UNSET		= 0,
	SID_SUPERBLOCK		= 1,
	SID_ALLOCATIONGROUP	= 2,
	SID_BLOCK_BITMAP	= 3,
	SID_INODE_BITMAP	= 4,
	SID_INODE		= 5,
	SID_INTERNAL_TREE	= 6,
	SID_DIRECTORY		= 7,
	SID_JOURNAL		= 8,
	SID_EXTENT		= 9,
	SID_XATTR		= 10,
	SID_DATA_FILE_4KB	= 11,
	SID_DATA_FILE_16KB	= 12,
	SID_DATA_FILE_64KB	= 13,
	SID_DATA_FILE_256KB	= 14,
	SID_DATA_FILE_1MB	= 15,
	SID_DATA_FILE_4MB	= 16,
	SID_DATA_FILE_16MB	= 17,
	SID_DATA_FILE_64MB	= 18,
	SID_DATA_FILE_256MB	= 19,
	SID_DATA_FILE_1GB	= 20,
	SID_DATA_FILE_LARGE	= 21,
	SID_DATA_DIRECT		= 22,
	SID_LAST
};

though it would need to be expanded somewhat to include generic metadata
types from other filesystems.
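
As a rough illustration of how a filesystem might pick one of the
SID_DATA_FILE_* values, here is a minimal sketch. It is not part of the
patchset: the helper name sid_for_file_size() is hypothetical, and it
assumes each SID_DATA_FILE_<size> value is meant as an upper bound on the
file size, with each bucket four times larger than the previous one.

```c
#include <stdint.h>

/* Subset of the proposed enum, with the same numeric values. */
enum {
	SID_DATA_FILE_4KB	= 11,
	SID_DATA_FILE_16KB	= 12,
	SID_DATA_FILE_64KB	= 13,
	SID_DATA_FILE_256KB	= 14,
	SID_DATA_FILE_1MB	= 15,
	SID_DATA_FILE_4MB	= 16,
	SID_DATA_FILE_16MB	= 17,
	SID_DATA_FILE_64MB	= 18,
	SID_DATA_FILE_256MB	= 19,
	SID_DATA_FILE_1GB	= 20,
	SID_DATA_FILE_LARGE	= 21,
};

/*
 * Hypothetical helper: map a file size in bytes to a data stream ID.
 * Assumes the bucket name is an inclusive upper bound, so a file of
 * exactly 4096 bytes lands in SID_DATA_FILE_4KB.
 */
static unsigned int sid_for_file_size(uint64_t size)
{
	uint64_t limit = 4096;			/* upper bound of the first bucket */
	unsigned int sid = SID_DATA_FILE_4KB;

	/* Each successive bucket is 4x larger; stop at the catch-all. */
	while (sid < SID_DATA_FILE_LARGE && size > limit) {
		limit <<= 2;
		sid++;
	}
	return sid;
}
```

Under those assumptions, a 5000-byte file would map to SID_DATA_FILE_16KB
and anything over 1GB to SID_DATA_FILE_LARGE. Whether the size should be
the current i_size or an expected final size is left open here.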

Cheers, Andreas

> There's been a number of changes to this patchset since it was last
> posted. In summary:
>
> 1) The bio parts have been bumped to carry 16 bits of stream data, up
>    from 8 and 12 in the original series.
>
> 2) Since the interface grew some more options, I've moved away from
>    fadvise and instead added a new system call. I don't feel strongly
>    about what interface we use here; another option would be to have a
>    (big) set of fcntl() commands instead.
>
> 3) The kernel now manages the ID space, since we have moved to a host
>    assigned model. This is done on a backing_dev_info basis, and the
>    btrfs patch has been updated to show how this can be used for nested
>    devices on btrfs/md/dm/etc. This could be moved to the request queue
>    as well; again, I don't feel too strongly about this specific part.
>
> Those are the big changes.
>
> The patches are against Linus' current -git tip.
>
> --
> Jens Axboe
>

Cheers, Andreas