Re: [PATCH 0/11] Update version of write stream ID patchset

On 03/04/2016 03:03 PM, Jeff Moyer wrote:
Jens Axboe <axboe@xxxxxx> writes:

On 03/04/2016 02:01 PM, Jeff Moyer wrote:
OK.  I'm still of the opinion that we should try to make this
transparent.  I could be swayed by workload descriptions and numbers
comparing approaches, though.

You can't just wave that flag and not have a solution. Any solution
in that space would imply having policy in the kernel. A "just use a
stream per file" is never going to work.

Jens, I'm obviously missing a lot of the background information, here.
I want to stress that I'm not against your patches. I'm just trying to
understand if there's a sensible way to use the write stream support in
the kernel so that applications don't /have/ to be converted.  It sounds
like that's hard, and without any specs or hardware, I'm not going to be
able to even try to come up with solutions to that problem.

It's not hard to update an application to do this. As an example, one thing I tried was converting RocksDB to use streams. I used a naive approach, simply mapping each compaction level to a specific stream, and got about a 30% reduction in write amplification (WA) just through that. The guys from Samsung have done that with RocksDB as well, just a bit more involved, and got better results. The application change was really no more involved than calling fadvise() on the fd after opening it. That is it. I don't know why you think that is hard.
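
To make the naive mapping concrete, here is a minimal sketch of the level-to-stream assignment. Everything in it is hypothetical: the function names, the convention that stream ID 0 means "no stream", and the choice to fold deeper levels into the last stream when the device offers fewer streams than there are levels. It only computes the IDs; the call that actually tags the fd is deliberately left out.

```python
# Hypothetical sketch: names and the ID convention are illustrative only,
# they are not taken from RocksDB or from the patchset.
NO_STREAM = 0  # assumption: 0 means "no stream assigned"


def stream_for_level(level, num_streams):
    """Map a compaction level to a stream ID in 1..num_streams.

    Levels beyond what the device supports share the last stream.
    """
    if level < 0:
        raise ValueError("compaction level must be non-negative")
    if num_streams < 1:
        return NO_STREAM  # device exposes no streams, write untagged
    return min(level, num_streams - 1) + 1


def assign_streams(file_levels, num_streams):
    """Return the stream ID each file would be tagged with after open()."""
    return {
        name: stream_for_level(level, num_streams)
        for name, level in file_levels.items()
    }
```

With four streams available, levels 0 through 2 each get their own stream and everything from level 3 down shares stream 4. The application would then pass that ID along when it tags the freshly opened fd.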

As to doing this automagically, you'll need knowledge that you do not have. The kernel or file system has no idea whether data written to file X and file Y have similar lifetimes. You could start tracking that, of course, but that would make you very unhappy. If I'm an application storing files, I have a much better idea of what is related time-wise.

And you don't really need a spec to understand how this works. The spec will just tell you the mechanics of how we pass this information to the device, how we find out what the device can support, etc. The basic gist of it is that we can write data with similar lifetimes to the same place on media. For a flash disk, that would be the same erase block (EB).
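
A toy model shows why grouping by lifetime matters. It is a sketch under simplifying assumptions, not a description of any real FTL: an erase block is a list of four pages, a block can only be reclaimed whole, and reclaiming it means relocating every page in it that is still live. The page names and the gc_copies() helper are made up for illustration.

```python
def gc_copies(blocks, dead):
    """Count live pages that must be relocated to reclaim every
    erase block that contains at least one dead page."""
    copies = 0
    for block in blocks:
        if any(page in dead for page in block):
            copies += sum(1 for page in block if page not in dead)
    return copies


# Four short-lived ("hot") pages and four long-lived ("cold") pages.
hot = ["h0", "h1", "h2", "h3"]
cold = ["c0", "c1", "c2", "c3"]

# Without streams the device interleaves writes as they arrive.
mixed = [["h0", "c0", "h1", "c1"], ["h2", "c2", "h3", "c3"]]
# With one stream per lifetime, each class lands in its own erase block.
separated = [hot, cold]

dead = set(hot)  # the short-lived data has been overwritten or deleted

mixed_copies = gc_copies(mixed, dead)
separated_copies = gc_copies(separated, dead)
```

In the mixed layout both blocks hold dead pages, so the four cold pages have to be copied before the space comes back. In the separated layout the hot block is entirely dead and can be erased with no copies at all. Those extra copies are the write amplification.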

I think it
would make for interesting research, though.  I recall a paper from one
of the USENIX conferences that dealt with automatically identifying
write streams on a network storage server, but alas, I can't find the
reference right now.

Samsung released a paper on RocksDB and streams, iirc.


--
Jens Axboe
