Jeff,

On 2019/07/18 23:11, Jeff Moyer wrote:
> Hi, Damien,
>
> Did you consider creating a shared library?  I bet that would also
> ease application adoption for the use cases you're interested in, and
> would have similar performance.
>
> -Jeff

Yes, it would, but to a lesser extent since system calls would need to
be replaced with library calls. Earlier work on LevelDB by Ting used the
library approach with libzbc, not quite a "libzonefs" but close enough.
Working with LevelDB code gave me the idea for zonefs. Compared to a
library, the added benefit is that language-specific bindings are not a
concern, which further simplifies the code changes needed to support
zoned block devices. In the case of LevelDB for instance, C++ is used
and file accesses use streams, which makes using a library a little
difficult and necessitates more changes just for the internal
application API itself. The needed changes spread beyond the device
access API.

This is, I think, the main advantage of this simple in-kernel FS over a
library: the developer can focus on zoned block device specific needs
(sequential write pattern and garbage collection) and forget about the
device access parts, as the standard system call API can be used.

Another approach I considered is using FUSE, but I went for a regular
(albeit simple) in-kernel approach due to performance concerns. While
any difference in performance for SMR HDDs would probably not be
noticeable, FUSE performance would likely be lower than that of the
in-kernel approach for upcoming NVMe zoned namespace devices.

But granted, most of the arguments I can put forward for an in-kernel FS
solution vs a user shared library solution are subjective. I think
though that having support directly provided by the kernel brings zoned
block devices into the "mainstream storage options" rather than having
them perceived as fringe solutions that need additional libraries to
work correctly.
Zoned block devices are not going away and may in fact become more
mainstream as higher capacities depend more and more on the sequential
write interface.

Best regards.

-- 
Damien Le Moal
Western Digital Research