On 2/6/23 02:00, Hans Holmberg wrote:
I think we're missing a flexible way of routing random-ish
write workloads onto zoned storage devices. Implementing a UBLK
target for this would be a great way to provide zoned storage
benefits to a range of use cases. Creating a UBLK target would
enable us to experiment and move fast, and when we arrive
at a common, reasonably stable, solution we could move this into
the kernel.
We do have dm-zoned [3] in the kernel, but it requires a bounce
on conventional zones for non-sequential writes, resulting in a write
amplification of 2x (which is not optimal for flash).
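As a rough illustration of where the 2x comes from, here is a minimal sketch under a simplified model: every bounced byte is written once to a conventional zone and once more when it is later moved to a sequential zone. The function name and the bounced_fraction parameter are hypothetical and only serve the example; this is not how dm-zoned accounts for writes internally.

```python
def write_amplification(host_bytes: int, bounced_fraction: float) -> float:
    """Device writes divided by host writes under a simple bounce model.

    Assumption: each bounced byte is written twice (conventional zone first,
    sequential zone later); every other byte is written exactly once.
    """
    if host_bytes <= 0:
        raise ValueError("host_bytes must be positive")
    if not 0.0 <= bounced_fraction <= 1.0:
        raise ValueError("bounced_fraction must be within [0, 1]")
    # One write per host byte, plus one extra write per bounced byte.
    device_bytes = host_bytes + host_bytes * bounced_fraction
    return device_bytes / host_bytes


if __name__ == "__main__":
    # Fully non-sequential workload: every write is bounced.
    print(write_amplification(1 << 30, 1.0))
    # Fully sequential workload: nothing is bounced.
    print(write_amplification(1 << 30, 0.0))
```

In this model the amplification grows linearly with the share of non-sequential writes, which is why avoiding the bounce matters most for the workloads between the two extremes.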
Fully random workloads make little sense to store on ZBDs as a
host FTL could not be expected to do better than what conventional block
devices do today. Fully sequential writes are also well taken care of
by conventional block devices.
The interesting stuff is what lies in between those extremes.
I would like to discuss how we could use UBLK to implement a
common FTL with the right knobs to cater for a wide range of workloads
that utilize raw block devices. We had some knobs in the now-dead pblk,
an FTL for open channel devices, but I think we could do way better than that.
Pblk did not require bouncing writes and had knobs for over-provisioning and
workload isolation, which could be implemented here as well. We could also add options
for different garbage collection policies. In userspace it would also
be easy to support default block indirection sizes, reducing logical-physical
translation table memory overhead.
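To make the memory argument concrete, here is a back-of-the-envelope sketch. It assumes a flat logical-to-physical table with one fixed-size entry per indirection unit; the 4-byte entry size and the function name are assumptions chosen for illustration, not taken from pblk or any existing FTL.

```python
def l2p_table_bytes(capacity_bytes: int, indirection_bytes: int,
                    entry_bytes: int = 4) -> int:
    """Memory needed for a flat L2P table (one entry per indirection unit).

    Assumption: fixed-size entries and no compression or caching of the map.
    """
    if capacity_bytes <= 0 or indirection_bytes <= 0 or entry_bytes <= 0:
        raise ValueError("all sizes must be positive")
    # Round up so a partially used last unit still gets an entry.
    entries = -(-capacity_bytes // indirection_bytes)
    return entries * entry_bytes


if __name__ == "__main__":
    one_tib = 1 << 40
    for unit_kib in (4, 16, 64):
        table = l2p_table_bytes(one_tib, unit_kib << 10)
        print(f"{unit_kib:>2} KiB indirection: {table >> 20} MiB of L2P table")
```

Under these assumptions a 1 TiB device needs 1024 MiB of table at 4 KiB indirection and 64 MiB at 64 KiB. The trade-off is that writes smaller than the indirection unit then need read-modify-write handling.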
Use cases for such an FTL include SSD caching stores such as Apache
Traffic Server [1] and CacheLib [2]. CacheLib's block cache and the Apache
Traffic Server storage workloads are *almost* zoned block device compatible
and would need little translation overhead to perform very well on e.g.
ZNS SSDs.
There are probably more use cases that would benefit.
It would also be a great research vehicle for academia. We've used dm-zap [4]
for this purpose the last couple of years, but that is not production-ready
and cumbersome to improve and maintain as it is implemented as an out-of-tree
device mapper.
ublk adds a bit of latency overhead, but I think this is acceptable at least
until we have a great, proven solution, which could be turned into
an in-kernel FTL.
If there is interest in the community for a project like this, let's talk!
cc:ing the folks who participated in the discussions at ALPSS 2021 and last
year's Plumbers on this subject.
Thanks,
Hans
[1] https://trafficserver.apache.org/
[2] https://cachelib.org/
[3] https://docs.kernel.org/admin-guide/device-mapper/dm-zoned.html
[4] https://github.com/westerndigitalcorporation/dm-zap
Hi Hans,
Which functionality would such a user space target provide that is not
yet provided by BTRFS, F2FS or any other log-structured filesystem that
supports zoned block devices?
Thanks,
Bart.