On 1/19/23 7:23 AM, Ming Lei wrote:
> Hi,
>
> ublk-nbd[1] is available now.
>
> Basically it is an nbd client, but implemented entirely in userspace;
> with the current nbd-client in [2], the transmission phase is done by
> the linux block nbd driver.
>
> The handshake implementation is borrowed from the nbd project[2], so
> ublk-nbd basically just adds new code for implementing the
> transmission phase, and it can be thought of as moving the linux
> block nbd driver into userspace.
>
> The new code is mostly in nbd/tgt_nbd.cpp. IO handling is based on
> liburing[3] and implemented with C++20 coroutines, so everything is
> done in a single pthread, totally lockless. It turned out to be
> pretty easy to design and implement, thanks to the ublk framework,
> C++20 coroutines and liburing.
>
> ublk-nbd supports both tcp and unix sockets, and io_uring send
> zero-copy can be enabled with the command line option '--send_zc';
> see details in the README[4].
>
> No regressions were found in xfstests using ublk-nbd as both the test
> device and the scratch device, and the builtin test (make test T=nbd)
> runs well.
>
> Fio tests ("make test T=nbd") show that ublk-nbd performance is
> basically the same as nbd-client/nbd driver when running fio over a
> real ethernet link (1g, 10+g), but ublk-nbd IOPS is ~40% higher than
> nbd-client (nbd driver) with 512K BS, because the linux nbd driver
> sets max_sectors_kb to 64KB by default.
>
> But when running fio over a local tcp socket, on my test machine
> ublk-nbd performs better than nbd-client/nbd driver, especially with
> 2 queues/2 jobs, where the gap is 10% ~ 30% depending on block size.

This is pretty nice! Just curious, have you tried setting up your ring
with

	p.flags |= IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN;

to see if that yields any extra performance improvement for you?
Depending on how you do your processing, you should not need any
further changes for that to work. A "lighter" version is just setting
IORING_SETUP_COOP_TASKRUN.
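To be concrete, here is a minimal sketch of the kind of setup I mean,
assuming liburing and a recent kernel (SINGLE_ISSUER needs 6.0+,
DEFER_TASKRUN needs 6.1+, COOP_TASKRUN needs 5.19+); setup_ring() is a
hypothetical helper, not something from ublk-nbd:

	#include <liburing.h>
	#include <string.h>

	static int setup_ring(struct io_uring *ring, unsigned entries)
	{
		struct io_uring_params p;
		int ret;

		memset(&p, 0, sizeof(p));
		/*
		 * Only valid if all submissions and completions happen
		 * in one task, which the single-pthread ublk-nbd design
		 * already guarantees.
		 */
		p.flags = IORING_SETUP_SINGLE_ISSUER |
			  IORING_SETUP_DEFER_TASKRUN;
		ret = io_uring_queue_init_params(entries, ring, &p);
		if (ret != -EINVAL)
			return ret;

		/* Older kernel: fall back to the lighter variant. */
		memset(&p, 0, sizeof(p));
		p.flags = IORING_SETUP_COOP_TASKRUN;
		return io_uring_queue_init_params(entries, ring, &p);
	}

With DEFER_TASKRUN, completion task work is only run when the task
itself waits for completions (e.g. io_uring_submit_and_wait()), which
avoids needlessly interrupting the submitting task for each network
completion.

--
Jens Axboe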