On Tue, Jul 09, 2024 at 09:02:25AM -0700, James Bottomley wrote:

> For NVMe and net we do have SPDK and DPDK. What I find is that people
> tend to use them for niche use cases (like the NVMe KV command set) or
> obscure network routers. Even though the claim they both make is to
> get the kernel out of the way and do stuff "way faster" the difficulty
> they create by bypassing everything is quite a high burden.

[..]

> What all of the prior pass through's taught us is that if the use case
> is big enough it will get pulled into the kernel and the kernel will
> usually manage it better (DB users). If it remains a niche use case it
> will likely remain out of the kernel, but we won't be hurt by it (NVME
> KV protocol) and sometimes it doesn't really matter and the device
> manufacturers will sort it out on their own (USB tokens).

I don't see it as being linked to a big enough use case at all. The
kernel gets involved if there are good technical reasons to do so.
Databases running over real filesystems with O_DIRECT are technically
better than databases running on raw block devices. DPDK shows the
opposite: there, userspace is the technically better option. This is
now shown at scale.

DPDK is not some niche. A big chunk of internet traffic goes through
DPDK, especially for mobile. Many ORAN solutions include DPDK on
Linux.

What has improved on the kernel side is the integration. DPDK
deployments now often use RDMA raw queue pairs instead of VFIO, which
largely eliminates the "high burden".

There are many other cases, like DPDK, where the right answer is to
reduce the kernel involvement. It is not so simple that things always
get pulled into the kernel.

Jason