On Tue, Jun 29, 2021 at 3:56 PM Liu, Xiaodong <xiaodong.liu@xxxxxxxxx> wrote:
>
> >-----Original Message-----
> >From: Jason Wang <jasowang@xxxxxxxxxx>
> >Sent: Tuesday, June 29, 2021 12:11 PM
> >To: Liu, Xiaodong <xiaodong.liu@xxxxxxxxx>; Xie Yongji
> ><xieyongji@xxxxxxxxxxxxx>; mst@xxxxxxxxxx; stefanha@xxxxxxxxxx;
> >sgarzare@xxxxxxxxxx; parav@xxxxxxxxxx; hch@xxxxxxxxxxxxx;
> >christian.brauner@xxxxxxxxxxxxx; rdunlap@xxxxxxxxxxxxx; willy@xxxxxxxxxxxxx;
> >viro@xxxxxxxxxxxxxxxxxx; axboe@xxxxxxxxx; bcrl@xxxxxxxxx; corbet@xxxxxxx;
> >mika.penttila@xxxxxxxxxxxx; dan.carpenter@xxxxxxxxxx; joro@xxxxxxxxxx;
> >gregkh@xxxxxxxxxxxxxxxxxxx
> >Cc: songmuchun@xxxxxxxxxxxxx; virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx;
> >netdev@xxxxxxxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx;
> >iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> >Subject: Re: [PATCH v8 00/10] Introduce VDUSE - vDPA Device in Userspace
> >
> >
> >On 2021/6/28 1:54 PM, Liu, Xiaodong wrote:
> >>> Several issues:
> >>>
> >>> - VDUSE needs to limit the total size of the bounce buffers (64M, if I
> >>> am not mistaken). Does that work for SPDK?
> >> Yes, Jason. It is enough and works for SPDK.
> >> Since it is a bounce buffer mainly for in-flight IO, a limited size
> >> like 64MB is enough.
> >
> >
> >Ok.
> >
> >
> >>
> >>> - VDUSE can use hugepages, but I'm not sure we can mandate hugepages
> >>> (or we need to introduce new flags to support this)
> >> I share your worry: I'm afraid it is hard for a kernel module to
> >> preallocate hugepages internally.
> >> What I tried is this:
> >> 1. A simple agent daemon (representing one device) preallocates and maps
> >>     dozens of 2MB hugepages (e.g. 64MB in total) for that device.
> >> 2. The daemon passes its mapping address and length and the hugepage fd
> >>     to the kernel module through a newly created ioctl.
> >> 3. The kernel module remaps the hugepages inside the kernel.
> >
> >
> >Such a model should work, but the main "issue" is that it introduces
> >overhead in the case of vhost-vDPA.
> >
> >Note that in the case of vhost-vDPA, we don't use a bounce buffer; the
> >userspace pages are shared directly.
> >
> >And since DMA is not done per page, it prevents us from using tricks
> >like vm_insert_page() in those cases.
> >
>
> Yes, handling the vhost-vDPA case really is a problem.
> But there are already several solutions for serving VMs, like vhost-user
> and vfio-user, so at least SPDK won't serve VMs through VDUSE. If a user
> still wants to do that, the user has to tolerate the introduced overhead.
>
> In other words, a software backend like SPDK will value the virtio
> datapath of VDUSE for serving the local host rather than VMs. That's why I
> also drafted a "virtio-local" to bridge a vhost-user target and the local
> host kernel virtio-blk.
>
> >
> >> 4. The vhost-user target gets the hugepage fd from the kernel module
> >>     in a vhost-user msg through Unix Domain Socket cmsg, and maps it.
> >> The kernel module and the target then map the same hugepage-based
> >> bounce buffer for in-flight IO.
> >>
> >> If VDUSE had an option to map userspace preallocated memory, then
> >> VDUSE should be able to mandate it even if it is hugepage based.
> >>
> >
> >As above, this requires some kind of re-design since VDUSE depends on
> >the model of mmap(MAP_SHARED) instead of umem registering.
>
> Got it, Jason, this may be hard for the current version of VDUSE.
> Maybe we can consider these options later, after VDUSE is merged.
>
> If the VDUSE datapath could be leveraged directly by a vhost-user target,
> its value would be realized immediately.
>

Agreed!

Thanks,
Yongji
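
For readers unfamiliar with the fd passing in steps 2 and 4 of the quoted
proposal, here is a minimal userspace-only sketch of that idea. It is not
VDUSE, SPDK, or vhost-user code and involves no kernel module or ioctl. It
assumes an ordinary memfd (or a temporary file) instead of hugepages, and it
plays both the "daemon" and the "target" role inside one process over a
socketpair. All names (create_bounce_fd, send_fd, recv_fd, demo) and the
2 MB size are hypothetical and chosen only for illustration.

```python
import array
import mmap
import os
import socket
import tempfile

BOUNCE_SIZE = 2 * 1024 * 1024  # hypothetical demo size, not the 64MB limit


def create_bounce_fd(size=BOUNCE_SIZE):
    """Create a shareable memory fd (ordinary pages, NOT hugepages)."""
    if hasattr(os, "memfd_create"):
        fd = os.memfd_create("bounce-demo")
    else:
        # Fallback for platforms without memfd_create: anonymous temp file.
        with tempfile.TemporaryFile() as f:
            fd = os.dup(f.fileno())
    os.ftruncate(fd, size)
    return fd


def send_fd(sock, fd):
    """Send one fd as SCM_RIGHTS ancillary data (cmsg) with a 1-byte payload."""
    fds = array.array("i", [fd])
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock):
    """Receive one fd that was sent with send_fd()."""
    fds = array.array("i")
    _, ancdata, _, _ = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Drop any trailing partial int before decoding.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
            return fds[0]
    raise RuntimeError("no fd received")


def demo():
    """Map the same buffer on both ends and show that writes are shared."""
    daemon_sock, target_sock = socket.socketpair(socket.AF_UNIX,
                                                 socket.SOCK_STREAM)
    fd = create_bounce_fd()
    daemon_map = mmap.mmap(fd, BOUNCE_SIZE, mmap.MAP_SHARED)

    send_fd(daemon_sock, fd)            # "daemon" hands the fd over
    peer_fd = recv_fd(target_sock)      # "target" receives its own fd
    target_map = mmap.mmap(peer_fd, BOUNCE_SIZE, mmap.MAP_SHARED)

    daemon_map[0:5] = b"hello"          # write through one mapping
    seen = bytes(target_map[0:5])       # read through the other

    daemon_map.close()
    target_map.close()
    os.close(fd)
    os.close(peer_fd)
    daemon_sock.close()
    target_sock.close()
    return seen


if __name__ == "__main__":
    print(demo())
```

A real setup along the lines discussed above would need a hugetlbfs-backed
fd and separate processes, and the kernel side would still have to remap the
pages, which is the part Jason notes needs a re-design.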