On Mon, Dec 23, 2024 at 5:59 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
>
> On Fri, Dec 20, 2024 at 11:49 PM Timos Ampelikiotis
> <t.ampelikiotis@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Dec 5, 2024 at 10:17 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>
> >> Adding Eugenio and YongJi.
> >>
> >> On Wed, Dec 4, 2024 at 11:38 PM <t.ampelikiotis@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >> >
> >> > From: Timos Ampelikiotis <t.ampelikiotis@xxxxxxxxxxxxxxxxxxxxxx>
> >> >
> >> > We would like to share with you an RFC for the Virtio-loopback
> >> > technology which we have been working on at Virtual Open Systems in
> >> > the context of the Automotive Grade Linux community (Software defined
> >> > Vehicles expert group)
> >> >
> >> > We previously presented this activity (see [1]) and now we come back
> >> > to you with the latest developments and updates.
> >> >
> >> > We believe that the technology is more mature today and we would like
> >> > to assess the community interest in the technology itself and in
> >> > merging the code.
> >> >
> >> > Below we provide a brief description of the technology, recent
> >> > updates and a short comparison with vDUSE that might be seen as a
> >> > similar technology.
> >> >
> >> > 1. Overview:
> >> > -------
> >> >
> >> > Virtio-loopback is a hardware abstraction layer (HAL) designed for
> >> > non-virtualized environments based on virtio. The main objective is
> >> > to enable applications to communicate with vhost-user devices in a
> >> > non-virtualized environment.
> >> >
> >> > In more detail, the Virtio-loopback architecture consists of a new
> >> > transport (Virtio-loopback), a user-space application (Adapter) and
> >> > the vhost-user devices.
> >> >
> >> > The data path has been implemented using the "zero-copy" principle,
> >>
> >> This needs more clarification: for example, how could we prevent a
> >> malicious userspace device from modifying kernel pages etc.? Especially
> >> considering that not all buffers occupy full pages.
> >>
> >> Actually, after chatting with Yong Ji, I've played with a zerocopy POC
> >> for VDUSE, but it tries to do zerocopy only for page aligned buffers:
> >>
> >> https://github.com/jasowang/net/tree/vduse-zerocopy
> >>
> >> The idea is to map the page directly to the userspace if the buffer
> >> occupies a full page, and userspace is expected to recycle the page via
> >> MADV_DONTNEED. It's far from mature but it can demonstrate the idea
> >> somehow.
> >
> > I had a quick look into the POC and it seems like it improves
> > performance (by reducing the copies) without compromising the security.
>
> Note that it has something left: e.g. what if userspace doesn't
> "recycle" pages via MADV_DONTNEED etc.
>
> >
> > In general, it is similar to what we propose (though we focused on
> > performance mainly). A brief introduction of how data are shared with
> > the vhost-user devices without copies follows:
> >
> > In virtio-loopback, a page handler is assigned to the kernel memory
> > when a vhost-user device tries to mmap a new memory region. Any time
> > the vhost-user device then tries to actually access a page inside
> > that memory region, a page-fault is generated and the driver decides
> > which is the corresponding page to be inserted to the vhost-user
> > process.
>
> Note that the page fault is really expensive, that's why in my POC I
> hacked the VDUSE to map the pages to avoid #PF. But it needs a concept
> like "owner" so VDUSE can grab the mm_struct of the owner process to
> do that.
>
> > Before sharing the page, we have intentionally left some
> > space for security checks to be implemented in the future.
> > At that point we can check if the page requested is related to the
> > corresponding device and if not then do not share it with the user-space.
> >
> > As correctly said, that solution does not address the case of buffers
> > being smaller than a page, and indeed, there is the chance for malicious
> > applications to take advantage of the data padding the page after the buffer.
> >
> > Solving that issue would lead us to the solution you shared above. More
> > specifically creating a bouncing buffer approach for those cases. I didn't
> > have the chance to benchmark the above POC yet, but I believe that it should
> > introduce a performance advantage over the upstream vDUSE solution.
>
> I ran some tests through qsd. I can get at most 40% improvement when
> I'm using 128K write etc.
>
> But when I was doing the POC with OVS-DPDK, I didn't get good performance
> as it involved too many madvise() calls per packet. I'm trying to seek a
> good way to reduce the calls to madvise.
>
> >
> > I understand that security is a very important point here.
> > So my first objective for the virtio-loopback future development
> > would be to address all the security concerns about the
> > technology.
> >
> > Before that, I would like to assess your interest in the technology
> > and understand:
> > a) if the community would be interested to merge a new virtio-transport
> > which does something similar with vDUSE but does not depend on vDPA?
>
> If there's an advantage of bypassing vDPA, I would like to know. Since
> I'm seeking a way to accelerate VDUSE with zerocopy and the work was
> kind of duplicated.

Performance-wise, it might not bring any additional benefit. However,
since it is a simpler, more lightweight architecture with fewer
components and dependencies, it might also be interesting to have it
co-exist with vDUSE, serving as a virtio HAL for non-virtualized
environments.

>
> > b) if it would be interesting to make the adapter compatible with vDUSE
> > in order to add support for more devices in the current vDUSE
> > implementation?
>
> That would be welcomed.

Nice! I look forward to seeing the vduse-agent when it is available.
In the meantime, feel free to have a look at the virtio-loopback adapter
(user-space counterpart), available here:

https://gerrit.automotivelinux.org/gerrit/gitweb?p=src/virtio/virtio-loopback-adapter.git;a=commit;h=269f019fde391bffdfbd42dee45c0cc8721e8f4f

Thanks,
Timos
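
The full-page rule discussed in this thread can be illustrated with a minimal sketch: only the pages that a buffer covers completely are candidates for direct mapping, while partial head and tail pieces would take a bounce (copy) path, so that data padding a shared page is never exposed. The function name `split_for_zerocopy`, the "map"/"bounce" labels and the 4 KiB page size are assumptions made for illustration; this is not code from virtio-loopback or from the VDUSE POC.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def split_for_zerocopy(addr, length, page_size=PAGE_SIZE):
    """Split the byte range [addr, addr + length) into segments.

    Returns a list of (kind, start, size) tuples, where kind is
    "map" for a run of fully covered pages (zero-copy candidates)
    and "bounce" for partial pages that would have to be copied.
    Hypothetical helper, for illustration only.
    """
    if length <= 0:
        return []

    end = addr + length
    # First page boundary at or after addr (round up).
    first_full = -(-addr // page_size) * page_size
    # Last page boundary at or before end (round down).
    last_full = (end // page_size) * page_size

    # No page is fully covered: the whole buffer goes through the copy path.
    if first_full >= last_full:
        return [("bounce", addr, length)]

    segments = []
    if addr < first_full:
        segments.append(("bounce", addr, first_full - addr))
    segments.append(("map", first_full, last_full - first_full))
    if last_full < end:
        segments.append(("bounce", last_full, end - last_full))
    return segments


if __name__ == "__main__":
    # An unaligned buffer spanning several pages: partial head and tail
    # are bounced, the two fully covered pages in the middle are mapped.
    for segment in split_for_zerocopy(100, 3 * PAGE_SIZE):
        print(segment)
```

With this split, a small packet buffer never yields a "map" segment, which is consistent with the observation above that small per-packet buffers benefit less from this approach than large writes.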