On Mon, Dec 23, 2024 at 6:03 AM Jason Wang <jasowang@xxxxxxxxxx> wrote: > > On Sat, Dec 21, 2024 at 9:44 PM Stefan Hajnoczi <stefanha@xxxxxxxxx> wrote: > > > > On Fri, 20 Dec 2024 at 10:56, Timos Ampelikiotis > > <t.ampelikiotis@xxxxxxxxxxxxxxxxxxxxxx> wrote: > > > > > > > > > > > > On Wed, Dec 4, 2024 at 7:18 PM Stefan Hajnoczi <stefanha@xxxxxxxxx> wrote: > > >> > > >> On Wed, 4 Dec 2024 at 10:38, <t.ampelikiotis@xxxxxxxxxxxxxxxxxxxxxx> wrote: > > >> > > > >> > From: Timos Ampelikiotis <t.ampelikiotis@xxxxxxxxxxxxxxxxxxxxxx> > > >> > > > >> > This commit, is based on virtio MMIO driver, adds support > > >> > for dynamic allocated (platform) virtio devices. This > > >> > allows applications running in native environments to use > > >> > virtio drivers as a HAL and eventually communicate with > > >> > user-space drivers (implementing the vhost-user protocol). > > >> > [...] > > > Security wise this approach has to fulfill the following two > > > requirements: > > > > > > a) The first one is that we need to check efficiently if > > > the calling process (vhost-user device) can have access to > > > the requested pages. For example, assuming that we run a virtio > > > blk on top of virtio-loopback transport the vhost-user blk > > > should be able to access only pages which are related with > > > virtio-blk virtqueues and vrings. That challenge is solvable > > > and we can add those checks without changing the architecture. > > > > Once pages have been faulted in, how does the kernel revoke access > > when the request is complete? I'm thinking of the scenario where a > > page is used for virtio DMA but then later reused for other purposes. > > Userspace must not retain access once the page is reused for non-DMA > > purposes. > > I guess it requires madvise(MADV_DONTNEED) or something similar. > Yes indeed, maybe "zap_vma_ptes" could be used when "vduse_domain_unmap_bounce_page" is called (for zero-copy cases). In this way, the driver itself would remove pages from the application to be reused for non-DMA purposes and since the application has a page_fault handler, it can obtain access on that page again if it is needed. > > > > > > b) The second one is the fact that the current memory mapping > > > approach cannot guarantee safety for the remaining data on a page > > > when the virtio buffers do not cover the whole size of it. > > > In order to address that security issue, we should implement > > > an approach such the one proposed by Jason Wang below: > > > https://github.com/jasowang/net/tree/vduse-zerocopy > > > > > > In this way we utilize the bouncing buffer idea and we can > > > guarantee to not expose the remaining data when buffer_size < > > > PAGE_SIZE and insert the page into the process otherwise. > > > > > > Overall, we are aware of those two security points, the first > > > one will require the implementation of additional checks and > > > the second the modification of data sharing model to avoid > > > exposing kernel data. > > > > > > If you think that the overall approach is interesting we > > > continue the discussion and target any future work to address > > > the above described challenges. > > > > With regards to the overall approach, please work with Jason since > > there is an overlap with VDUSE. If you guys agree on how VDUSE and > > loopback fit together, then I'm happy. > > > Right, I think it would be better if we can find a way in VDUSE first > as a lot of works/codes could be reused. While investigating your newest zero-copy PoC on vDUSE with bouncing buffer I integrated the very same mechanism (iova_domain.c) in virtio-loopback and tested it with vhost-user-blk and vhost-user-can devices. In my opinion this combination has the following benefits: a) This implementation addresses the majority of security concerns related to sharing page which might contain other kernel data with the user-space. b) The new implementation of virtio-loopback adapter can leverage the bouncing buffer and support vhost-user-blk and CAN. Support for devices like vhost-user-sound, gpio, rng, input, console should be in place with no effort but there are not tested yet. c) By making loopback work with bouncing buffer and not introducing changes to iova_domain.c we can consider that virtio-loopback is a shorter path to support the above referenced vhost-user devs. Links to the new virtio-loopback implementation: - https://gerrit.automotivelinux.org/gerrit/gitweb?p=src/virtio/virtio-loopback-adapter.git;a=commit;h=269f019fde391bffdfbd42dee45c0cc8721e8f4f - https://gerrit.automotivelinux.org/gerrit/gitweb?p=src/virtio/virtio-loopback-driver.git;a=commit;h=26138fa91896b50748b3c17429d707ababf41cad Last point related to the vDUSE zc PoC, to fix the "madvice" issue and improve performance, I think it is worth to consider: 1) utilizing "zap_vma_ptes" in order to avoid madvice and be able to remove pages from user-space when are not used anymore by the corresponding virtio-driver. 2) extending the zero-copy mechanism for more than "DMA_TO_DEVICE" cases Thanks, Timos