Tuesday, January 21, 2020 10:35 AM, Jason Wang:
> Subject: Re: [PATCH 3/5] vDPA: introduce vDPA bus
>
>
> On 2020/1/21 4:15 PM, Michael S. Tsirkin wrote:
> > On Tue, Jan 21, 2020 at 04:00:38PM +0800, Jason Wang wrote:
> >> On 2020/1/21 1:47 PM, Michael S. Tsirkin wrote:
> >>> On Tue, Jan 21, 2020 at 12:00:57PM +0800, Jason Wang wrote:
> >>>> On 2020/1/21 1:49 AM, Jason Gunthorpe wrote:
> >>>>> On Mon, Jan 20, 2020 at 04:43:53PM +0800, Jason Wang wrote:
> >>>>>> This is similar to the design of platform IOMMU part of
> >>>>>> vhost-vdpa. We decide to send diffs to platform IOMMU there. If
> >>>>>> it's ok to do that in driver, we can replace set_map with incremental
> >>>>>> API like map()/unmap().
> >>>>>>
> >>>>>> Then driver need to maintain rbtree itself.
> >>>>> I think we really need to see two modes, one where there is a
> >>>>> fixed translation without dynamic vIOMMU driven changes and one
> >>>>> that supports vIOMMU.
> >>>> I think in this case, you meant the method proposed by Shahaf that
> >>>> sends diffs of "fixed translation" to device?
> >>>>
> >>>> It would be kind of tricky to deal with the following case for example:
> >>>>
> >>>> old map [4G, 16G) new map [4G, 8G)
> >>>>
> >>>> If we do
> >>>>
> >>>> 1) flush [4G, 16G)
> >>>> 2) add [4G, 8G)
> >>>>
> >>>> There could be a window between 1) and 2).
> >>>>
> >>>> It requires the IOMMU that can do
> >>>>
> >>>> 1) remove [8G, 16G)
> >>>> 2) flush [8G, 16G)
> >>>> 3) change [4G, 8G)
> >>>>
> >>>> ....
> >>> Basically what I had in mind is something like qemu memory api
> >>>
> >>> 0. begin
> >>> 1. remove [8G, 16G)
> >>> 2. add [4G, 8G)
> >>> 3. commit
> >>
> >> This sounds more flexible e.g driver may choose to implement static
> >> mapping one through commit. But a question here, it looks to me this
> >> still requires the DMA to be synced with at least commit here.
> >> Otherwise device may get DMA fault? Or device is expected to be paused
> >> DMA during begin?
> >>
> >> Thanks
> > For example, commit might switch one set of tables for another,
> > without need to pause DMA.
>
>
> Yes, I think that works but need confirmation from Shahaf or Jason.

From my side, as I wrote, I would like to see the suggested function
prototype along w/ the definition of the expectation from driver upon
calling those.
It is not 100% clear to me what should be the outcome of
remove/flush/change/commit

>
> Thanks
>
>
> >
> >
> >>> Anyway, I'm fine with a one-shot API for now, we can improve it
> >>> later.
> >>>
> >>>>> There are different optimization goals in the drivers for these
> >>>>> two configurations.
> >>>>>
> >>>>>>> If the first one, then I think memory hotplug is a heavy flow
> >>>>>>> regardless. Do you think the extra cycles for the tree traverse
> >>>>>>> will be visible in any way?
> >>>>>> I think if the driver can pause the DMA during the time for
> >>>>>> setting up new mapping, it should be fine.
> >>>>> This is very tricky for any driver if the mapping change hits the
> >>>>> virtio rings. :(
> >>>>>
> >>>>> Even a IOMMU using driver is going to have problems with that..
> >>>>>
> >>>>> Jason
> >>>> Or I wonder whether ATS/PRI can help here. E.g during I/O page
> >>>> fault, driver/device can wait for the new mapping to be set and
> >>>> then replay the DMA.
> >>>>
> >>>> Thanks
> >>>>
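
For illustration only, the begin/remove/add/commit sequence discussed in this thread could be modelled in user space roughly as below. This is a sketch under stated assumptions, not a proposed prototype for the vDPA bus: every name (struct dev_map, map_begin, map_remove, map_add, map_commit, map_lookup, demo_shrink) is hypothetical, the two-table switch is just one possible reading of "commit might switch one set of tables for another", and having add replace any overlapping range is an assumed semantic that the thread leaves open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 16
#define GiB (1ULL << 30)

/* All names below are hypothetical; none come from the vDPA patch. */
struct range { uint64_t start, end; };            /* IOVA range [start, end) */
struct table { struct range r[MAX_RANGES]; size_t n; };

struct dev_map {
    struct table tbl[2];  /* one live table, one staging table */
    int active;           /* index of the table the "device" reads */
};

static struct table *staged(struct dev_map *m) { return &m->tbl[m->active ^ 1]; }

/* 0. begin: stage changes on a private copy of the live table. */
static void map_begin(struct dev_map *m)
{
    *staged(m) = m->tbl[m->active];
}

/* Append one range to a table; returns -1 when the table is full. */
static int table_push(struct table *t, uint64_t start, uint64_t end)
{
    if (t->n == MAX_RANGES)
        return -1;
    t->r[t->n++] = (struct range){ start, end };
    return 0;
}

/* remove: drop [start, end) from the staged table, trimming or splitting. */
static int map_remove(struct dev_map *m, uint64_t start, uint64_t end)
{
    struct table *t = staged(m);
    struct table out = { .n = 0 };

    assert(start < end);
    for (size_t i = 0; i < t->n; i++) {
        struct range c = t->r[i];

        if (c.end <= start || c.start >= end) {       /* no overlap: keep */
            if (table_push(&out, c.start, c.end))
                return -1;
            continue;
        }
        if (c.start < start && table_push(&out, c.start, start))
            return -1;                                 /* left remainder */
        if (c.end > end && table_push(&out, end, c.end))
            return -1;                                 /* right remainder */
    }
    *t = out;
    return 0;
}

/* add: install [start, end) in the staged table, replacing any overlap
 * (an assumed semantic; the thread does not define it). */
static int map_add(struct dev_map *m, uint64_t start, uint64_t end)
{
    if (map_remove(m, start, end))
        return -1;
    return table_push(staged(m), start, end);
}

/* commit: publish the staged table with a single index switch. */
static void map_commit(struct dev_map *m)
{
    m->active ^= 1;
}

/* What the "device" sees: is addr mapped in the live table? */
static bool map_lookup(const struct dev_map *m, uint64_t addr)
{
    const struct table *t = &m->tbl[m->active];

    for (size_t i = 0; i < t->n; i++)
        if (addr >= t->r[i].start && addr < t->r[i].end)
            return true;
    return false;
}

/* Shrink [4G, 16G) to [4G, 8G); returns 0 if no unmapped window appears. */
static int demo_shrink(void)
{
    struct dev_map m = { .active = 0 };

    m.tbl[0].n = 1;
    m.tbl[0].r[0] = (struct range){ 4 * GiB, 16 * GiB };
    map_begin(&m);
    if (map_remove(&m, 8 * GiB, 16 * GiB) || map_add(&m, 4 * GiB, 8 * GiB))
        return -1;
    /* Before commit the device still sees the whole old map. */
    if (!map_lookup(&m, 5 * GiB) || !map_lookup(&m, 12 * GiB))
        return -1;
    map_commit(&m);
    /* After commit [4G, 8G) is still mapped and [8G, 16G) is gone. */
    return (map_lookup(&m, 5 * GiB) && !map_lookup(&m, 12 * GiB)) ? 0 : -1;
}
```

In this model [4G, 8G) stays translatable at every step, which is the property the flush-then-add sequence lacks. It says nothing about DMA already in flight against [8G, 16G) at commit time, nor about flushing cached translations; those are the open questions raised above.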