> -----Original Message-----
> From: Shameerali Kolothum Thodi
> Sent: Wednesday, April 17, 2024 9:01 AM
> To: 'Jason Gunthorpe' <jgg@xxxxxxxxxx>; Nicolin Chen <nicolinc@xxxxxxxxxx>
> Cc: will@xxxxxxxxxx; robin.murphy@xxxxxxx; joro@xxxxxxxxxx;
> thierry.reding@xxxxxxxxx; vdumpa@xxxxxxxxxx; jonathanh@xxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; iommu@xxxxxxxxxxxxxxx;
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-tegra@xxxxxxxxxxxxxxx;
> Jerry Snitselaar <jsnitsel@xxxxxxxxxx>
> Subject: RE: [PATCH v5 0/6] Add Tegra241 (Grace) CMDQV Support (part 1/2)
>
> > -----Original Message-----
> > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Sent: Monday, April 15, 2024 6:14 PM
> > To: Nicolin Chen <nicolinc@xxxxxxxxxx>
> > Cc: will@xxxxxxxxxx; robin.murphy@xxxxxxx; joro@xxxxxxxxxx;
> > thierry.reding@xxxxxxxxx; vdumpa@xxxxxxxxxx; jonathanh@xxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; iommu@xxxxxxxxxxxxxxx;
> > linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-tegra@xxxxxxxxxxxxxxx;
> > Jerry Snitselaar <jsnitsel@xxxxxxxxxx>
> > Subject: Re: [PATCH v5 0/6] Add Tegra241 (Grace) CMDQV Support (part 1/2)
> >
> > On Fri, Apr 12, 2024 at 08:43:48PM -0700, Nicolin Chen wrote:
> >
> > > The user-space support provides uAPIs (via IOMMUFD) for hypervisors
> > > in user space to pass VCMDQs through to VMs, allowing these VMs to
> > > access the VCMDQs directly without trapping, i.e. no VM Exits. This
> > > gives huge performance improvements: 70% to 90% reductions in TLB
> > > invalidation time were measured by various DMA unmap tests running
> > > in a guest OS, compared to a nested SMMU CMDQ (with trapping).
> >
> > So everyone is on the same page, this is the primary point of this
> > series. The huge speed-up of in-VM performance is necessary for the
> > workloads this chip is expected to be running. This series is unique
> > among the rest because it runs inside a VM, often in the form of a
> > distro release.
> >
> > It doesn't need the other series or its own part 2, as it stands
> > entirely alone on bare-metal hardware or on top of commercial VM
> > cloud instances running who-knows-what in their hypervisors.
> >
> > The other parts are substantially about enabling qemu and the open
> > ecosystem to have fully functional vSMMUv3 virtualization.
>
> Hi,
>
> We do have plans to revive the SMMUv3 ECMDQ series posted a while
> back[0], and looking at this series, I am wondering whether it makes
> sense to have a similar one for ECMDQ as well. I see that the NVIDIA
> VCMDQ has a special bit to restrict the commands that can be issued
> from user space. If we end up assigning an ECMDQ to user space, is
> there any potential risk in doing so?
>
> The SMMUv3 spec does say,
> "Arm expects that the Non-secure Stream table, Command queue, Event
> queue and PRI queue are controlled by the most privileged Non-secure
> system software."
>
> It is not clear to me what the major concerns are here, and maybe we
> can come up with something in the kernel to address them.

Just to add to that: one idea could be that, when ECMDQs are detected,
we use them for issuing a limited set of commands (like stage-1 TLBIs)
and use the normal CMDQ for the rest. Since we use stage 1 both for the
host and for guest nested cases, and TLBIs are the bottleneck in most
cases, I think this should give performance benefits. A rough sketch of
both ideas follows below.

Thanks,
Shameer
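
To make the first point concrete, here is a minimal sketch of what a
VCMDQ-style command restriction amounts to in software: an opcode
allowlist that admits only stage-1 (non-hypervisor) invalidations and
CMD_SYNC. The opcode values below follow the SMMUv3 spec (the Linux
driver carries them as CMDQ_OP_* in the arm-smmu-v3 headers); the
helper name and the exact allowlist are illustrative only, not taken
from either series.

#include <stdbool.h>
#include <stdint.h>

/* The SMMUv3 command opcode lives in bits [7:0] of command word 0. */
#define CMDQ_0_OP_MASK        0xffUL

/* Opcode encodings per the SMMUv3 spec. */
#define CMDQ_OP_TLBI_NH_ALL   0x10
#define CMDQ_OP_TLBI_NH_ASID  0x11
#define CMDQ_OP_TLBI_NH_VA    0x12
#define CMDQ_OP_TLBI_NH_VAA   0x13
#define CMDQ_OP_CMD_SYNC      0x46

/*
 * Return true if a command is safe to accept from a less privileged
 * source: stage-1 TLB invalidations and CMD_SYNC. Anything touching
 * the stream table, stage 2, or device state (CFGI_*, TLBI_S2_*,
 * ATC_INV, PRI responses) is rejected.
 */
bool vcmdq_cmd_allowed(uint64_t cmd0)
{
	switch (cmd0 & CMDQ_0_OP_MASK) {
	case CMDQ_OP_TLBI_NH_ALL:
	case CMDQ_OP_TLBI_NH_ASID:
	case CMDQ_OP_TLBI_NH_VA:
	case CMDQ_OP_TLBI_NH_VAA:
	case CMDQ_OP_CMD_SYNC:
		return true;
	default:
		return false;
	}
}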
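
And a sketch of the routing idea in the follow-up, reusing
vcmdq_cmd_allowed() from above: stage-1 TLBIs (the hot path) are
steered to a per-CPU ECMDQ when one is available, while everything
else stays on the shared main CMDQ. The helper names here
(smmu_has_ecmdq(), this_cpu_ecmdq(), main_cmdq()) are hypothetical
stand-ins; a real driver batches commands and handles queue-full and
wrap conditions, which are omitted.

#include <stdbool.h>
#include <stdint.h>

struct arm_smmu_device;              /* the driver's device structure */
struct smmu_queue;                   /* stand-in for a command queue  */

/* Hypothetical helpers a real implementation would provide. */
bool smmu_has_ecmdq(struct arm_smmu_device *smmu);
struct smmu_queue *this_cpu_ecmdq(struct arm_smmu_device *smmu);
struct smmu_queue *main_cmdq(struct arm_smmu_device *smmu);
bool vcmdq_cmd_allowed(uint64_t cmd0);    /* from the sketch above */

/*
 * Pick the queue for one command: the per-CPU ECMDQ for stage-1
 * invalidations, the main CMDQ for privileged commands (CFGI_*,
 * stage-2 TLBIs, PRI/event control).
 */
static struct smmu_queue *pick_cmdq(struct arm_smmu_device *smmu,
				    uint64_t cmd0)
{
	if (smmu_has_ecmdq(smmu) && vcmdq_cmd_allowed(cmd0))
		return this_cpu_ecmdq(smmu);

	return main_cmdq(smmu);
}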