> From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> Sent: Wednesday, May 29, 2024 4:23 AM
>
> On Mon, May 27, 2024 at 01:08:43AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > Sent: Friday, May 24, 2024 9:19 PM
> > >
> > > On Fri, May 24, 2024 at 07:13:23AM +0000, Tian, Kevin wrote:
> > > > I'm curious to learn the real reason of that design. Is it because you
> > > > want to do certain load-balance between viommu's or due to other
> > > > reasons in the kernel smmuv3 driver which e.g. cannot support a
> > > > viommu spanning multiple pSMMU?
> > >
> > > Yeah, there is no concept of support for a SMMUv3 instance where it's
> > > command Q's can only work on a subset of devices.
> > >
> > > My expectation was that VIOMMU would be 1:1 with physical iommu
> > > instances, I think AMD needs this too??
> >
> >
> > Yes this part is clear now regarding to VCMDQ.
> >
> > But Nicoline said:
> >
> > "
> > One step back, even without VCMDQ feature, a multi-pSMMU setup
> > will have multiple viommus (with our latest design) being added
> > to a viommu list of a single vSMMU's. Yet, vSMMU in this case
> > always traps regular SMMU CMDQ, so it can do viommu selection
> > or even broadcast (if it has to).
> > "
> >
> > I don't think there is an arch limitation mandating that?
>
> What I mean is for regular vSMMU. Without VCMDQ, a regular vSMMU
> on a multi-pSMMU setup will look like (e.g. three devices behind
> different SMMUs):
>     |<------ VMM ------->|<------ kernel ------>|
>            |-- viommu0 --|-- pSMMU0 --|
>     vSMMU--|-- viommu1 --|-- pSMMU1 --|--s2_hwpt
>            |-- viommu2 --|-- pSMMU2 --|
>
> And device would attach to:
>     |<---- guest ---->|<--- VMM --->|<- kernel ->|
>            |-- dev0 --|-- viommu0 --|-- pSMMU0 --|
>     vSMMU--|-- dev1 --|-- viommu1 --|-- pSMMU1 --|
>            |-- dev2 --|-- viommu2 --|-- pSMMU2 --|
>
> When trapping a device cache invalidation: it is straightforward
> by deciphering the virtual device ID to pick the viommu that the
> device is attached to.

I understand how the above works.

My question is why that option was chosen instead of going with a 1:1
mapping between vSMMU and viommu, i.e. letting the kernel figure out
which pSMMU an invalidation cmd should be sent to, the way VT-d is
virtualized.

I want to know whether this is done simply to be compatible with what
VCMDQ requires, or due to another untold reason.

>
> When doing iotlb invalidation, a command may or may not contain
> an ASID (a domain ID, and nested domain in this case):
> a) if a command doesn't have an ASID, VMM needs to broadcast the
>    command to all viommus (i.e. pSMMUs)
> b) if a command has an ASID, VMM needs to initially maintain an
>    S1 HWPT list by linking an ASID when adding an HWPT entry to
>    the list, by deciphering vSTE and its linked CD. Then it needs
>    to go through the S1 list with the ASID in the command, and to
>    find all corresponding HWPTs to issue/broadcast the command.
>
> Thanks
> Nicolin
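
To make sure we are talking about the same thing, here is a rough
sketch of the VMM-side routing as I read it from the description
above: a device cache invalidation picks one viommu from the virtual
device ID, an iotlb invalidation without an ASID is broadcast, and one
with an ASID walks the S1 HWPT list. Every name here (struct vsmmu,
vsmmu_attach(), route_iotlb_inval(), the fixed-size tables, the bitmask
return value) is made up for illustration and is not taken from QEMU
or iommufd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VIOMMU 3 /* one viommu per pSMMU, as in the diagrams */
#define MAX_VDEV   8 /* hypothetical limit on virtual device IDs */
#define MAX_S1     8 /* hypothetical limit on tracked S1 HWPTs */

/* Hypothetical VMM bookkeeping for a single vSMMU. */
struct vsmmu {
	int vdev_to_viommu[MAX_VDEV]; /* viommu index, or -1 if unused */
	struct {
		uint16_t asid;
		int viommu;
	} s1[MAX_S1];                 /* S1 HWPT list keyed by ASID */
	size_t nr_s1;
};

void vsmmu_init(struct vsmmu *s)
{
	for (size_t i = 0; i < MAX_VDEV; i++)
		s->vdev_to_viommu[i] = -1;
	s->nr_s1 = 0;
}

/* Record that the device with virtual ID @vdev sits behind @viommu. */
void vsmmu_attach(struct vsmmu *s, uint32_t vdev, int viommu)
{
	assert(vdev < MAX_VDEV && viommu >= 0 && viommu < MAX_VIOMMU);
	s->vdev_to_viommu[vdev] = viommu;
}

/* Link @asid to an S1 HWPT on @viommu (after decoding vSTE and CD). */
void vsmmu_add_s1(struct vsmmu *s, uint16_t asid, int viommu)
{
	assert(s->nr_s1 < MAX_S1 && viommu >= 0 && viommu < MAX_VIOMMU);
	s->s1[s->nr_s1].asid = asid;
	s->s1[s->nr_s1].viommu = viommu;
	s->nr_s1++;
}

/* Device cache invalidation: bitmask with the one owning viommu, 0 if unknown. */
unsigned int route_dev_inval(const struct vsmmu *s, uint32_t vdev)
{
	if (vdev >= MAX_VDEV || s->vdev_to_viommu[vdev] < 0)
		return 0;
	return 1u << s->vdev_to_viommu[vdev];
}

/* IOTLB invalidation: bitmask of viommus that must receive the command. */
unsigned int route_iotlb_inval(const struct vsmmu *s, bool has_asid,
			       uint16_t asid)
{
	unsigned int mask = 0;

	if (!has_asid) /* case a): broadcast to all viommus */
		return (1u << MAX_VIOMMU) - 1;

	/* case b): walk the S1 list and collect every match */
	for (size_t i = 0; i < s->nr_s1; i++)
		if (s->s1[i].asid == asid)
			mask |= 1u << s->s1[i].viommu;
	return mask;
}
```

With dev0..dev2 attached to viommu0..viommu2 and ASID 5 linked on
viommu0 and viommu2, route_iotlb_inval(&s, true, 5) would select
viommu0 and viommu2 only, while a command without an ASID goes to all
three. All of this lookup sits in the VMM, which is the part I am
asking about.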