On Tue, Apr 30, 2024 at 02:06:55PM -0300, Jason Gunthorpe wrote:
> On Mon, Apr 29, 2024 at 09:43:49PM -0700, Nicolin Chen wrote:
> > -struct arm_smmu_cmdq *tegra241_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > +static bool tegra241_vintf_support_cmds(struct tegra241_vintf *vintf,
> > +					u64 *cmds, int n)
> > +{
> > +	int i;
> > +
> > +	/* VINTF owned by hypervisor can execute any command */
> > +	if (vintf->hyp_own)
> > +		return true;
> > +
> > +	/* Guest-owned VINTF must check against the list of supported CMDs */
> > +	for (i = 0; i < n; i++) {
> > +		switch (FIELD_GET(CMDQ_0_OP, cmds[i * CMDQ_ENT_DWORDS])) {
> > +		case CMDQ_OP_TLBI_NH_ASID:
> > +		case CMDQ_OP_TLBI_NH_VA:
> > +		case CMDQ_OP_ATC_INV:
>
> So CMDQ only works if not ARM_SMMU_FEAT_E2H? Probably worth mentioning
> that too along with the discussion about HYP

Nod. EL2/EL3 commands aren't supported. And they aren't supposed to be
issued by a guest either, since ARM64_HAS_VIRT_HOST_EXTN is the feature
of "Virtualization Host Extensions"?

> > +			continue;
> > +		default:
> > +			return false;
> > +		}
> > +	}
> > +
> > +	return true;
> > +}
>
> For a performance path this looping seems disappointing.. The callers
> don't actually mix different command types. Is there something
> preventing adding a parameter at the callers?

The callers don't seem to mix commands at the moment. Yet we would have
to be extra careful about any future SMMU patch that might mix commands?

> Actually looking at this more closely, isn't the command q selection
> in the wrong place?
>
> Ie this batch stuff:
>
> static void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
> 				    struct arm_smmu_cmdq_batch *cmds,
> 				    struct arm_smmu_cmdq_ent *cmd)
> {
> 	int index;
>
> 	if (cmds->num == CMDQ_BATCH_ENTRIES - 1 &&
> 	    (smmu->options & ARM_SMMU_OPT_CMDQ_FORCE_SYNC)) {
> 		arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, true);
> 		cmds->num = 0;
> 	}
>
> 	if (cmds->num == CMDQ_BATCH_ENTRIES) {
> 		arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, false);
> 		cmds->num = 0;
> 	}
>
> 	index = cmds->num * CMDQ_ENT_DWORDS;
> 	if (unlikely(arm_smmu_cmdq_build_cmd(&cmds->cmds[index], cmd))) {
> 		dev_warn(smmu->dev, "ignoring unknown CMDQ opcode 0x%x\n",
> 			 cmd->opcode);
> 		return;
> 	}
>
> Has to push everything, across all the iterations of add/submit, onto
> the same CMDQ otherwise the SYNC won't be properly flushing?

ECMDQ seems to have such a limitation, but VCMDQs can get away with it,
as the HW can insert a SYNC into a queue that doesn't end with one.

> But each arm_smmu_cmdq_issue_cmdlist() calls its own get q
> function. Yes, they probably return the same Q since we are probably
> on the same CPU, but it seems logically wrong (and slower!) to
> organize it like this.
>
> I would expect the Q to be selected when the struct
> arm_smmu_cmdq_batch is allocated on the stack, and be the same for the
> entire batch operation. Not only do we spend less time trying to
> compute the Q to use, we have a built-in guarantee that every command
> will be on the same Q as the fencing SYNC.

This seems to be helpful to ECMDQ. Its current version disables
preemption, which feels costly to me.
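Maybe the batch could cache the Q at init time then? A very rough sketch
of what I am picturing (untested; arm_smmu_cmdq_batch_init() is a made-up
helper that all the callers would have to invoke before the first add,
and it assumes arm_smmu_get_cmdq() is reworked to take a representative
arm_smmu_cmdq_ent instead of the raw command array):

struct arm_smmu_cmdq_batch {
	u64 cmds[CMDQ_BATCH_ENTRIES * CMDQ_ENT_DWORDS];
	struct arm_smmu_cmdq *cmdq;	/* fixed for the entire batch */
	int num;
};

static void arm_smmu_cmdq_batch_init(struct arm_smmu_device *smmu,
				     struct arm_smmu_cmdq_batch *cmds,
				     struct arm_smmu_cmdq_ent *ent)
{
	cmds->num = 0;
	/* Select the Q once, so the fencing SYNC lands on the same Q */
	cmds->cmdq = arm_smmu_get_cmdq(smmu, ent);
}

Then arm_smmu_cmdq_batch_add()/arm_smmu_cmdq_batch_submit() would pass
cmds->cmdq down to arm_smmu_cmdq_issue_cmdlist() instead of having it
call its own get function on every flush.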
> Something sort of like this as another patch?
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 268da20baa4e9c..d8c9597878315a 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -357,11 +357,22 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
> 	return 0;
> }
>
> -static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu,
> -					       u64 *cmds, int n)
> +enum required_cmds {
> +	CMDS_ALL,
> +	/*
> +	 * Commands will be one of:
> +	 * CMDQ_OP_ATC_INV, CMDQ_OP_TLBI_EL2_VA, CMDQ_OP_TLBI_NH_VA,
> +	 * CMDQ_OP_TLBI_EL2_ASID, CMDQ_OP_TLBI_NH_ASID, CMDQ_OP_TLBI_S2_IPA,
> +	 * CMDQ_OP_TLBI_S12_VMALL, CMDQ_OP_SYNC
> +	 */
> +	CMDS_INVALIDATION,
> +};

Hmm, guest-owned VCMDQs don't support EL2 commands, so it feels somewhat
complicated to decouple them further in the callers of
arm_smmu_cmdq_batch_add(). And I am not sure there is a use case of a
guest issuing CMDQ_OP_TLBI_S2_IPA/CMDQ_OP_TLBI_S12_VMALL either, though
HW surprisingly supports those two.

Perhaps we could just scan the first command in the batch, trusting that
no one will covertly sneak different commands into it? Otherwise, there
has to be a get_supported_cmdq callback so that the batch or its callers
can avoid adding unsupported commands in the first place.

Thanks
Nicolin
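P.S. For reference, the first-command scan I am picturing would be
something like this (untested; tegra241_vintf_support_cmd() is a
single-opcode variant of tegra241_vintf_support_cmds() above):

static bool tegra241_vintf_support_cmd(struct tegra241_vintf *vintf,
				       u64 opcode)
{
	/* VINTF owned by hypervisor can execute any command */
	if (vintf->hyp_own)
		return true;

	/* Guest-owned VINTF is limited to the invalidation commands */
	switch (opcode) {
	case CMDQ_OP_TLBI_NH_ASID:
	case CMDQ_OP_TLBI_NH_VA:
	case CMDQ_OP_ATC_INV:
		return true;
	default:
		return false;
	}
}

Then tegra241_cmdqv_get_cmdq() would only test the leading command:

	/* Unsupported CMDs fall back to the default smmu->cmdq pathway */
	if (!tegra241_vintf_support_cmd(vintf,
					FIELD_GET(CMDQ_0_OP, cmds[0])))
		return NULL;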