On Tue, Apr 30, 2024 at 02:06:55PM -0300, Jason Gunthorpe wrote:
> On Mon, Apr 29, 2024 at 09:43:49PM -0700, Nicolin Chen wrote:
> > -struct arm_smmu_cmdq *tegra241_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > +static bool tegra241_vintf_support_cmds(struct tegra241_vintf *vintf,
> > +					u64 *cmds, int n)
> > +{
> > +	int i;
> > +
> > +	/* VINTF owned by hypervisor can execute any command */
> > +	if (vintf->hyp_own)
> > +		return true;
> > +
> > +	/* Guest-owned VINTF must check against the list of supported CMDs */
> > +	for (i = 0; i < n; i++) {
> > +		switch (FIELD_GET(CMDQ_0_OP, cmds[i * CMDQ_ENT_DWORDS])) {
> > +		case CMDQ_OP_TLBI_NH_ASID:
> > +		case CMDQ_OP_TLBI_NH_VA:
> > +		case CMDQ_OP_ATC_INV:
>
> So CMDQ only works if not ARM_SMMU_FEAT_E2H? Probably worth mentioning
> that too along with the discussion about HYP

Nod. EL2/EL3 commands aren't supported. And they aren't supposed to be
issued by a guest either, since ARM64_HAS_VIRT_HOST_EXTN is the feature
of "Virtualization Host Extensions"?

> > +			continue;
> > +		default:
> > +			return false;
> > +		}
> > +	}
> > +
> > +	return true;
> > +}
>
> For a performance path this looping seems disappointing.. The callers
> don't actually mix different command types. Is there something
> preventing adding a parameter at the callers?

The callers don't seem to mix commands at the moment. Yet we would have
to be extra careful about any future SMMU patch that might mix commands?

> Actually looking at this more closely, isn't the command q selection
> in the wrong place?
>
> Ie this batch stuff:
>
> static void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
> 				    struct arm_smmu_cmdq_batch *cmds,
> 				    struct arm_smmu_cmdq_ent *cmd)
> {
> 	int index;
>
> 	if (cmds->num == CMDQ_BATCH_ENTRIES - 1 &&
> 	    (smmu->options & ARM_SMMU_OPT_CMDQ_FORCE_SYNC)) {
> 		arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, true);
> 		cmds->num = 0;
> 	}
>
> 	if (cmds->num == CMDQ_BATCH_ENTRIES) {
> 		arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, false);
> 		cmds->num = 0;
> 	}
>
> 	index = cmds->num * CMDQ_ENT_DWORDS;
> 	if (unlikely(arm_smmu_cmdq_build_cmd(&cmds->cmds[index], cmd))) {
> 		dev_warn(smmu->dev, "ignoring unknown CMDQ opcode 0x%x\n",
> 			 cmd->opcode);
> 		return;
> 	}
>
> Has to push everything, across all the iterations of add/submit, onto
> the same CMDQ otherwise the SYNC won't be properly flushing?

ECMDQ seems to have such a limitation, but VCMDQs can get away with it,
as the HW can insert a SYNC into a queue that doesn't end with one.

> But each arm_smmu_cmdq_issue_cmdlist() calls its own get q
> function. Yes, they probably return the same Q since we are probably
> on the same CPU, but it seems logically wrong (and slower!) to
> organize it like this.
>
> I would expect the Q to be selected when the struct
> arm_smmu_cmdq_batch is allocated on the stack, and be the same for the
> entire batch operation. Not only do we spend less time trying to
> compute the Q to use, we have a built-in guarantee that every command
> will be on the same Q as the fencing SYNC.

This seems to be helpful to ECMDQ. Its current version disables
preemption, which feels costly to me.
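Maybe the batch could cache the Q at init time then? A very rough sketch
of what I am picturing (untested; arm_smmu_cmdq_batch_init() is a made-up
helper that all the callers would have to invoke before the first add,
and it assumes arm_smmu_get_cmdq() is reworked to take a representative
arm_smmu_cmdq_ent instead of the raw command array):

struct arm_smmu_cmdq_batch {
	u64 cmds[CMDQ_BATCH_ENTRIES * CMDQ_ENT_DWORDS];
	struct arm_smmu_cmdq *cmdq;	/* fixed for the entire batch */
	int num;
};

static void arm_smmu_cmdq_batch_init(struct arm_smmu_device *smmu,
				     struct arm_smmu_cmdq_batch *cmds,
				     struct arm_smmu_cmdq_ent *ent)
{
	cmds->num = 0;
	/* Select the Q once, so the fencing SYNC lands on the same Q */
	cmds->cmdq = arm_smmu_get_cmdq(smmu, ent);
}

Then arm_smmu_cmdq_batch_add()/arm_smmu_cmdq_batch_submit() would pass
cmds->cmdq down to arm_smmu_cmdq_issue_cmdlist() instead of having it
call its own get function on every flush.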
> Something sort of like this as another patch?
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 268da20baa4e9c..d8c9597878315a 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -357,11 +357,22 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
> 	return 0;
> }
>
> -static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu,
> -					       u64 *cmds, int n)
> +enum required_cmds {
> +	CMDS_ALL,
> +	/*
> +	 * Commands will be one of:
> +	 * CMDQ_OP_ATC_INV, CMDQ_OP_TLBI_EL2_VA, CMDQ_OP_TLBI_NH_VA,
> +	 * CMDQ_OP_TLBI_EL2_ASID, CMDQ_OP_TLBI_NH_ASID, CMDQ_OP_TLBI_S2_IPA,
> +	 * CMDQ_OP_TLBI_S12_VMALL, CMDQ_OP_SYNC
> +	 */
> +	CMDS_INVALIDATION,
> +};

Hmm, guest-owned VCMDQs don't support EL2 commands, so it feels somewhat
complicated to decouple them further in the callers of
arm_smmu_cmdq_batch_add(). And I am not sure there is a use case of a
guest issuing CMDQ_OP_TLBI_S2_IPA/CMDQ_OP_TLBI_S12_VMALL either, though
HW surprisingly supports those two.

Perhaps we could just scan the first command in the batch, trusting that
no one will covertly sneak different commands into it? Otherwise, there
has to be a get_supported_cmdq callback so that the batch or its callers
can avoid adding unsupported commands in the first place.

Thanks
Nicolin
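P.S. For reference, the first-command scan I am picturing would be
something like this (untested; tegra241_vintf_support_cmd() is a
single-opcode variant of tegra241_vintf_support_cmds() above):

static bool tegra241_vintf_support_cmd(struct tegra241_vintf *vintf,
				       u64 opcode)
{
	/* VINTF owned by hypervisor can execute any command */
	if (vintf->hyp_own)
		return true;

	/* Guest-owned VINTF is limited to the invalidation commands */
	switch (opcode) {
	case CMDQ_OP_TLBI_NH_ASID:
	case CMDQ_OP_TLBI_NH_VA:
	case CMDQ_OP_ATC_INV:
		return true;
	default:
		return false;
	}
}

Then tegra241_cmdqv_get_cmdq() would only test the leading command:

	/* Unsupported CMDs fall back to the default smmu->cmdq pathway */
	if (!tegra241_vintf_support_cmd(vintf,
					FIELD_GET(CMDQ_0_OP, cmds[0])))
		return NULL;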