On Thu, Dec 19, 2019 at 05:30:30PM +0100, Jean-Philippe Brucker wrote: > The SMMU can support up to 20 bits of SSID. Add a second level of page > tables to accommodate this. Devices that support more than 1024 SSIDs now > have a table of 1024 L1 entries (8kB), pointing to tables of 1024 context > descriptors (64kB), allocated on demand. > > Tested-by: Zhangfei Gao <zhangfei.gao@xxxxxxxxxx> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> > Signed-off-by: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx> > --- > drivers/iommu/arm-smmu-v3.c | 154 +++++++++++++++++++++++++++++++++--- > 1 file changed, 144 insertions(+), 10 deletions(-) > > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c > index b825a5639afc..bf106a7b53eb 100644 > --- a/drivers/iommu/arm-smmu-v3.c > +++ b/drivers/iommu/arm-smmu-v3.c > @@ -224,6 +224,7 @@ > > #define STRTAB_STE_0_S1FMT GENMASK_ULL(5, 4) > #define STRTAB_STE_0_S1FMT_LINEAR 0 > +#define STRTAB_STE_0_S1FMT_64K_L2 2 > #define STRTAB_STE_0_S1CTXPTR_MASK GENMASK_ULL(51, 6) > #define STRTAB_STE_0_S1CDMAX GENMASK_ULL(63, 59) > > @@ -263,7 +264,20 @@ > > #define STRTAB_STE_3_S2TTB_MASK GENMASK_ULL(51, 4) > > -/* Context descriptor (stage-1 only) */ > +/* > + * Context descriptors. > + * > + * Linear: when less than 1024 SSIDs are supported > + * 2lvl: at most 1024 L1 entries, > + * 1024 lazy entries per table. > + */ > +#define CTXDESC_SPLIT 10 > +#define CTXDESC_L2_ENTRIES (1 << CTXDESC_SPLIT) > + > +#define CTXDESC_L1_DESC_DWORDS 1 > +#define CTXDESC_L1_DESC_VALID 1 #define CTXDESC_L1_DESC_V (1UL << 0) fits better with the rest of the driver and also ensures that the thing is unsigned (we should probably switch over the BIT macros, but that's a separate cleanup patch). > +#define CTXDESC_L1_DESC_L2PTR_MASK GENMASK_ULL(51, 12) > + > #define CTXDESC_CD_DWORDS 8 > #define CTXDESC_CD_0_TCR_T0SZ GENMASK_ULL(5, 0) > #define ARM64_TCR_T0SZ GENMASK_ULL(5, 0) > @@ -575,7 +589,12 @@ struct arm_smmu_cd_table { > }; > > struct arm_smmu_s1_cfg { > - struct arm_smmu_cd_table table; > + /* Leaf tables or linear table */ > + struct arm_smmu_cd_table *tables; > + size_t num_tables; > + /* First level tables, when two levels are used */ > + __le64 *l1ptr; > + dma_addr_t l1ptr_dma; It probably feels like a nit, but I think this is all a little hard to read because it differs unnecessarily from the way the stream table is handled. Could we align the two as follows? (I've commented things with what they refer to in your patch): struct arm_smmu_l1_ctx_desc { // arm_smmu_cd_table __le64 *l2ptr; // ptr dma_addr_t l2ptr_dma; // ptr_dma }; struct arm_smmu_ctx_desc_cfg { __le64 *cdtab; // l1ptr dma_addr_t cdtab_dma; // l1ptr_dma struct arm_smmu_l1_ctx_desc *l1_desc; // tables unsigned int num_l1_ents; // num_tables }; struct arm_smmu_s1_cfg { struct arm_smmu_ctx_desc_cfg cdcfg; struct arm_smmu_ctx_desc cd; u8 s1fmt; u8 s1cdmax; }; I don't know whether you'd then want to move s1fmt and s1cdmax into the cdcfg, I'll leave that up to you. Similarly if you want any functions to operate on arm_smmu_ctx_desc_cfg in preference to arm_smmu_s1_cfg. Thoughts? Will