On Mon, Jun 24, 2024 at 8:14 AM Will Deacon <will@xxxxxxxxxx> wrote:
>
> On Thu, May 23, 2024 at 10:52:21AM -0700, Rob Clark wrote:
> > From: Rob Clark <robdclark@xxxxxxxxxxxx>
> >
> > Add an io-pgtable method to walk the pgtable returning the raw PTEs that
> > would be traversed for a given iova access.
> >
> > Signed-off-by: Rob Clark <robdclark@xxxxxxxxxxxx>
> > ---
> >  drivers/iommu/io-pgtable-arm.c | 51 ++++++++++++++++++++++++++++------
> >  include/linux/io-pgtable.h     |  4 +++
> >  2 files changed, 46 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index f7828a7aad41..f47a0e64bb35 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -693,17 +693,19 @@ static size_t arm_lpae_unmap_pages(struct io_pgtable_ops *ops, unsigned long iov
> >  				data->start_level, ptep);
> >  }
> >
> > -static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
> > -					 unsigned long iova)
> > +static int arm_lpae_pgtable_walk(struct io_pgtable_ops *ops, unsigned long iova,
> > +			int (*cb)(void *cb_data, void *pte, int level),
> > +			void *cb_data)
> >  {
> >  	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
> >  	arm_lpae_iopte pte, *ptep = data->pgd;
> >  	int lvl = data->start_level;
> > +	int ret;
> >
> >  	do {
> >  		/* Valid IOPTE pointer? */
> >  		if (!ptep)
> > -			return 0;
> > +			return -EFAULT;
>
> nit: -ENOENT might be a little better, as we're only checking against a
> NULL entry rather than strictly any faulting entry.
>
> >  		/* Grab the IOPTE we're interested in */
> >  		ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
> > @@ -711,22 +713,52 @@ static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
> >
> >  		/* Valid entry? */
> >  		if (!pte)
> > -			return 0;
> > +			return -EFAULT;
>
> Same here (and at the end of the function).
>
> > +
> > +		ret = cb(cb_data, &pte, lvl);
>
> Since pte is on the stack, rather than pointing into the actual pgtable,
> I think it would be clearer to pass it by value to the callback.

fwiw, I passed it as a void* to avoid the pte size.. although I guess
it could be a union of all the possible pte types

BR,
-R

> > +		if (ret)
> > +			return ret;
> >
> > -		/* Leaf entry? */
> > +		/* Leaf entry?  If so, we've found the translation */
> >  		if (iopte_leaf(pte, lvl, data->iop.fmt))
> > -			goto found_translation;
> > +			return 0;
> >
> >  		/* Take it to the next level */
> >  		ptep = iopte_deref(pte, data);
> >  	} while (++lvl < ARM_LPAE_MAX_LEVELS);
> >
> >  	/* Ran out of page tables to walk */
> > +	return -EFAULT;
> > +}
> > +
> > +struct iova_to_phys_walk_data {
> > +	arm_lpae_iopte pte;
> > +	int level;
> > +};
>
> Expanding a little on Robin's suggestion, why don't we drop this structure
> in favour of something more generic:
>
>         struct arm_lpae_walk_data {
>                 arm_lpae_iopte ptes[ARM_LPAE_MAX_LEVELS];
>         };
>
> and then do something in the walker like:
>
>         if (cb && !cb(pte, lvl))
>                 walk_data->ptes[lvl] = pte;
>
> which could return the physical address at the end, if it reaches a leaf
> entry. That way arm_lpae_iova_to_phys() is just passing a NULL callback
> to the walker and your debug callback just needs to return 0 (i.e. the
> callback is basically just saying whether or not to continue the walk).
>
> Will
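
For illustration only, here is a rough user-space sketch of the walker
shape suggested above: the pte is passed to the callback by value, every
visited pte is recorded per level, a NULL callback is allowed, and a
non-zero callback return stops the walk. This is not the io-pgtable-arm
code. Every toy_* name, the 3-level/4-entry layout, the 16-byte "page"
size and the flag bits are assumptions made up for the example, and
recording the pte regardless of whether a callback is supplied is one
possible reading of the snippet above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* All values below are hypothetical toy parameters, not the LPAE format. */
#define TOY_MAX_LEVELS     3
#define TOY_BITS_PER_LEVEL 2
#define TOY_ENTRIES        (1u << TOY_BITS_PER_LEVEL)
#define TOY_PAGE_SHIFT     4
#define TOY_PTE_VALID      1ull
#define TOY_PTE_LEAF       2ull

typedef uint64_t toy_pte;

/* Generic walk result: one raw pte per level visited. */
struct toy_walk_data {
	toy_pte ptes[TOY_MAX_LEVELS];
};

/* The callback only decides whether the walk continues (0) or stops. */
typedef int (*toy_walk_cb)(toy_pte pte, int level);

struct toy_tables {
	toy_pte l0[TOY_ENTRIES], l1[TOY_ENTRIES], l2[TOY_ENTRIES];
};

/* Number of iova bits translated below the given level. */
static unsigned toy_level_shift(int lvl)
{
	return TOY_PAGE_SHIFT + TOY_BITS_PER_LEVEL * (TOY_MAX_LEVELS - 1 - lvl);
}

static int toy_pgtable_walk(const toy_pte *pgd, uint64_t iova, toy_walk_cb cb,
			    struct toy_walk_data *wd, uint64_t *phys)
{
	const toy_pte *ptep = pgd;

	for (int lvl = 0; lvl < TOY_MAX_LEVELS; lvl++) {
		uint64_t mask = (1ull << toy_level_shift(lvl)) - 1;
		toy_pte pte;
		int ret;

		if (!ptep)
			return -ENOENT;

		pte = ptep[(iova >> toy_level_shift(lvl)) & (TOY_ENTRIES - 1)];
		if (!(pte & TOY_PTE_VALID))
			return -ENOENT;

		wd->ptes[lvl] = pte;		/* record the raw entry */

		ret = cb ? cb(pte, lvl) : 0;	/* NULL callback: keep going */
		if (ret)
			return ret;

		if (pte & TOY_PTE_LEAF) {	/* found the translation */
			*phys = (pte & ~mask) | (iova & mask);
			return 0;
		}

		/* Take it to the next level: strip the flag bits. */
		ptep = (const toy_pte *)(uintptr_t)(pte & ~3ull);
	}

	return -ENOENT;				/* ran out of levels */
}

/* Example callback that aborts the walk once level 1 is reached. */
static int toy_stop_at_level1(toy_pte pte, int level)
{
	(void)pte;
	return level == 1 ? -ECANCELED : 0;
}

/* Map iova 0x390..0x39f to phys 0x1230..0x123f through three levels. */
static void toy_build(struct toy_tables *t)
{
	*t = (struct toy_tables){ 0 };
	assert(((uintptr_t)t->l1 & 3) == 0 && ((uintptr_t)t->l2 & 3) == 0);
	t->l0[3] = (uintptr_t)t->l1 | TOY_PTE_VALID;
	t->l1[2] = (uintptr_t)t->l2 | TOY_PTE_VALID;
	t->l2[1] = 0x1230ull | TOY_PTE_VALID | TOY_PTE_LEAF;
}
```

With this shape, an iova_to_phys-style caller would pass a NULL callback
and read *phys on success, while a debug caller would pass a callback
that returns 0 and then inspect wd->ptes[] afterwards. Whether the
recording belongs in the walker or in the callback is a design choice
the sketch does not try to settle.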