On Fri, Apr 19, 2024 at 5:40 AM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> On Thu, Apr 18, 2024 at 09:32:23AM -0700, Tomasz Jeznach wrote:
> > @@ -31,13 +32,350 @@ MODULE_LICENSE("GPL");
> >  /* Timeouts in [us] */
> >  #define RISCV_IOMMU_DDTP_TIMEOUT	50000
> >
> > -static int riscv_iommu_attach_identity_domain(struct iommu_domain *domain,
> > -                                              struct device *dev)
> > +/* RISC-V IOMMU PPN <> PHYS address conversions, PHYS <=> PPN[53:10] */
> > +#define phys_to_ppn(va)  (((va) >> 2) & (((1ULL << 44) - 1) << 10))
> > +#define ppn_to_phys(pn)  (((pn) << 2) & (((1ULL << 44) - 1) << 12))
> > +
> > +#define dev_to_iommu(dev) \
> > +        container_of((dev)->iommu->iommu_dev, struct riscv_iommu_device, iommu)
>
> We have iommu_get_iommu_dev() now
>
> > +static unsigned long riscv_iommu_get_pages(struct riscv_iommu_device *iommu, unsigned int order)
> > +{
> > +        struct riscv_iommu_devres *devres;
> > +        struct page *pages;
> > +
> > +        pages = alloc_pages_node(dev_to_node(iommu->dev),
> > +                                 GFP_KERNEL_ACCOUNT | __GFP_ZERO, order);
> > +        if (unlikely(!pages)) {
> > +                dev_err(iommu->dev, "Page allocation failed, order %u\n", order);
> > +                return 0;
> > +        }
>
> This needs adjusting for the recently merged allocation accounting
>
> > +static int riscv_iommu_attach_domain(struct riscv_iommu_device *iommu,
> > +                                     struct device *dev,
> > +                                     struct iommu_domain *iommu_domain)
> > +{
> > +        struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > +        struct riscv_iommu_dc *dc;
> > +        u64 fsc, ta, tc;
> > +        int i;
> > +
> > +        if (!iommu_domain) {
> > +                ta = 0;
> > +                tc = 0;
> > +                fsc = 0;
> > +        } else if (iommu_domain->type == IOMMU_DOMAIN_IDENTITY) {
> > +                ta = 0;
> > +                tc = RISCV_IOMMU_DC_TC_V;
> > +                fsc = FIELD_PREP(RISCV_IOMMU_DC_FSC_MODE, RISCV_IOMMU_DC_FSC_MODE_BARE);
> > +        } else {
> > +                /* This should never happen. */
> > +                return -ENODEV;
> > +        }
>
> Please don't write it like this. This function is already being called
> by functions that are already under specific ops, don't check
> domain->type here.
>
> Instead have the caller compute and pass in the ta/tc/fsc
> values. Maybe in a tidy struct..
>
> > +        /* Update existing or allocate new entries in device directory */
> > +        for (i = 0; i < fwspec->num_ids; i++) {
> > +                dc = riscv_iommu_get_dc(iommu, fwspec->ids[i], !iommu_domain);
> > +                if (!dc && !iommu_domain)
> > +                        continue;
> > +                if (!dc)
> > +                        return -ENODEV;
>
> But if this fails some of the fwspecs were left in a weird state?
>
> Drivers should try hard to have attach functions that fail and make no
> change at all or fully succeed.
>
> Meaning ideally preallocate any required memory before doing any
> change to the HW visible structures.
>

Good point. Done.
Looking at the fwspec->ids[], I'm assuming nobody will add or modify
the IDs after iommu_probe_device() completes.

> > +
> > +                /* Swap device context, update TC valid bit as the last operation */
> > +                xchg64(&dc->fsc, fsc);
> > +                xchg64(&dc->ta, ta);
> > +                xchg64(&dc->tc, tc);
>
> This doesn't look right? When you get to adding PAGING support, fsc has
> the page table pfn and ta has the cache tag, so this will end up
> tearing the data for sure, e.g. when asked to replace a PAGING domain
> with another PAGING domain? That will create a functional/security
> problem, right?
>
> I would encourage you to re-use the ARM sequencing code, ideally moved
> to some generic helper library. Every iommu driver dealing with
> multi-quanta descriptors seems to have this same fundamental
> sequencing problem.
>

Good point. Reworked.
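
For context, the rework I have in mind is a plain break-before-make
sequence rather than a hitless update; riscv_iommu_update_dc() and
riscv_iommu_iodir_inv_devid() below are placeholder names, so treat
this only as a sketch of the ordering:

/*
 * Sketch: replace one device context without exposing a torn entry.
 * Invalidate first, rewrite the 64-bit words while the entry is
 * invalid, then set TC.V with the final store.
 */
static void riscv_iommu_update_dc(struct riscv_iommu_device *iommu,
                                  struct riscv_iommu_dc *dc,
                                  unsigned int devid, u64 fsc, u64 ta, u64 tc)
{
        /* 1. Clear the valid bit so the IOMMU never walks a partial DC. */
        WRITE_ONCE(dc->tc, 0);
        /* Make the tc clear visible to the device before the rewrites. */
        dma_wmb();

        /* 2. Drop any cached copy of the old device context (placeholder). */
        riscv_iommu_iodir_inv_devid(iommu, devid);

        /* 3. Rewrite the translation data while the entry is invalid. */
        WRITE_ONCE(dc->fsc, fsc);
        WRITE_ONCE(dc->ta, ta);
        dma_wmb();

        /* 4. Publish the new context: set TC.V as the last store. */
        WRITE_ONCE(dc->tc, tc | RISCV_IOMMU_DC_TC_V);
}

Not hitless like the ARM entry writer, so there is a short window where
the device faults during a replace, but no torn data is ever published;
moving the ARM sequencing into a shared helper can be a follow-up.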
> > +static void riscv_iommu_release_device(struct device *dev)
> > +{
> > +        struct riscv_iommu_device *iommu = dev_to_iommu(dev);
> > +
> > +        riscv_iommu_attach_domain(iommu, dev, NULL);
> > +}
>
> The release_domain has landed too now. Please don't invent weird NULL
> domain types that have special meaning. I assume clearing the V bit is
> a blocking behavior? So please implement a proper blocking domain and
> set release_domain = &riscv_iommu_blocking and just omit this release
> function.
>

Updated to use release_domain, should be cleaner now.
Clearing TC.V is a blocking (but noisy) behavior; that should be fine
for the release domain, where devices should be quiesced already.

> > @@ -133,12 +480,14 @@ int riscv_iommu_init(struct riscv_iommu_device *iommu)
> >  	rc = riscv_iommu_init_check(iommu);
> >  	if (rc)
> >  		return dev_err_probe(iommu->dev, rc, "unexpected device state\n");
> > -	/*
> > -	 * Placeholder for a complete IOMMU device initialization.
> > -	 * For now, only bare minimum: enable global identity mapping mode and register sysfs.
> > -	 */
> > -	riscv_iommu_writeq(iommu, RISCV_IOMMU_REG_DDTP,
> > -			   FIELD_PREP(RISCV_IOMMU_DDTP_MODE, RISCV_IOMMU_DDTP_MODE_BARE));
> > +
> > +	rc = riscv_iommu_ddt_alloc(iommu);
> > +	if (WARN(rc, "cannot allocate device directory\n"))
> > +		goto err_init;
>
> Memory allocation failure already makes noisy prints; more prints are
> not needed.
>
> > +	rc = riscv_iommu_set_ddtp_mode(iommu, RISCV_IOMMU_DDTP_MODE_MAX);
> > +	if (WARN(rc, "cannot enable iommu device\n"))
> > +		goto err_init;
>
> This is not a proper use of WARN; it should only be used for things
> that cannot happen, not for undesired error paths.
>
> Jason

Thanks, ack to all. Will push updated v3 shortly.

- Tomasz
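
PS: for the release_domain wiring I'm thinking of something along these
lines. Sketch only, not the final patch: the TC.V clearing will go
through the reworked update path discussed above, and the third
riscv_iommu_get_dc() argument just mirrors the lookup-only detach path
from the patch.

/* Sketch: static blocking domain, also installed as the release_domain. */
static int riscv_iommu_attach_blocking_domain(struct iommu_domain *domain,
                                              struct device *dev)
{
        struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
        struct riscv_iommu_device *iommu = dev_to_iommu(dev);
        struct riscv_iommu_dc *dc;
        int i;

        for (i = 0; i < fwspec->num_ids; i++) {
                /* Lookup only, never allocate new DDT levels here. */
                dc = riscv_iommu_get_dc(iommu, fwspec->ids[i], true);
                if (!dc)
                        continue;
                /* Clearing TC.V is the (noisy) blocking behavior. */
                WRITE_ONCE(dc->tc, 0);
        }
        /* IODIR invalidation for the affected device IDs goes here. */

        return 0;
}

static const struct iommu_domain_ops riscv_iommu_blocking_domain_ops = {
        .attach_dev = riscv_iommu_attach_blocking_domain,
};

static struct iommu_domain riscv_iommu_blocking_domain = {
        .type = IOMMU_DOMAIN_BLOCKED,
        .ops = &riscv_iommu_blocking_domain_ops,
};

riscv_iommu_ops then gets .release_domain = &riscv_iommu_blocking_domain
and the .release_device callback goes away.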