Hi Jörg,

Thanks for your review, I'll fix the issues you pointed out that I left
out of this reply.

On Mon, Mar 02, 2020 at 04:36:06PM +0100, Joerg Roedel wrote:
> On Thu, Feb 20, 2020 at 07:15:14PM +0100, Maxime Ripard wrote:
> > +struct sun50i_iommu_domain {
> > +	struct iommu_domain domain;
> > +
> > +	/* Number of devices attached to the domain */
> > +	refcount_t refcnt;
> > +
> > +	/* Lock to modify the Directory Table */
> > +	spinlock_t dt_lock;
>
> I suggest you make page-table updates lock-less. Otherwise this lock
> will become a bottle-neck when using the IOMMU through DMA-API.

As far as I understand it, the page table can be accessed concurrently
since the framework doesn't seem to provide any serialization / locking
itself, so shouldn't we have some locks to prevent concurrent accesses?

> > +
> > +static int sun50i_iommu_map(struct iommu_domain *domain, unsigned long iova,
> > +			    phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
> > +{
> > +	struct sun50i_iommu_domain *sun50i_domain = to_sun50i_domain(domain);
> > +	struct sun50i_iommu *iommu = sun50i_domain->iommu;
> > +	u32 pte_index;
> > +	u32 *page_table, *pte_addr;
> > +	unsigned long flags;
> > +	int ret = 0;
> > +
> > +	spin_lock_irqsave(&sun50i_domain->dt_lock, flags);
> > +	page_table = sun50i_dte_get_page_table(sun50i_domain, iova, gfp);
> > +	if (IS_ERR(page_table)) {
> > +		ret = PTR_ERR(page_table);
> > +		goto out;
> > +	}
> > +
> > +	pte_index = sun50i_iova_get_pte_index(iova);
> > +	pte_addr = &page_table[pte_index];
> > +	if (sun50i_pte_is_page_valid(*pte_addr)) {
>
> You can use unlikely() here.
>
> > +		phys_addr_t page_phys = sun50i_pte_get_page_address(*pte_addr);
> > +		dev_err(iommu->dev,
> > +			"iova %pad already mapped to %pa cannot remap to %pa prot: %#x\n",
> > +			&iova, &page_phys, &paddr, prot);
> > +		ret = -EBUSY;
> > +		goto out;
> > +	}
> > +
> > +	*pte_addr = sun50i_mk_pte(paddr, prot);
> > +	sun50i_table_flush(sun50i_domain, pte_addr, 1);
>
> This maps only one page, right? But the function needs to map up to
> 'size' as given in the parameter list.

It does, but pgsize_bitmap is set to 4k only (since that's all the
hardware supports), so map will be called once per page, judging from:

https://elixir.bootlin.com/linux/latest/source/drivers/iommu/iommu.c#L1948

Right?

> > +
> > +	spin_lock_irqsave(&iommu->iommu_lock, flags);
> > +	sun50i_iommu_tlb_invalidate(iommu, iova);
> > +	spin_unlock_irqrestore(&iommu->iommu_lock, flags);
>
> Why is there a need to flush the TLB here? The IOMMU-API provides
> call-backs so that the user of the API can decide when it wants
> to flush the IO/TLB. Such flushes are usually expensive and doing them
> on every map and unmap will cost significant performance.

The vendor driver was doing something along those lines, and I wanted
to be conservative with the cache management and only relax it if we
ran into performance issues, but I'll convert to the iotlb callbacks
then.

Thanks!
Maxime
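
P.S.: A few sketches of what I have in mind, so you can tell me if I'm
off-track.

If by lock-less you mean installing the second-level page tables with an
atomic compare-and-swap instead of taking dt_lock, I guess the directory
table update would look roughly like this. Untested:
sun50i_iova_get_dte_index(), sun50i_dte_is_pt_valid(), sun50i_mk_dte()
and sun50i_dte_get_pt_address() would be the DTE counterparts of the PTE
helpers quoted above, and I'm assuming a dt pointer and a pt_pool
kmem_cache in the driver structures:

static u32 *sun50i_dte_get_page_table(struct sun50i_iommu_domain *sun50i_domain,
				      unsigned long iova, gfp_t gfp)
{
	struct sun50i_iommu *iommu = sun50i_domain->iommu;
	u32 *dte_addr, *page_table;
	u32 old_dte, dte;

	dte_addr = &sun50i_domain->dt[sun50i_iova_get_dte_index(iova)];
	dte = *dte_addr;
	if (sun50i_dte_is_pt_valid(dte))
		goto done;

	page_table = kmem_cache_zalloc(iommu->pt_pool, gfp);
	if (!page_table)
		return ERR_PTR(-ENOMEM);

	dte = sun50i_mk_dte(virt_to_phys(page_table));
	old_dte = cmpxchg(dte_addr, 0, dte);
	if (old_dte) {
		/* Lost the race: another CPU installed a table first. */
		kmem_cache_free(iommu->pt_pool, page_table);
		dte = old_dte;
	} else {
		sun50i_table_flush(sun50i_domain, dte_addr, 1);
	}

done:
	return (u32 *)phys_to_virt(sun50i_dte_get_pt_address(dte));
}

The loser of the cmpxchg race just frees the table it allocated and
reuses the winner's, and the PTE write in map() is a single u32 store,
so the map path wouldn't need dt_lock at all.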
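
For the record, the core code I linked to essentially does the
following (paraphrased from the __iommu_map() loop in
drivers/iommu/iommu.c, from memory, so the details might be off). With
pgsize_bitmap set to SZ_4K, iommu_pgsize() can only ever pick 4k, so
our map callback runs once per page:

	while (size) {
		/* Largest page size allowed by pgsize_bitmap and alignment */
		size_t pgsize = iommu_pgsize(domain, iova | paddr, size);

		ret = ops->map(domain, iova, paddr, pgsize, prot, gfp);
		if (ret)
			break;

		iova += pgsize;
		paddr += pgsize;
		size -= pgsize;
	}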
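
And for the iotlb conversion, I'd drop the invalidation from map() and
unmap() and move it into the callbacks, along these lines. Also
untested: sun50i_iommu_flush_all_tlb() would be a new helper that
invalidates the whole TLB, and flushing everything in .iotlb_sync is
the conservative option until we track ranges through the gather
argument:

static void sun50i_iommu_flush_iotlb_all(struct iommu_domain *domain)
{
	struct sun50i_iommu_domain *sun50i_domain = to_sun50i_domain(domain);
	struct sun50i_iommu *iommu = sun50i_domain->iommu;
	unsigned long flags;

	spin_lock_irqsave(&iommu->iommu_lock, flags);
	sun50i_iommu_flush_all_tlb(iommu);
	spin_unlock_irqrestore(&iommu->iommu_lock, flags);
}

static void sun50i_iommu_iotlb_sync(struct iommu_domain *domain,
				    struct iommu_iotlb_gather *gather)
{
	/* Conservative: ignore the gathered range and flush everything. */
	sun50i_iommu_flush_iotlb_all(domain);
}

static const struct iommu_ops sun50i_iommu_ops = {
	/* .map / .unmap / etc. as before, minus the per-op invalidation */
	.flush_iotlb_all	= sun50i_iommu_flush_iotlb_all,
	.iotlb_sync		= sun50i_iommu_iotlb_sync,
};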