On Sun, Feb 15, 2009 at 07:20:52AM +0800, Grant Grundler wrote: > On Thu, Feb 12, 2009 at 08:50:38PM +0800, Yu Zhao wrote: > > Support device IOTLB (i.e. ATS) for both native and KVM environments. > > + > > +static void iommu_enable_dev_iotlb(struct device_domain_info *info) > > +{ > > + pci_enable_ats(info->dev, VTD_PAGE_SHIFT); > > +} > > Why is a static function defined that calls a global function? There would be some extra steps to do before VT-d enables ATS in the future, so this wrapper makes code expandable later. > > > + > > +static void iommu_disable_dev_iotlb(struct device_domain_info *info) > > +{ > > + if (info->dev && pci_ats_enabled(info->dev)) > > + pci_disable_ats(info->dev); > > +} > > ditto. pci_disable_ats() should be able to handle the case when > "info->dev" is NULL and will know if ATS is enabled. The info->dev could be NULL only because the VT-d code makes it so. AMD an IBM IOMMU may not have this requirement. If we make pci_disable_ats() accept NULL pci_dev, it would fail to catch some errors like using pci_disable_ats without calling pci_enable_ats before. > > I think both of these functions can be dropped and just directly call pci_*_ats(). > > > + > > +static void iommu_flush_dev_iotlb(struct dmar_domain *domain, > > + u64 addr, unsigned mask) > > +{ > > + int rc; > > + u16 sid, qdep; > > + unsigned long flags; > > + struct device_domain_info *info; > > + > > + spin_lock_irqsave(&device_domain_lock, flags); > > + list_for_each_entry(info, &domain->devices, link) { > > Would it be possible to define a single domain for each PCI device? > Or does "domain" represent an IOMMU? > Sorry, I forgot...I'm sure someone has mentioned this the past. A domain represents one translation mapping. For device used by the host, there is one domain per device. Device assigned to a guest shares one domain. > > I want to point out list_for_each_entry() is effectively a "nested loop". > iommu_flush_dev_iotlb() will get called alot from flush_unmaps(). > Perhaps do the lookup once there and pass that as a parameter? > I don't know if that is feasible. But if this is a very frequently > used code path, every CPU cycle counts. iommu_flush_dev_iotlb() is only used to flush the devices used in the host, which means there is always one entry on the list. > > > > + if (!info->dev || !pci_ats_enabled(info->dev)) > > + continue; > > + > > + sid = info->bus << 8 | info->devfn; > > + qdep = pci_ats_queue_depth(info->dev); > > re Matthew Wilcox's comment - looks like caching ats_queue_depth > is appropriate. Yes, it's cached as of v3. > > + rc = qi_flush_dev_iotlb(info->iommu, sid, qdep, addr, mask); > > + if (rc) > > + printk(KERN_ERR "IOMMU: flush device IOTLB failed\n"); > > Can this be a dev_printk please? Yes, will replace it with dev_err(). > Perhaps in general review the use of "printk" so when errors are reported, > users will know which devices might be affected by the failure. > If more than a few printk's should be "converted" to dev_printk(), I'd > be happy if that were a seperate patch (submitted later). > > > > pr_debug("Set context mapping for %02x:%02x.%d\n", > > bus, PCI_SLOT(devfn), PCI_FUNC(devfn)); > > @@ -1534,7 +1608,11 @@ static int domain_context_mapping_one(struct dmar_domain *domain, > > context_set_domain_id(context, id); > > context_set_address_width(context, iommu->agaw); > > context_set_address_root(context, virt_to_phys(pgd)); > > - context_set_translation_type(context, CONTEXT_TT_MULTI_LEVEL); > > + info = iommu_support_dev_iotlb(domain, bus, devfn); > > + if (info) > > + context_set_translation_type(context, CONTEXT_TT_DEV_IOTLB); > > + else > > + context_set_translation_type(context, CONTEXT_TT_MULTI_LEVEL); > > Would it be ok to rewrite this as: > + context_set_translation_type(context, > + info ? CONTEXT_TT_DEV_IOTLB : CONTEXT_TT_MULTI_LEVEL); Yes, this one looks better. > > > context_set_fault_enable(context); > > context_set_present(context); > > domain_flush_cache(domain, context, sizeof(*context)); > > @@ -1546,6 +1624,8 @@ static int domain_context_mapping_one(struct dmar_domain *domain, > > iommu_flush_write_buffer(iommu); > > else > > iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_DSI_FLUSH, 0); > > Adding a blank line here would make this more readable. > (AFAIK, not required by coding style, just my opinion.) Yes, I prefer a bank line here too, somehow I missed it. > > > + if (info) > > + iommu_enable_dev_iotlb(info); > > Could iommu_enable_dev_iotlb() (or pci_enable_ats()) check if "info" is NULL? > Then this would just be a simple function call. > > And it would be consistent with usage of iommu_disable_dev_iotlb(). Yes, good idea. Thanks a lot for reviewing it! Yu -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html