On Wed, 2023-09-27 at 16:31 +0200, Niklas Schnelle wrote:
> On Wed, 2023-09-27 at 15:20 +0200, Niklas Schnelle wrote:
> > On Wed, 2023-09-27 at 13:24 +0200, Niklas Schnelle wrote:
> > > On Wed, 2023-09-27 at 11:55 +0200, Joerg Roedel wrote:
> > > > Hi Niklas,
> > > >
> > > > On Wed, Sep 27, 2023 at 10:55:23AM +0200, Niklas Schnelle wrote:
> > > > > The problem is that something seems to be broken in the
> > > > > iommu/core branch. Regardless of whether I have my DMA API
> > > > > conversion on top or with the base iommu/core branch I can not
> > > > > use ConnectX-4 VFs.
> > > >
> > > > Have you already tried to bisect the issue in the iommu/core
> > > > branch? The result might shed some light on the issue.
> > > >
> > > > Regards,
> > > >
> > > > Joerg
> > >
> > > Hi Joerg,
> > >
> > > Working on it, somehow I must have messed up earlier. It now looks
> > > like it might in fact be caused by my DMA API conversion rebase and
> > > the "s390/pci: Use dma-iommu layer" commit. Maybe there is some
> > > interaction with Jason's patches that I haven't thought about. So
> > > sorry for any wrong blame.
> > >
> > > Thanks,
> > > Niklas
> >
> > Hi,
> >
> > I tracked the problem down from mlx5_core's alloc_cmd_page() via
> > dma_alloc_coherent(), ops->alloc, iommu_dma_alloc_remap(), and
> > __iommu_dma_alloc_noncontiguous() to a failed iommu_dma_alloc_iova().
> > The allocation here is for 4K so nothing crazy.
> >
> > On second look I also noticed:
> >
> > nvme 2007:00:00.0: Using 42-bit DMA addresses
> >
> > for the NVMe that is working. The problem here seems to be that we
> > set iommu_dma_forcedac = true in s390_iommu_probe_finalize() because
> > we currently have a reserved region over the first 4 GiB anyway and
> > so will always use IOVAs larger than that. That however is too late,
> > since iommu_dma_set_pci_32bit_workaround() is already checked in
> > __iommu_probe_device(), which is called just before
> > ops->probe_finalize(). So I moved setting iommu_dma_forcedac = true
> > to zpci_init_iommu() and that gets rid of the notice for the NVMe,
> > but I still get a failure of iommu_dma_alloc_iova() in
> > __iommu_dma_alloc_noncontiguous(). So I'll keep digging.
> >
> > Thanks,
> > Niklas
>
> Ok I think I got it, and this doesn't seem strictly s390x specific; I
> would expect it to happen with iommu.forcedac=1 everywhere.
>
> The reason iommu_dma_alloc_iova() fails seems to be that mlx5_core
> does dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)) in
> mlx5_pci_init()->set_dma_caps(), which happens after it has already
> called mlx5_mdev_init()->mlx5_cmd_init()->alloc_cmd_page(). So for
> the dma_alloc_coherent() in there, dev->coherent_dma_mask is still
> the default DMA_BIT_MASK(32), for which we can't find an IOVA because
> we simply don't have IOVAs below 4 GiB. Not entirely sure what caused
> this not to be enforced before.
>
> Thanks,
> Niklas

Ok, another update. On trying it out again, this problem also occurs
when applying this v12 on top of v6.6-rc3. Also, unlike my prior
thinking, it probably doesn't occur with iommu.forcedac=1, since that
still allows IOVAs below 4 GiB and we might be the only ones who don't
support those. From my point of view this sounds like an mlx5_core
issue: they really should call dma_set_mask_and_coherent() before
their first call to dma_alloc_coherent(), not after.
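To illustrate the ordering, here is a minimal, hypothetical sketch --
not the actual mlx5_core code; the function names are made up for
illustration, and only dma_set_mask_and_coherent() and
dma_alloc_coherent() are the real DMA API calls:

#include <linux/dma-mapping.h>
#include <linux/pci.h>

/*
 * Broken order: allocate while dev->coherent_dma_mask is still the
 * 32-bit default, then widen the mask. On s390x, where no IOVAs below
 * 4 GiB exist, the dma_alloc_coherent() here fails.
 */
static int broken_order(struct pci_dev *pdev, dma_addr_t *dma_handle)
{
	void *page = dma_alloc_coherent(&pdev->dev, PAGE_SIZE,
					dma_handle, GFP_KERNEL);

	if (!page)
		return -ENOMEM; /* this is what we hit */
	return dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
}

/*
 * Fixed order: widen the mask first, then allocate. Now the IOMMU DMA
 * layer may pick an IOVA anywhere in the 64-bit range.
 */
static int fixed_order(struct pci_dev *pdev, dma_addr_t *dma_handle)
{
	int err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));

	if (err)
		return err;
	return dma_alloc_coherent(&pdev->dev, PAGE_SIZE, dma_handle,
				  GFP_KERNEL) ? 0 : -ENOMEM;
}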
So I guess I'll send a v13 of this series rebased on iommu/core and
with an additional mlx5 patch, and then let's hope we can get that
merged in a way that doesn't leave us with broken ConnectX VFs for
too long.

Thanks,
Niklas