Hi Marek, On Sun, Mar 17, 2019 at 12:04 AM Marek Vasut <marek.vasut@xxxxxxxxx> wrote: > On 3/16/19 10:25 PM, Marek Vasut wrote: > > On 3/13/19 7:30 PM, Christoph Hellwig wrote: > >> On Sat, Mar 09, 2019 at 12:23:15AM +0100, Marek Vasut wrote: > >>> On 3/8/19 8:18 AM, Christoph Hellwig wrote: > >>>> On Thu, Mar 07, 2019 at 12:14:06PM +0100, Marek Vasut wrote: > >>>>>> Right, but whoever *interprets* the device masks after the driver has > >>>>>> overridden them should be taking the (smaller) bus mask into account as > >>>>>> well, so the question is where is *that* not being done correctly? > >>>>> > >>>>> Do you have a hint where I should look for that ? > >>>> > >>>> If this a 32-bit ARM platform it might the complete lack of support > >>>> for bus_dma_mask in arch/arm/mm/dma-mapping.c.. > >>> > >>> It's an ARM 64bit platform, just the PCIe controller is limited to 32bit > >>> address range, so the devices on the PCIe bus cannot read the host's > >>> DRAM above the 32bit limit. > >> > >> arm64 should take the mask into account both for the swiotlb and > >> iommu case. What are the exact symptoms you see? > > > > With the nvme, the device is recognized, but cannot be used. > > It boils down to PCI BAR access being possible, since that's all below > > the 32bit boundary, but when the device tries to do any sort of DMA, > > that transfer returns nonsense data. > > > > But when I call dma_set_mask_and_coherent(dev->dev, DMA_BIT_MASK(32) in > > the affected driver (thus far I tried this nvme, xhci-pci and ahci-pci > > drivers), it all starts to work fine. > > > > Could it be that the driver overwrites the (coherent_)dma_mask and > > that's why the swiotlb/iommu code cannot take this into account ? > > > >> Does it involve > >> swiotlb not kicking in, or iommu issues? > > > > How can I check ? I added printks into arch/arm64/mm/dma-mapping.c and > > drivers/iommu/dma-iommu.c , but I suspect I need to look elsewhere. > > Digging further ... > > drivers/nvme/host/pci.c nvme_map_data() calls dma_map_sg_attrs() and the > resulting sglist contains entry with >32bit PA. This is because > dma_map_sg_attrs() calls dma_direct_map_sg(), which in turn calls > dma_direct_map_sg(), then dma_direct_map_page() and that's where it goes > weird. > > dma_direct_map_page() does a dma_direct_possible() check before > triggering swiotlb_map(). The check succeeds, so the later isn't executed. > > dma_direct_possible() calls dma_capable() with dev->dma_mask = > DMA_BIT_MASK(64) and dev->dma_bus_mask = 0, so > min_not_zero(*dev->dma_mask, dev->bus_dma_mask) returns DMA_BIT_MASK(64). > > Surely enough, if I hack dma_direct_possible() to return 0, > swiotlb_map() kicks in and the nvme driver starts working fine. > > I presume the question here is, why is dev->bus_dma_mask = 0 ? Because that's the default, and almost no code overrides that? $ git grep "\<bus_dma_mask =" arch/mips/pci/fixup-sb1250.c: dev->dev.bus_dma_mask = DMA_BIT_MASK(32); arch/x86/kernel/pci-dma.c: pdev->dev.bus_dma_mask = DMA_BIT_MASK(32); drivers/acpi/arm64/iort.c: dev->bus_dma_mask = mask; drivers/of/device.c: dev->bus_dma_mask = mask; dev is the nvme PCI device, I assume? So you can ignore the last match. The first two seem to be related to platforms that cannot do >32 bit DMA on PCI. So that's a hint on how to fix this... Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds