On 3/17/19 11:29 AM, Geert Uytterhoeven wrote: > Hi Marek, Hi, > On Sun, Mar 17, 2019 at 12:04 AM Marek Vasut <marek.vasut@xxxxxxxxx> wrote: >> On 3/16/19 10:25 PM, Marek Vasut wrote: >>> On 3/13/19 7:30 PM, Christoph Hellwig wrote: >>>> On Sat, Mar 09, 2019 at 12:23:15AM +0100, Marek Vasut wrote: >>>>> On 3/8/19 8:18 AM, Christoph Hellwig wrote: >>>>>> On Thu, Mar 07, 2019 at 12:14:06PM +0100, Marek Vasut wrote: >>>>>>>> Right, but whoever *interprets* the device masks after the driver has >>>>>>>> overridden them should be taking the (smaller) bus mask into account as >>>>>>>> well, so the question is where is *that* not being done correctly? >>>>>>> >>>>>>> Do you have a hint where I should look for that ? >>>>>> >>>>>> If this a 32-bit ARM platform it might the complete lack of support >>>>>> for bus_dma_mask in arch/arm/mm/dma-mapping.c.. >>>>> >>>>> It's an ARM 64bit platform, just the PCIe controller is limited to 32bit >>>>> address range, so the devices on the PCIe bus cannot read the host's >>>>> DRAM above the 32bit limit. >>>> >>>> arm64 should take the mask into account both for the swiotlb and >>>> iommu case. What are the exact symptoms you see? >>> >>> With the nvme, the device is recognized, but cannot be used. >>> It boils down to PCI BAR access being possible, since that's all below >>> the 32bit boundary, but when the device tries to do any sort of DMA, >>> that transfer returns nonsense data. >>> >>> But when I call dma_set_mask_and_coherent(dev->dev, DMA_BIT_MASK(32) in >>> the affected driver (thus far I tried this nvme, xhci-pci and ahci-pci >>> drivers), it all starts to work fine. >>> >>> Could it be that the driver overwrites the (coherent_)dma_mask and >>> that's why the swiotlb/iommu code cannot take this into account ? >>> >>>> Does it involve >>>> swiotlb not kicking in, or iommu issues? >>> >>> How can I check ? I added printks into arch/arm64/mm/dma-mapping.c and >>> drivers/iommu/dma-iommu.c , but I suspect I need to look elsewhere. >> >> Digging further ... >> >> drivers/nvme/host/pci.c nvme_map_data() calls dma_map_sg_attrs() and the >> resulting sglist contains entry with >32bit PA. This is because >> dma_map_sg_attrs() calls dma_direct_map_sg(), which in turn calls >> dma_direct_map_sg(), then dma_direct_map_page() and that's where it goes >> weird. >> >> dma_direct_map_page() does a dma_direct_possible() check before >> triggering swiotlb_map(). The check succeeds, so the later isn't executed. >> >> dma_direct_possible() calls dma_capable() with dev->dma_mask = >> DMA_BIT_MASK(64) and dev->dma_bus_mask = 0, so >> min_not_zero(*dev->dma_mask, dev->bus_dma_mask) returns DMA_BIT_MASK(64). >> >> Surely enough, if I hack dma_direct_possible() to return 0, >> swiotlb_map() kicks in and the nvme driver starts working fine. >> >> I presume the question here is, why is dev->bus_dma_mask = 0 ? > > Because that's the default, and almost no code overrides that? But shouldn't drivers/of/device.c set that for the PCIe controller ? > $ git grep "\<bus_dma_mask =" > arch/mips/pci/fixup-sb1250.c: dev->dev.bus_dma_mask = > DMA_BIT_MASK(32); > arch/x86/kernel/pci-dma.c: pdev->dev.bus_dma_mask = DMA_BIT_MASK(32); > drivers/acpi/arm64/iort.c: dev->bus_dma_mask = mask; > drivers/of/device.c: dev->bus_dma_mask = mask; > > dev is the nvme PCI device, I assume? So you can ignore the last match. > > The first two seem to be related to platforms that cannot do >32 bit DMA > on PCI. So that's a hint on how to fix this... That doesn't feel right, it's not a platform limitation, but a PCIe IP limitation, so this fix should live somewhere in drivers/ I think ? -- Best regards, Marek Vasut