On 2022-04-21 18:35, Serge Semin wrote:
On Thu, Apr 21, 2022 at 04:45:36PM +0200, Christoph Hellwig wrote:
On Wed, Apr 20, 2022 at 11:55:38AM +0300, Serge Semin wrote:
On Wed, Apr 20, 2022 at 10:47:46AM +0200, Christoph Hellwig wrote:
I can't really comment on the dma-ranges exclusion for P2P mappings,
as that predates my involvement, however:
My example wasn't specifically about PCIe P2P transfers, but about PCIe
devices reaching some platform devices over the system interconnect
bus.
So strike PCIe, but this is our definition of Peer to Peer accesses.
What if I have the physical address of a platform device and want to
have that device accessed by a PCIe peripheral device?
dma_map_resource() seemed very much suitable for that, but considering
what you say, it isn't.
dma_map_resource is the right thing for that. But the physical
addresses of MMIO ranges in the platform device should not have struct
pages allocated for them, and thus the other dma_map_* APIs should not
work on them to start with.
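
To illustrate why: the page-based path ultimately resolves a struct
page for the CPU address, which only exists for page-backed memory.
Here is a paraphrase of dma_map_single_attrs() from
include/linux/dma-mapping.h (around v5.18, warning text trimmed):

static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
		size_t size, enum dma_data_direction dir, unsigned long attrs)
{
	/* DMA must never operate on areas that might be remapped */
	if (dev_WARN_ONCE(dev, is_vmalloc_addr(ptr),
			  "rejecting DMA map of vmalloc memory\n"))
		return DMA_MAPPING_ERROR;

	debug_dma_map_single(dev, ptr, size);

	/* virt_to_page() is only meaningful for page-backed (lowmem)
	 * memory, so MMIO ranges have no business here */
	return dma_map_page_attrs(dev, virt_to_page(ptr),
				  offset_in_page(ptr), size, dir, attrs);
}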
The problem is that dma_map_resource() won't work for that, while
presumably the dma_map_sg()-like methods would (after some hacking with
the physical address, but anyway). Consider the system diagram in my
previous email. Here is what I would do to initialize a DMA
transaction between a platform device and a PCIe peripheral device:
1) struct resource *rsc = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);
2) dma_addr_t dar = dma_map_resource(&pci_dev->dev, rsc->start, resource_size(rsc),
                                     DMA_FROM_DEVICE, 0);
3) dma_addr_t sar;
   void *tmp = dma_alloc_coherent(&pci_dev->dev, PAGE_SIZE, &sar, GFP_KERNEL);
   memset(tmp, 0xaa, PAGE_SIZE);
4) PCIe device: DMA.DAR=dar, DMA.SAR=sar. RUN.
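
For concreteness, here is how those steps could be wired together in
one place. The dma_regs layout and start_mem_to_mmio_xfer() are
hypothetical stand-ins for whatever DMA engine the PCIe peripheral
actually implements; error handling is abbreviated:

#include <linux/bits.h>
#include <linux/dma-mapping.h>
#include <linux/io.h>
#include <linux/pci.h>
#include <linux/platform_device.h>

/* Hypothetical register file of the PCIe device's embedded DMA engine */
struct dma_regs {
	u64 sar;	/* source address */
	u64 dar;	/* destination address */
	u32 ctrl;	/* bit 0: run */
};

static int start_mem_to_mmio_xfer(struct pci_dev *pci_dev,
				  struct platform_device *plat_dev,
				  struct dma_regs __iomem *regs)
{
	struct resource *rsc;
	dma_addr_t dar, sar;
	void *tmp;

	/* 1) CPU physical address of the platform device MMIO window */
	rsc = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);
	if (!rsc)
		return -ENODEV;

	/* 2) Map it for the PCIe device; this is the step that currently
	 *    skips the pci_dev->dma_range_map-based translation */
	dar = dma_map_resource(&pci_dev->dev, rsc->start, resource_size(rsc),
			       DMA_FROM_DEVICE, 0);
	if (dma_mapping_error(&pci_dev->dev, dar))
		return -ENOMEM;

	/* 3) Source buffer in system memory; here the returned DMA handle
	 *    does have the dma-ranges offset applied */
	tmp = dma_alloc_coherent(&pci_dev->dev, PAGE_SIZE, &sar, GFP_KERNEL);
	if (!tmp) {
		dma_unmap_resource(&pci_dev->dev, dar, resource_size(rsc),
				   DMA_FROM_DEVICE, 0);
		return -ENOMEM;
	}
	memset(tmp, 0xaa, PAGE_SIZE);

	/* 4) Program the device DMA engine and kick off the transfer */
	writeq(sar, &regs->sar);
	writeq(dar, &regs->dar);
	writel(BIT(0), &regs->ctrl);

	return 0;
}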
If there is no dma-ranges property specified in the PCIe Host
controller DT-node, the PCIe peripheral devices will see the rest of
the system memory as is (no offsets or remapping). But a dma-ranges
property combined with certain system settings may affect the address
translation of the PCIe MRd/MWr TLPs, including addresses targeting the
MMIO space. In that case the mapping performed in step 2) will return a
wrong DMA address, since the corresponding dma_direct_map_resource()
just returns the passed physical address, missing the
'pci_dev->dma_range_map'-based translation performed in
translate_phys_to_dma().
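
That is, the resource-mapping path amounts to an identity translation
(paraphrased from kernel/dma/direct.c around v5.18, error reporting
trimmed):

dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr,
		size_t size, enum dma_data_direction dir, unsigned long attrs)
{
	/* the physical address is used verbatim; dev->dma_range_map is
	 * never consulted */
	dma_addr_t dma_addr = paddr;

	if (unlikely(!dma_capable(dev, dma_addr, size, false)))
		return DMA_MAPPING_ERROR;

	return dma_addr;
}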
Note the mapping in step 3) works correctly because it calls
translate_phys_to_dma() of the direct DMA interface, thus taking the
PCIe dma-ranges into account.
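
For comparison, the helper used by the page-backed paths does walk the
dma_range_map (paraphrased from include/linux/dma-direct.h around
v5.18):

static inline dma_addr_t translate_phys_to_dma(struct device *dev,
		phys_addr_t paddr)
{
	const struct bus_dma_region *m;

	/* subtract the bus offset of whichever dma-ranges region matches */
	for (m = dev->dma_range_map; m->size; m++)
		if (paddr >= m->cpu_start && paddr - m->cpu_start < m->size)
			return (dma_addr_t)paddr - m->offset;

	/* make sure dma_capable() fails when no translation is available */
	return DMA_MAPPING_ERROR;
}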
To sum up, as I see it, either restricting dma_map_resource() to
mapping just the intra-bus addresses was wrong, or there must be some
additional mapping infrastructure for the systems described. Though I
don't see a way dma_map_resource() could be fixed to suit all of the
cases considered.
FWIW the current semantics of dma_map_resource() are basically just to
insert IOMMU awareness where dmaengine drivers were previously just
casting phys_addr_t to dma_addr_t (or u32, or whatever else they put
into their descriptor/register/etc.). IIRC there was a bit of a question
whether it really belonged in the DMA API at all, since it's not really
a "DMA" operation in the conventional sense, and convenience was the
only real deciding argument. The relevant drivers at the time were not
taking dma_pfn_offset into account when consuming physical addresses
directly, so the new API didn't either.
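
A sketch of the pattern Robin describes, with fifo_phys as a made-up
placeholder for the slave FIFO's physical address: the value went into
the dmaengine slave config as a raw cast, with no offset applied:

	struct dma_slave_config cfg = {
		.direction = DMA_MEM_TO_DEV,
		/* raw phys_addr_t -> dma_addr_t cast; dma_pfn_offset is
		 * simply ignored */
		.dst_addr = (dma_addr_t)fifo_phys,
		.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
	};

	dmaengine_slave_config(chan, &cfg);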
That's just how things got to where they are today. Once again, I'm not
saying that what we have now is necessarily right, or that your change
is necessarily wrong; I just really want to understand specifically
*why* you need to make it, so we can evaluate the risk of possible
breakage either way. Theoretical "if"s aren't really enough.
Robin.