On 10/03/2017 15:06, David Laight wrote:

> Robin Murphy wrote:
>
>> On 09/03/17 23:43, Mason wrote:
>>
>>> I think I'm making progress, in that I now have a better
>>> idea of what I don't understand. So I'm able to ask
>>> (hopefully) less vague questions.
>>>
>>> Take the USB3 PCIe adapter I've been testing with. At some
>>> point during init, the XHCI driver requests some memory
>>> (via kmalloc?) in order to exchange data with the host, right?
>>>
>>> On my SoC, the RAM used by Linux lives at physical range
>>> [0x8000_0000, 0x8800_0000[ => 128 MB
>>>
>>> How does the XHCI driver make the adapter aware of where
>>> it can scribble data? The XHCI driver has no notion that
>>> the device is behind a bus, does it?
>>>
>>> At some point, the physical addresses must be converted
>>> to PCI bus addresses, right? Is it computed by subtracting
>>> the offset defined in the DT?
>
> The driver should call dma_alloc_coherent() which returns both the
> kernel virtual address and the address the device (xhci controller)
> has to use to access it.
> The cpu physical address is irrelevant (although it might be
> calculated in the middle somewhere).

Thank you for that missing piece of the puzzle.
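To make the distinction between CPU physical addresses and bus (DMA)
addresses concrete, here is a minimal userspace sketch. It is not kernel
code and not how dma_alloc_coherent() is implemented; it only assumes the
simplest possible case, a single RAM window seen at a fixed offset from
the bus side. The struct dma_window type, the cpu_to_bus() helper and any
bus_base value are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical model of one inbound window: the same RAM is visible
 * at cpu_base from the CPU and at bus_base from a PCI device.
 */
struct dma_window {
    uint64_t cpu_base;  /* start of RAM as seen by the CPU */
    uint64_t bus_base;  /* start of the same RAM as seen from the bus */
    uint64_t size;      /* window size in bytes */
};

/*
 * Translate a CPU physical address into the address a device would
 * have to put on the bus. Returns 0 on success and stores the result
 * in *bus, or returns -1 if the address lies outside the window.
 */
static int cpu_to_bus(const struct dma_window *w, uint64_t cpu,
                      uint64_t *bus)
{
    if (cpu < w->cpu_base || cpu - w->cpu_base >= w->size)
        return -1;
    *bus = cpu - w->cpu_base + w->bus_base;
    return 0;
}
```

With cpu_base = 0x80000000, size = 0x08000000 (the 128 MB above) and an
assumed bus_base of 0, CPU address 0x80001000 would map to bus address
0x1000; with bus_base equal to cpu_base the mapping is 1:1. The driver
never does this arithmetic itself, it just uses the DMA address handed
back by the DMA API.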
I see some relevant action in drivers/usb/host/xhci-mem.c

And I now see this log:

[ 2.499320] xhci_hcd 0000:01:00.0: // Device context base array address = 0x8e07e000 (DMA), d0855000 (virt)
[ 2.509156] xhci_hcd 0000:01:00.0: Allocated command ring at cfb04200
[ 2.515640] xhci_hcd 0000:01:00.0: First segment DMA is 0x8e07f000
[ 2.521863] xhci_hcd 0000:01:00.0: // Setting command ring address to 0x20
[ 2.528786] xhci_hcd 0000:01:00.0: // xHC command ring deq ptr low bits + flags = @00000000
[ 2.537188] xhci_hcd 0000:01:00.0: // xHC command ring deq ptr high bits = @00000000
[ 2.545002] xhci_hcd 0000:01:00.0: // Doorbell array is located at offset 0x800 from cap regs base addr
[ 2.554455] xhci_hcd 0000:01:00.0: // xHCI capability registers at d0852000:
[ 2.561550] xhci_hcd 0000:01:00.0: // @d0852000 = 0x1000020 (CAPLENGTH AND HCIVERSION)

I believe 0x8e07e000 is a CPU address, not a PCI bus address.

>>> Then suppose the USB3 card wants to write to an address
>>> in RAM. It sends a packet on the PCIe bus, targeting
>>> the PCI bus address of that RAM, right? Is this address
>>> supposed to be in BAR0 of the root complex? I guess not,
>>> since Bjorn said that it was unusual for a RC to have
>>> a BAR at all. So I'll hand-wave, and decree that, by some
>>> protocol magic, the packet arrives at the PCIe controller.
>>> And this controller knows to forward this write request
>>> over the memory bus. Does that look about right?
>>
>> Generally, yes - if an area of memory space *is* claimed by a BAR, then
>> another PCI device accessing that would be treated as peer-to-peer DMA,
>> which may or may not be allowed (or supported at all).
>
> So PCIe addresses that refer to the host memory addresses are
> just forwarded to the memory subsystem.
> In practise this is almost everything.

My RC drops packets not targeting its BAR0.

> The only other PCIe writes the host will see are likely to be associated
> with MSI and MSI-X interrupt support.
Rev 1 of the PCIe controller is supposed to forward MSI doorbell
writes over the global bus to the PCIe controller's MMIO register.

> Some PCIe root complexes support peer-to-peer writes but not reads.
> Writes are normally 'posted' (so are 'fire and forget'); reads need the
> completion TLP (containing the data) sent back - all hard and difficult.
>
>> For mem space
>> which isn't claimed by BARs, it's up to the RC to decide what to do. As
>> a concrete example (which might possibly be relevant) the PLDA XR3-AXI
>> IP which we have in the ARM Juno SoC has the ATR_PCIE_WINx registers in
>> its root port configuration block that control what ranges of mem space
>> are mapped to the external AXI master interface and how.
>>
>>> My problem is that, in the current implementation of the
>>> PCIe controller, the USB device that wants to write to
>>> memory is supposed to target BAR0 of the RC.
>>
>> That doesn't sound right at all. If the RC has a BAR, I'd expect it to
>> be for poking the guts of the RC device itself (since this prompted me
>> to go and compare, I see the Juno RC does indeed have its own enigmatic
>> 16KB BAR, which reads as ever-changing random junk; no idea what that's
>> about).
>>
>>> Since my mem space is limited to 256 MB, then BAR0 is
>>> limited to 256 MB (or even 128 MB, since I also need
>>> to map the device's BAR into the same mem space).
>>
>> Your window into mem space *from the CPU's point of view* is limited to
>> 256MB. The relationship between mem space and the system (AXI) memory
>> map from the point of view of PCI devices is a separate issue; if it's
>> configurable at all, it probably makes sense to have the firmware set an
>> outbound window to at least cover DRAM 1:1, then forget about it (this
>> is essentially what Juno UEFI does, for example).
>
> So you have 128MB (max) of system memory that has cpu physical
> addresses 0x80000000 upwards.
> I'd expect it all to be accessible from any PCIe card at some PCIe
> address, it might be at address 0, 0x80000000 or any other offset.
>
> I don't know which DT entry controls that offset.

This is a crucial point, I think.

Regards.