On 10/12/24 15:32, Manivannan Sadhasivam wrote:
> On Fri, Oct 11, 2024 at 10:07:30AM +0900, Damien Le Moal wrote:
>> On 10/10/24 23:36, Manivannan Sadhasivam wrote:
>>> On Mon, Oct 07, 2024 at 01:03:15PM +0900, Damien Le Moal wrote:
>>>> Some endpoint controllers have requirements on the alignment of the
>>>> controller physical memory address that must be used to map a RC PCI
>>>> address region. For instance, the rockchip endpoint controller uses
>>>> at most the lower 20 bits of a physical memory address region as the
>>>> lower bits of an RC PCI address. For mapping a PCI address region of
>>>> size bytes starting from pci_addr, the exact number of address bits
>>>> used is the number of address bits changing in the address range
>>>> [pci_addr..pci_addr + size - 1].
>>>>
>>>> For this example, this creates the following constraints:
>>>> 1) The offset into the controller physical memory allocated for a
>>>>    mapping depends on the mapping size *and* the starting PCI address
>>>>    for the mapping.
>>>> 2) A mapping size cannot exceed the controller windows size (1MB) minus
>>>>    the offset needed into the allocated physical memory, which can end
>>>>    up being a smaller size than the desired mapping size.
>>>>
>>>> Handling these constraints independently of the controller being used
>>>> in an endpoint function driver is not possible with the current EPC
>>>> API as only the ->align field in struct pci_epc_features is provided
>>>> and used for BAR (inbound ATU mappings) mapping. A new API is needed
>>>> for function drivers to discover mapping constraints and handle
>>>> non-static requirements based on the RC PCI address range to access.
>>>>
>>>> Introduce the function pci_epc_map_align() and the endpoint controller
>>>> operation ->map_align to allow endpoint function drivers to obtain the
>>>> size and the offset into a controller address region that must be
>>>> allocated and mapped to access an RC PCI address region.
>>>> The size of the mapping provided by pci_epc_map_align() can then be used
>>>> as the size argument for the function pci_epc_mem_alloc_addr().
>>>> The offset into the allocated controller memory provided can be used to
>>>> correctly handle data transfers.
>>>>
>>>> For endpoint controllers that have PCI address alignment constraints,
>>>> pci_epc_map_align() may indicate upon return an effective PCI address
>>>> region mapping size that is smaller (but not 0) than the requested PCI
>>>> address region size. For such case, an endpoint function driver must
>>>> handle data accesses over the desired PCI address range in fragments,
>>>> by repeatedly using pci_epc_map_align() over the PCI address range.
>>>>
>>>> The controller operation ->map_align is optional: controllers that do
>>>> not have any alignment constraints for mapping a RC PCI address region
>>>> do not need to implement this operation. For such controllers,
>>>> pci_epc_map_align() always returns the mapping size as equal to the
>>>> requested size of the PCI region and an offset equal to 0.
>>>>
>>>> The new structure struct pci_epc_map is introduced to represent a
>>>> mapping start PCI address, mapping effective size, the size and offset
>>>> into the controller memory needed for mapping the PCI address region as
>>>> well as the physical and virtual CPU addresses of the mapping (phys_base
>>>> and virt_base fields). For convenience, the physical and virtual CPU
>>>> addresses within that mapping to access the target RC PCI address region
>>>> are also provided (phys_addr and virt_addr fields).
>>>>
>>>
>>> I'm fine with the concept of this patch, but I don't get why you need an API for
>>> this and not just a callback to be used in the pci_epc_mem_{map/unmap} APIs.
>>> Furthermore, I don't see an user of this API (in 3 series you've sent out so
>>> far). Let me know if I failed to spot it.
>>>
>>> Also, the API name pci_epc_map_align() sounds like it does the mapping, but it
>>> doesn't.
>>> So I'd not have it exposed as an API at all.
>>
>> OK. Fine with me. I will move this inside pci_epc_mem_map(). But note that
>> without this function, pci_epc_mem_alloc_addr() and pci_epc_map_addr() are
>> totally useless for EP controllers that have a mapping alignment requirement,
>> which without the pci_epc_map_align() function, an endpoint function driver
>> cannot discover *at all* currently. That does not fix the overall API of EPC...
>>
>
> Not at all. EPF drivers still can use "epf_mhi->epc_features->align" to discover
> the alignment requirement and calculate the offset on their own (please see
> pci-epf-mhi). But I'm not in favor of that approach since the APIs need to do
> that job and that's why I like your pci_epc_mem_map() API.

That is *not* correct, at least in general. For two reasons:
1) epc_features->align defines alignment for BARs, that is, inbound windows
   memory. It is not supposed to be about the outbound windows for mapping
   PCI address space for doing mmio or DMA. Some controllers may have the
   same alignment constraint for both ib and ob, in which case things will
   work, but that is "just being lucky". I spent weeks with the RK3399
   understanding that I was not lucky with that one :)
2) A static alignment constraint does not work for all controllers. Cf. my
   series fixing the RK3399, where I think I clearly explain that the
   alignment of a mapping depends on the PCI address AND the size being
   mapped, as both determine the number of address bits changing within the
   PCI address range to access. Using a fixed boundary alignment for the
   RK3399 simply does not work at all. An epf cannot know that by simply
   looking at a fixed value...

What you said may be true for the mhi epf, because it requires special
hardware that has a simple fixed alignment constraint. ntb and vntb are also
coded assuming such a constraint.
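
To make point 2) concrete, here is a minimal userspace sketch (not the
rockchip driver code) of a controller that passes only the changing address
bits, up to 20, through a 1MB outbound window, as in the RK3399 description
quoted above. The names ob_map, ob_map_align(), changing_bits() and the
WINDOW_* macros are hypothetical and only for illustration. Any minimum
number of bits that real hardware may impose is not modeled, and the sketch
assumes pci_addr + size does not overflow 64 bits.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_BITS 20u                         /* assumed: 1MB outbound window */
#define WINDOW_SIZE ((uint64_t)1 << WINDOW_BITS)

/* Hypothetical result type, not the kernel's struct pci_epc_map. */
struct ob_map {
	unsigned int num_bits;  /* address bits changing over the PCI range */
	uint64_t offset;        /* offset into the allocated controller memory */
	uint64_t map_size;      /* bytes of the request this mapping can cover */
};

/* Number of low-order bits that differ between start and end (0 if equal). */
static unsigned int changing_bits(uint64_t start, uint64_t end)
{
	uint64_t diff = start ^ end;
	unsigned int n = 0;

	while (diff) {
		n++;
		diff >>= 1;
	}
	return n;
}

/* Compute the offset and effective size for mapping [pci_addr, pci_addr + size - 1]. */
static struct ob_map ob_map_align(uint64_t pci_addr, uint64_t size)
{
	struct ob_map m;

	assert(size > 0);

	m.num_bits = changing_bits(pci_addr, pci_addr + size - 1);
	if (m.num_bits > WINDOW_BITS)
		m.num_bits = WINDOW_BITS;

	/* The low num_bits of the PCI address become the offset in the window. */
	m.offset = pci_addr & (((uint64_t)1 << m.num_bits) - 1);

	/* The mapping cannot extend past the end of the window. */
	m.map_size = size;
	if (m.offset + m.map_size > WINDOW_SIZE)
		m.map_size = WINDOW_SIZE - m.offset;

	return m;
}
```

With this model, a 4KB mapping at PCI address 0x1000 needs offset 0, the same
size at 0x1800 needs offset 0x1800, and an 8KB mapping at 0x1000 needs offset
0x1000. A 4KB request at 0xfff00 crosses the window boundary, so only 0x100
bytes can be mapped and the remainder has to go through a second mapping. No
single fixed alignment value describes all of these cases.
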
So if I try to run ntb or vntb on the RK3399, it will likely not work
(actually it may, but out of sheer luck given that the addresses that will be
mapped will likely be aligned to 1MB, that is, the memory window size).
Developing the nvme epf driver, where I was seeing completely random PCI
addresses for command buffers, I could make things work only after developing
pci_epc_mem_map() together with a controller operation (.get_mem_map()) that
tells the mapping for every address to map.

>
>> By not having pci_epc_map_align(), pci_epc_mem_alloc_addr() and
>> pci_epc_map_addr() remain broken, but the introduction of pci_epc_mem_map() does
>> provide a working solution for the general case.
>>
>> So I think we will still need to do something about this bad state of the API later.
>>
>
> We can always rework the APIs to incorporate the alignment requirement.

See above. An API that advertises a simple alignment requirement will not work
for all controllers...

But anyway, given that we are not getting any problem report, people using the
EP framework likely have setups that combine controllers and endpoint drivers
playing well together. So I do not think there is any urgency about the API.
I really do need this series for the nvme endpoint driver though, as a first
step for the API improvement.

-- 
Damien Le Moal
Western Digital Research