Hi Stephen, On 02/01/19 10:04 PM, Stephen Warren wrote: > On 12/19/18 7:37 AM, Kishon Vijay Abraham I wrote: >> Hi, >> >> On 14/12/18 10:31 PM, Stephen Warren wrote: >>> On 12/11/18 10:23 AM, Stephen Warren wrote: >>>> On 12/10/18 9:36 PM, Kishon Vijay Abraham I wrote: >>>>> Hi, >>>>> >>>>> On 27/11/18 4:39 AM, Stephen Warren wrote: >>>>>> From: Stephen Warren <swarren@xxxxxxxxxx> >>>>>> >>>>>> Some implementations of the DWC PCIe endpoint controller do not allow >>>>>> access to DBI registers until the attached host has started REFCLK, >>>>>> released PERST, and the endpoint driver has initialized clocking of the >>>>>> DBI registers based on that. One such system is NVIDIA's T194 SoC. The >>>>>> PCIe endpoint subsystem and DWC driver currently don't work on such >>>>>> hardware, since they assume that all endpoint configuration can happen >>>>>> at any arbitrary time. >>>>>> >>>>>> Enhance the DWC endpoint driver to support such systems by caching all >>>>>> endpoint configuration in software, and only writing the configuration >>>>>> to hardware once it's been initialized. This is implemented by splitting >>>>>> all endpoint controller ops into two functions; the first which simply >>>>>> records/caches the desired configuration whenever called by the >>>>>> associated function driver and optionally calls the second, and the >>>>>> second which actually programs the configuration into hardware, which >>>>>> may be called either by the first function, or later when it's known >>>>>> that the DBI registers are available. >>>> >>>>>> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c >>>>>> b/drivers/pci/controller/dwc/pcie-designware-ep.c >>>> >>>>>> +void dw_pcie_set_regs_available(struct dw_pcie *pci) >>>>>> +{ >>>>> >>>>> When will this function be invoked? Does the wrapper get an interrupt when >>>>> refclk is enabled where this function will be invoked? >>>> >>>> Yes, there's an IRQ from the HW that indicates when PEXRST is released. I >>>> don't recall right now if this IRQ is something that exists for all DWC >>>> instantiations, or is Tegra-specific. >>>> >>>>>> + struct dw_pcie_ep *ep = &(pci->ep); >>>>>> + int i; >>>>>> + >>>>>> + ep->hw_regs_not_available = false; >>>>> >>>>> This can race with epc_ops. >>>>> >>>>>> + >>>>>> + dw_pcie_ep_write_header_regs(ep); >>>>>> + for_each_set_bit(i, ep->ib_window_map, ep->num_ib_windows) { >>>>>> + dw_pcie_prog_inbound_atu(pci, i, >>>>>> + ep->cached_inbound_atus[i].bar, >>>>>> + ep->cached_inbound_atus[i].cpu_addr, >>>>>> + ep->cached_inbound_atus[i].as_type); >>>>> >>>>> Depending on the context in which this function is invoked, programming >>>>> inbound/outbound ATU can also race with EPC ops. >>>> > >>>>>> + dw_pcie_ep_set_bar_regs(ep, 0, ep->cached_inbound_atus[i].bar); >>>>>> + } >>>>>> + for_each_set_bit(i, ep->ob_window_map, ep->num_ob_windows) >>>>>> + dw_pcie_prog_outbound_atu(pci, i, PCIE_ATU_TYPE_MEM, >>>>>> + ep->cached_outbound_atus[i].addr, >>>>>> + ep->cached_outbound_atus[i].pci_addr, >>>>>> + ep->cached_outbound_atus[i].size); >>>>>> + dw_pcie_dbi_ro_wr_en(pci); >>>>>> + dw_pcie_writew_dbi(pci, PCI_MSI_FLAGS, ep->cached_msi_flags); >>>>>> + dw_pcie_writew_dbi(pci, PCI_MSIX_FLAGS, ep->cached_msix_flags); >>>>>> + dw_pcie_dbi_ro_wr_dis(pci); >>>>> >>>>> IMHO we should add a new epc ops ->epc_init() which indicates if the EPC is >>>>> ready to be initialized or not. Only if the epc_init indicates it's ready >>>>> to be >>>>> initialized, the endpoint function driver should go ahead with further >>>>> initialization. Or else it should wait for a notification from EPC to >>>>> indicate >>>>> when it's ready to be initialized. >>>> >>>> (Did you mean epf op or epc op?) >>>> >>>> I'm not sure how exactly how that would work; do you want the DWC core driver >>>> or the endpoint subsystem to poll that epc op to find out when the HW is >>>> ready to be initialized? Or do you envisage the controller driver still >>>> calling dw_pcie_set_regs_available() (possibly renamed), which in turn calls >>>> ->epc_init() calls for some reason? >>>> >>>> If you don't want to cache the endpoint configuration, perhaps you want: >>>> >>>> a) Endpoint function doesn't pro-actively call the endpoint controller >>>> functions to configure the endpoint. >>>> >>>> b) When endpoint HW is ready, the relevant driver calls pci_epc_ready() (or >>>> whatever name), which lets the core know the HW can be configured. Perhaps >>>> this schedules a work queue item to implement locking to avoid the races you >>>> mentioned. >>>> >>>> c) Endpoint core calls pci_epf_init() which calls epf op ->init(). >>>> >>>> One gotcha with this approach, which the caching approach helps avoid: >>>> >>>> Once PEXRST is released, the system must respond to PCIe enumeration requests >>>> within 50ms. Thus, SW must very quickly respond to the IRQ indicating PEXRST >>>> release and program the endpoint configuration into HW. By caching the >>>> configuration in the DWC driver and immediately/synchronously applying it in >>>> the PEXRST IRQ handler, we reduce the number of steps and amount of code >>>> taken to program the HW, so it should get done pretty quickly. If instead we >>>> call back into the endpoint function driver's ->init() op, we run the risk of >>>> that op doing other stuff besides just calling the endpoint HW configuration >>>> APIs (e.g. perhaps the function driver defers memory buffer allocation or >>>> IOVA programming to that ->init function) which in turns makes it much less >>>> likely the 50ms requirement will be hit. Perhaps we can solve this by naming >>>> the op well and providing lots of comments, but my guess is that endpoint >>>> function authors won't notice that... >>> >>> Kishon, >>> >>> Do you have any further details exactly how you'd prefer this to work? Does the >>> approach I describe in points a/b/c above sound like what you want? Thanks. >> >> Agree with your PERST comment. >> >> What I have in mind is we add a new epc_init() ops. I feel there are more uses >> for it (For e.g I have an internal patch which uses epc_init to initialize DMA. >> Hopefully I'll post it soon). >> If you look at pci_epf_test, pci_epf_test_bind() is where the function actually >> starts to write to HW (i.e using pci_epc_*). >> So before the endpoint function invokes pci_epc_write_header(), it should >> invoke epc_init(). Only if that succeeds, it should go ahead with other >> initialization. >> If epc_init_* fails, we can have a particular error value to indicate the >> controller is waiting for clock from host (so that we don't return error from >> ->bind()). Once the controller receives the clock, it can send an atomic >> notification to the endpoint function driver to indicate it is ready to be >> initialized. (Atomic notification makes it easy to handle for multi function >> endpoint devices.) >> The endpoint function can then initialize the controller. >> I think except for pci_epf_test_alloc_space() all other functions are >> configuring the HW (in pci_epf_test_bind). pci_epf_test_alloc_space() could be >> moved to pci_epf_test_probe() so there are no expensive operations to be done >> once the controller is ready to be initialized. >> I have epc_init() and the atomic notification part already implemented and I'm >> planning to post it before next week. Once that is merged, we might have to >> reorder function in pci_epf_test driver and you have to return the correct >> error value for epc_init() if the clock is not there. > > Kishon, did you manage to post the patches that implement epc_init()? If so, a > link would be appreciated. Thanks. I haven't posted the patches yet. Sorry for the delay. Give me some more time please (till next week). Thanks Kishon