Re: [PATCH V2] PCI: dwc ep: cache config until DBI regs available

Kishon Vijay Abraham I <kishon@xxxxxx> · Fri, 4 Jan 2019 13:32:10 +0530

Hi Stephen,

On 02/01/19 10:04 PM, Stephen Warren wrote:
> On 12/19/18 7:37 AM, Kishon Vijay Abraham I wrote:
>> Hi,
>>
>> On 14/12/18 10:31 PM, Stephen Warren wrote:
>>> On 12/11/18 10:23 AM, Stephen Warren wrote:
>>>> On 12/10/18 9:36 PM, Kishon Vijay Abraham I wrote:
>>>>> Hi,
>>>>>
>>>>> On 27/11/18 4:39 AM, Stephen Warren wrote:
>>>>>> From: Stephen Warren <swarren@xxxxxxxxxx>
>>>>>>
>>>>>> Some implementations of the DWC PCIe endpoint controller do not allow
>>>>>> access to DBI registers until the attached host has started REFCLK,
>>>>>> released PERST, and the endpoint driver has initialized clocking of the
>>>>>> DBI registers based on that. One such system is NVIDIA's T194 SoC. The
>>>>>> PCIe endpoint subsystem and DWC driver currently don't work on such
>>>>>> hardware, since they assume that all endpoint configuration can happen
>>>>>> at any arbitrary time.
>>>>>>
>>>>>> Enhance the DWC endpoint driver to support such systems by caching all
>>>>>> endpoint configuration in software, and only writing the configuration
>>>>>> to hardware once it's been initialized. This is implemented by splitting
>>>>>> all endpoint controller ops into two functions; the first which simply
>>>>>> records/caches the desired configuration whenever called by the
>>>>>> associated function driver and optionally calls the second, and the
>>>>>> second which actually programs the configuration into hardware, which
>>>>>> may be called either by the first function, or later when it's known
>>>>>> that the DBI registers are available.
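
For anyone following the thread, here is a minimal userspace sketch of the two-function split described above. It is not the code from the patch: struct fake_hw, struct ep_state and the ep_*() helpers are hypothetical stand-ins, and only BAR addresses are cached. It is single-threaded, so it does not address the races discussed further down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BARS 6

/* Hypothetical stand-in for the DBI register block. */
struct fake_hw {
	uint64_t bar_addr[NUM_BARS];
	unsigned int writes;		/* counts simulated register writes */
};

/* Hypothetical endpoint state: cached config plus an availability flag. */
struct ep_state {
	struct fake_hw *hw;
	bool hw_regs_available;
	bool bar_cached[NUM_BARS];
	uint64_t cached_bar_addr[NUM_BARS];
};

/* Second half of the op: actually program the cached value into hardware. */
static void ep_program_bar(struct ep_state *ep, unsigned int bar)
{
	assert(ep->hw_regs_available);	/* must never touch HW before it is up */
	ep->hw->bar_addr[bar] = ep->cached_bar_addr[bar];
	ep->hw->writes++;
}

/* First half of the op: record the request, program HW only if possible. */
static int ep_set_bar(struct ep_state *ep, unsigned int bar, uint64_t addr)
{
	if (bar >= NUM_BARS)
		return -1;

	ep->cached_bar_addr[bar] = addr;
	ep->bar_cached[bar] = true;

	if (ep->hw_regs_available)
		ep_program_bar(ep, bar);
	return 0;
}

/* Called once the registers become accessible: replay everything cached. */
static void ep_set_regs_available(struct ep_state *ep)
{
	unsigned int bar;

	ep->hw_regs_available = true;
	for (bar = 0; bar < NUM_BARS; bar++)
		if (ep->bar_cached[bar])
			ep_program_bar(ep, bar);
}
```

A caller can invoke ep_set_bar() at any time; nothing reaches the simulated hardware until ep_set_regs_available() has run, after which further calls are written through immediately.
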
>>>>
>>>>>> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c
>>>>>> b/drivers/pci/controller/dwc/pcie-designware-ep.c
>>>>
>>>>>> +void dw_pcie_set_regs_available(struct dw_pcie *pci)
>>>>>> +{
>>>>>
>>>>> When will this function be invoked? Does the wrapper get an interrupt when
>>>>> refclk is enabled where this function will be invoked?
>>>>
>>>> Yes, there's an IRQ from the HW that indicates when PEXRST is released. I
>>>> don't recall right now if this IRQ is something that exists for all DWC
>>>> instantiations, or is Tegra-specific.
>>>>
>>>>>> +    struct dw_pcie_ep *ep = &(pci->ep);
>>>>>> +    int i;
>>>>>> +
>>>>>> +    ep->hw_regs_not_available = false;
>>>>>
>>>>> This can race with epc_ops.
>>>>>
>>>>>> +
>>>>>> +    dw_pcie_ep_write_header_regs(ep);
>>>>>> +    for_each_set_bit(i, ep->ib_window_map, ep->num_ib_windows) {
>>>>>> +        dw_pcie_prog_inbound_atu(pci, i,
>>>>>> +            ep->cached_inbound_atus[i].bar,
>>>>>> +            ep->cached_inbound_atus[i].cpu_addr,
>>>>>> +            ep->cached_inbound_atus[i].as_type);
>>>>>
>>>>> Depending on the context in which this function is invoked, programming
>>>>> inbound/outbound ATU can also race with EPC ops.
>>>>>
>>>>>> +        dw_pcie_ep_set_bar_regs(ep, 0, ep->cached_inbound_atus[i].bar);
>>>>>> +    }
>>>>>> +    for_each_set_bit(i, ep->ob_window_map, ep->num_ob_windows)
>>>>>> +        dw_pcie_prog_outbound_atu(pci, i, PCIE_ATU_TYPE_MEM,
>>>>>> +            ep->cached_outbound_atus[i].addr,
>>>>>> +            ep->cached_outbound_atus[i].pci_addr,
>>>>>> +            ep->cached_outbound_atus[i].size);
>>>>>> +    dw_pcie_dbi_ro_wr_en(pci);
>>>>>> +    dw_pcie_writew_dbi(pci, PCI_MSI_FLAGS, ep->cached_msi_flags);
>>>>>> +    dw_pcie_writew_dbi(pci, PCI_MSIX_FLAGS, ep->cached_msix_flags);
>>>>>> +    dw_pcie_dbi_ro_wr_dis(pci);
>>>>>
>>>>> IMHO we should add a new epc op, ->epc_init(), which indicates whether the
>>>>> EPC is ready to be initialized. Only if epc_init indicates it's ready
>>>>> should the endpoint function driver go ahead with further initialization.
>>>>> Otherwise it should wait for a notification from the EPC indicating when
>>>>> it's ready to be initialized.
>>>>
>>>> (Did you mean epf op or epc op?)
>>>>
>>>> I'm not sure exactly how that would work; do you want the DWC core driver
>>>> or the endpoint subsystem to poll that epc op to find out when the HW is
>>>> ready to be initialized? Or do you envisage the controller driver still
>>>> calling dw_pcie_set_regs_available() (possibly renamed), which in turn calls
>>>> ->epc_init() for some reason?
>>>>
>>>> If you don't want to cache the endpoint configuration, perhaps you want:
>>>>
>>>> a) Endpoint function doesn't pro-actively call the endpoint controller
>>>> functions to configure the endpoint.
>>>>
>>>> b) When endpoint HW is ready, the relevant driver calls pci_epc_ready() (or
>>>> whatever name), which lets the core know the HW can be configured. Perhaps
>>>> this schedules a work queue item to implement locking to avoid the races you
>>>> mentioned.
>>>>
>>>> c) Endpoint core calls pci_epf_init() which calls epf op ->init().
>>>>
>>>> One gotcha with this approach, which the caching approach helps avoid:
>>>>
>>>> Once PEXRST is released, the system must respond to PCIe enumeration requests
>>>> within 50ms. Thus, SW must very quickly respond to the IRQ indicating PEXRST
>>>> release and program the endpoint configuration into HW. By caching the
>>>> configuration in the DWC driver and immediately/synchronously applying it in
>>>> the PEXRST IRQ handler, we reduce the number of steps and amount of code
>>>> taken to program the HW, so it should get done pretty quickly. If instead we
>>>> call back into the endpoint function driver's ->init() op, we run the risk of
>>>> that op doing other stuff besides just calling the endpoint HW configuration
>>>> APIs (e.g. perhaps the function driver defers memory buffer allocation or
>>>> IOVA programming to that ->init function) which in turn makes it much less
>>>> likely the 50ms requirement will be hit. Perhaps we can solve this by naming
>>>> the op well and providing lots of comments, but my guess is that endpoint
>>>> function authors won't notice that...
>>>
>>> Kishon,
>>>
>>> Do you have any further details exactly how you'd prefer this to work? Does the
>>> approach I describe in points a/b/c above sound like what you want? Thanks.
>>
>> Agree with your PERST comment.
>>
>> What I have in mind is that we add a new epc_init() op. I feel there are more
>> uses for it (e.g. I have an internal patch which uses epc_init to initialize
>> DMA. Hopefully I'll post it soon).
>> If you look at pci_epf_test, pci_epf_test_bind() is where the function actually
>> starts to write to HW (i.e using pci_epc_*).
>> So before the endpoint function invokes pci_epc_write_header(), it should
>> invoke epc_init(). Only if that succeeds, it should go ahead with other
>> initialization.
>> If epc_init_* fails, we can have a particular error value to indicate the
>> controller is waiting for clock from host (so that we don't return error from
>> ->bind()). Once the controller receives the clock, it can send an atomic
>> notification to the endpoint function driver to indicate it is ready to be
>> initialized. (Atomic notification makes it easy to handle multi-function
>> endpoint devices.)
>> The endpoint function can then initialize the controller.
>> I think except for pci_epf_test_alloc_space() all other functions are
>> configuring the HW (in pci_epf_test_bind). pci_epf_test_alloc_space() could be
>> moved to pci_epf_test_probe() so there are no expensive operations to be done
>> once the controller is ready to be initialized.
>> I have epc_init() and the atomic notification part already implemented and I'm
>> planning to post it before next week. Once that is merged, we might have to
>> reorder the functions in the pci_epf_test driver, and you will have to return
>> the correct error value from epc_init() if the clock is not there.
> 
> Kishon, did you manage to post the patches that implement epc_init()? If so, a
> link would be appreciated. Thanks.

I haven't posted the patches yet. Sorry for the delay. Give me some more time
please (till next week).
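
Until then, here is a rough userspace sketch of the flow I described above. This is not the patch itself: the struct layouts, the epf_*() helpers, epc_clock_arrived() and the use of -EAGAIN as the "waiting for clock" error value are all hypothetical, chosen only to illustrate the ordering.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct epc;
typedef void (*epc_ready_cb)(struct epc *epc, void *data);

/* Hypothetical controller state. */
struct epc {
	bool clock_present;	/* host has provided REFCLK */
	bool header_written;	/* stands in for the pci_epc_*() HW writes */
	epc_ready_cb ready_cb;	/* notification hook for the function driver */
	void *ready_data;
};

/* Hypothetical function-driver state. */
struct epf {
	struct epc *epc;
	bool space_allocated;
	bool initialized;
};

/* The proposed op: report whether the controller can be initialized now. */
static int epc_init(struct epc *epc)
{
	return epc->clock_present ? 0 : -EAGAIN;	/* -EAGAIN is an assumption */
}

/* Expensive work (buffer allocation) is done early, in probe. */
static int epf_probe(struct epf *epf)
{
	epf->space_allocated = true;
	return 0;
}

/* Cheap HW configuration only; safe to run from the notification. */
static int epf_core_init(struct epf *epf)
{
	assert(epf->space_allocated);	/* nothing expensive left to do here */
	epf->epc->header_written = true;
	epf->initialized = true;
	return 0;
}

static void epf_ready_notify(struct epc *epc, void *data)
{
	(void)epc;
	epf_core_init(data);
}

/* bind() does not fail when the clock is missing; it waits to be notified. */
static int epf_bind(struct epf *epf, struct epc *epc)
{
	int ret;

	epf->epc = epc;
	epc->ready_cb = epf_ready_notify;
	epc->ready_data = epf;

	ret = epc_init(epc);
	if (ret == -EAGAIN)
		return 0;	/* controller not ready yet: defer initialization */
	if (ret)
		return ret;
	return epf_core_init(epf);
}

/* Controller side: the clock showed up, so notify the function driver. */
static void epc_clock_arrived(struct epc *epc)
{
	epc->clock_present = true;
	if (epc->ready_cb)
		epc->ready_cb(epc, epc->ready_data);
}
```

The point of the ordering is that by the time the notification fires, only register programming remains, which should help with the timing concern raised earlier in the thread.
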

Thanks
Kishon