Hi Jim, Jim Quinlan <james.quinlan@xxxxxxxxxxxx> (2024-04-03): > v9 -- v8 was setting an internal bus timeout to accomodate large L1 exit > latencies. After meeting the PCIe HW team it was revealed that the > HW default timeout value was set low for the purposes of HW debugging > convenience; for nominal operation it needs to be set to a higher > value independent of this submission's purpose. This is now a > separate commit. > > -- With v8, Bjorne asked what was preventing a device from exceeding the > time required for the above internal bus timeout. The answer to this > is for us to set the endpoints' max latency {no-,}snoop value to > something below this internal timeout value. If the endpoint > respects this value as it should, it will not send an LTR request > with a larger latency value and not put itself in a situation > that requires more latency than is possible for the platform. > > Typically, ACPI or FW sets these max latency values. In most of our > systems we do not have this happening so it is up to the RC driver to > set these values in the endpoint devices. If the endpoints already > have non-zero values that are lower than what we are setting, we let > them be, as it is possible ACPI or FW set them and knows something > that we do not. > > -- The "clkreq" commit has only been changed to remove the code that was > setting the timeout value, as this code is now its own commit. Given the bot's feedback, I took the liberty of running tests on your patch series except with an extra “static” keyword. On my build system, gcc 12 wasn't complaining about it but I didn't spend time trying to find the right options, or trying a switch to clang to confirm the before/after situation: -void brcm_set_downstream_devs_ltr_max(struct brcm_pcie *pcie) +static void brcm_set_downstream_devs_ltr_max(struct brcm_pcie *pcie) Anyway, this is still: Tested-by: Cyril Brulebois <cyril@xxxxxxxxxxx> Test setup: ----------- - using a $CM with the 20230111 EEPROM - on the same CM4 IO Board - with a $PCIE board (PCIe to multiple USB ports) - and the same Samsung USB flash drive + Logitech keyboard. where $CM is one of: - CM4 Lite Rev 1.0 - CM4 8/32 Rev 1.0 - CM4 4/32 Rev 1.1 and $PCIE is one of: - SupaHub PCE6U1C-R02, VER 006 - SupaHub PCE6U1C-R02, VER 006S - Waveshare VIA VL805/806-based Results: -------- 1. Given this is already v9, and given I don't see how this could have possibly changed, I didn't build or tested an unpatched kernel, which I would still expect to produce either a successful boot *without* seeing the devices plugged on the PCIe-to-USB board or the dreaded SError in most cases. 2. With a patched kernel (v6.7-562-g9f8413c4a66f2 + this series + “static” in front of brcm_set_downstream_devs_ltr_max()), for all $CM/$PCIE combinations, I'm getting a system that boots, sees the flash drive, and gives decent read/write performance on it (plus a functional keyboard). Cheers, -- Cyril Brulebois (kibi@xxxxxxxxxx) <https://debamax.com/> D-I release manager -- Release team member -- Freelance Consultant
Attachment:
signature.asc
Description: PGP signature