Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci

On Tue, Dec 10, 2024 at 10:37:30AM -0800, Yidong Zhang wrote:
> AMD Versal based PCIe card, including V70, is designed for AI inference
> efficiency and is tuned for video analytics and natural language processing
> applications.
> 
> The driver architecture:
> 
>   +---------+  Communication +---------+  Remote  +-----+------+
>   |         |  Channel       |         |  Queue   |     |      |
>   | User PF | <============> | Mgmt PF | <=======>| FW  | FPGA |
>   +---------+                +---------+          +-----+------+
>     PL Data                    base FW
>                                APU FW
>                                PL Data (copy)
>  - PL (FPGA Program Logic)
>  - FW (Firmware)
> 
> There are 2 separate drivers from the original XRT[1] design.
>  - UserPF driver
>  - MgmtPF driver
> 
> The new AMD versal-pci driver will replace the MgmtPF driver for Versal
> PCIe card.
> 
> The XRT[1] is already open-sourced. It includes solution of runtime for
> many different type of PCIe Based cards. It also provides utilities for
> managing and programming the devices.
> 
> The AMD versal-pci stands for AMD Versal brand PCIe device management
> driver. This driver provides the following functionalities:
> 
>    - module and PCI device initialization
>      this driver will attach to specific device id of V70 card;
>      the driver will initialize itself based on bar resources for
>      - communication channel:
>        a hardware message service between mgmt PF and user PF
>      - remote queue:
>        a hardware queue based ring buffer service between mgmt PF and PCIe
>        hardware firmware for programming FPGA Program Logic, loading
>        firmware and checking card healthy status.
> 
>    - programming FW
>      - The base FW is downloaded onto the flash of the card.
>      - The APU FW is downloaded once after a POR (power on reset).
>      - Reloading the MgmtPF driver will not change any existing hardware.
> 
>    - programming FPGA hardware binaries - PL Data
>     - using fpga framework ops to support re-programing FPGA
>     - the re-programming request will be initiated from the existing UserPF
>       driver only, and the MgmtPF driver load the matched PL Data after
>       receiving request from the communication channel. The matching PL

I don't think this is how the generic FPGA framework is meant to be
used. An FPGA region user (your UserPF driver) should not also be the
reprogramming requester. The user driver cannot deal with an unexpected
HW change if one happens. After reprogramming, the user driver may no
longer match the device, and if the user driver is still working on it,
it may crash.

The expected behavior is that the FPGA region removes the user devices
(thus detaching the user drivers), does the reprogramming, then
re-enumerates/rescans and matches the new devices with new drivers. I
think that's what Nava is working on.
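
To make the ordering concrete, here is a minimal userspace sketch of
that flow. It is only a toy model: struct region, struct user_dev and
all region_*() helpers below are hypothetical names made up for
illustration, not kernel or fpga-region API, and the "rescan" simply
assumes one new device that matches the freshly loaded image.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model only; none of these names exist in the kernel. */
enum { MAX_CHILD = 4 };

struct user_dev {
	char id[16];	/* identity exposed by the currently loaded image */
	bool bound;	/* true while a user driver is attached */
};

struct region {
	char image[16];			/* currently loaded PL image */
	struct user_dev child[MAX_CHILD];
	int nchild;			/* number of live user devices */
};

/* Step 1: remove the user devices, which detaches their drivers. */
static void region_remove_children(struct region *r)
{
	for (int i = 0; i < r->nchild; i++)
		r->child[i].bound = false;
	r->nchild = 0;
}

/* Step 2: reprogram. Refused while any user device still exists. */
static int region_write_image(struct region *r, const char *image)
{
	if (r->nchild != 0)
		return -1;
	snprintf(r->image, sizeof(r->image), "%s", image);
	return 0;
}

/*
 * Step 3: re-enumerate. Assumption: the image exposes exactly one
 * device, and a driver matching the new identity binds to it.
 */
static void region_rescan(struct region *r)
{
	struct user_dev *d = &r->child[0];

	snprintf(d->id, sizeof(d->id), "%s", r->image);
	d->bound = true;
	r->nchild = 1;
}

/* The whole flow, driven by the region rather than by its user. */
static int region_reprogram(struct region *r, const char *image)
{
	int ret;

	region_remove_children(r);
	ret = region_write_image(r, image);
	if (ret)
		return ret;
	region_rescan(r);
	return 0;
}
```

In this model, calling region_write_image() while a user device is
still present fails, which is the situation a user-initiated reprogram
would create. region_reprogram() succeeds because the user is detached
first and a matching device only reappears after the rescan.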

BTW, AFAICS the expected flow is easier to implement for of-fpga-region
and harder for PCI devices. But I think that's the right direction, and
we should try to work it out.

Thanks,
Yilun



