Re: [PATCH V2 1/4] drivers/fpga/amd: Add new driver amd versal-pci

On 1/26/25 02:32, Xu Yilun wrote:

On Tue, Dec 10, 2024 at 10:37:30AM -0800, Yidong Zhang wrote:
AMD Versal based PCIe card, including V70, is designed for AI inference
efficiency and is tuned for video analytics and natural language processing
applications.

The driver architecture:

   +---------+  Communication +---------+  Remote  +-----+------+
   |         |  Channel       |         |  Queue   |     |      |
   | User PF | <============> | Mgmt PF | <=======>| FW  | FPGA |
   +---------+                +---------+          +-----+------+
     PL Data                    base FW
                                APU FW
                                PL Data (copy)
  - PL (FPGA Program Logic)
  - FW (Firmware)

There are 2 separate drivers from the original XRT[1] design.
  - UserPF driver
  - MgmtPF driver

The new AMD versal-pci driver will replace the MgmtPF driver for Versal
PCIe cards.

XRT[1] is already open source. It includes runtime solutions for many
different types of PCIe-based cards. It also provides utilities for
managing and programming the devices.

The AMD versal-pci stands for AMD Versal brand PCIe device management
driver. This driver provides the following functionalities:

    - module and PCI device initialization
      this driver will attach to specific device id of V70 card;
      the driver will initialize itself based on bar resources for
      - communication channel:
        a hardware message service between mgmt PF and user PF
      - remote queue:
        a hardware queue based ring buffer service between mgmt PF and PCIe
        hardware firmware for programming FPGA Program Logic, loading
        firmware and checking card health status.

    - programming FW
      - The base FW is downloaded onto the flash of the card.
      - The APU FW is downloaded once after a POR (power on reset).
      - Reloading the MgmtPF driver will not change any existing hardware.

    - programming FPGA hardware binaries - PL Data
     - using fpga framework ops to support re-programming FPGA
     - the re-programming request will be initiated from the existing UserPF
       driver only, and the MgmtPF driver loads the matched PL Data after
       receiving request from the communication channel. The matching PL

I don't think this is how the generic FPGA framework should work. An FPGA
region user (your userPF driver) should not also be the reprogram requester.
The user driver cannot deal with an unexpected HW change if it happens.
After reprogramming, the user driver may no longer match the device,
and if the user driver is still working on it, it will crash.

One thing to clarify. The current design is:

The userPF driver is the only requester. The mgmtPF has no uAPI to reprogram the FPGA.



The expected behavior is, the FPGA region removes user devices (thus
detaches user drivers), does reprogramming, re-enumerates/rescans and
matches new devices with new drivers. And I think that's what Nava is
working on.


Nava's work differs from our current design, which is:

The separate userPF driver detaches all services before asking the mgmtPF to program the FPGA. After the programming is done, the userPF re-enumerates/rescans to match the new devices.

The mgmtPF is a utility driver responsible for communicating with the mgmtPF PCIe BAR resources.
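
To make the ordering in the current design concrete, here is a toy user-space model in Python. It is only a sketch, not kernel code: the class names, method names, and service names are all hypothetical and do not come from the patch or from XRT. It shows only that the userPF side drives the sequence detach, program, rescan, and that the mgmtPF side acts only on a request arriving over the communication channel.

```python
# Toy user-space model of the current design. Every name is hypothetical.


class MgmtPF:
    """Stands in for versal-pci: programs the FPGA only when asked."""

    def __init__(self):
        self.loaded_pl = None

    def handle_program_request(self, pl_name):
        # No uAPI in this design: the only caller is the userPF,
        # via the communication channel.
        self.loaded_pl = pl_name
        return True


class UserPF:
    """Stands in for the separate userPF driver, the only requester."""

    def __init__(self, mgmt):
        self.mgmt = mgmt
        self.services = ["service_a", "service_b"]
        self.log = []

    def reprogram(self, pl_name):
        # 1. Detach all services before touching the hardware.
        self.services.clear()
        self.log.append("detach")
        # 2. Ask the mgmtPF to program the FPGA.
        if not self.mgmt.handle_program_request(pl_name):
            raise RuntimeError("programming failed")
        self.log.append("program")
        # 3. Re-enumerate/rescan against the new hardware.
        self.services = [f"{pl_name}:service"]
        self.log.append("rescan")
```

In this model nothing outside the userPF can start a reprogram, which is the point being clarified above.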


BTW, AFAICS the expected flow is easier to implement for of-fpga-region,
but harder for PCI devices. But I think that's the right direction and
we should try to work it out.

Let me recap the suggested design to check that I understand it correctly.

You are suggesting that the mgmtPF (aka versal-pci) driver should have a uAPI to trigger FPGA re-programming, and that it should use Nava's callback ops to detach the separate userPF driver. After re-programming is done, it re-attaches the userPF driver and lets the userPF driver re-enumerate everything to match the new hardware.

I think my understanding is correct, and it is doable.

As long as we can keep our userPF driver as a separate driver, the code change won't be too big.
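
The suggested flow, as recapped above, can be sketched with a toy user-space model in Python. This is an assumption-laden illustration, not kernel code and not Nava's actual callback interface: `trigger_reprogram`, `detach`, and `attach` are hypothetical names chosen only to show who initiates the sequence and in what order the steps run.

```python
# Toy user-space model of the suggested design. Every name is hypothetical.


class UserPFDriver:
    """Stands in for the separate userPF driver."""

    def __init__(self):
        self.attached = True
        self.devices = ["old_pl_dev"]

    def detach(self):
        # Called by the management side before programming starts.
        self.attached = False
        self.devices = []

    def attach(self, pl_name):
        # Called after programming; re-enumerate against the new hardware.
        self.attached = True
        self.devices = [f"{pl_name}_dev"]


class MgmtPFDriver:
    """Stands in for versal-pci with a reprogram trigger of its own."""

    def __init__(self, user):
        self.user = user
        self.loaded_pl = None
        self.log = []

    def trigger_reprogram(self, pl_name):
        # Hypothetical uAPI entry point: the management side is the requester.
        self.user.detach()
        self.log.append("detach userPF")
        if self.user.attached:
            # The hardware must not change under a driver that is still bound.
            raise RuntimeError("userPF still attached")
        self.loaded_pl = pl_name
        self.log.append("program FPGA")
        self.user.attach(pl_name)
        self.log.append("re-attach userPF")
```

The difference from the current design is only where the sequence starts: here the userPF never asks for a reprogram, it is detached and re-attached around one.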


Thanks,
Yilun



