Re: [PATCH V5 00/10] AMD XDNA driver

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 10/25/24 10:55, Jeffrey Hugo wrote:
On 10/21/2024 10:19 AM, Lizhi Hou wrote:
This patchset introduces a new Linux Kernel Driver, amdxdna for AMD NPUs.
The driver is based on Linux accel subsystem.

NPU (Neural Processing Unit) is an AI inference accelerator integrated
into AMD client CPUs. NPU enables efficient execution of Machine Learning
applications like CNNs, LLMs, etc.  NPU is based on AMD XDNA
architecture [1].

AMD NPU consists of the following components:

   - Tiled array of AMD AI Engine processors.
   - Micro Controller which runs the NPU Firmware responsible for
     command processing, AIE array configuration, and execution management.
   - PCI EP for host control of the NPU device.
   - Interconnect for connecting the NPU components together.
   - SRAM for use by the NPU Firmware.
   - Address translation hardware for protected host memory access by the
     NPU.

NPU supports multiple concurrent fully isolated contexts. Concurrent
contexts may be bound to AI Engine array spatially and or temporarily.

The driver is licensed under GPL-2.0 except for UAPI header which is
licensed GPL-2.0 WITH Linux-syscall-note.

User mode driver stack consists of XRT [2] and AMD AIE Plugin for IREE [3].

The firmware for the NPU is distributed as a closed source binary, and has
already been pushed to the DRM firmware repository [4].

[1]https://www.amd.com/en/technologies/xdna.html
[2]https://github.com/Xilinx/XRT
[3]https://github.com/nod-ai/iree-amd-aie
[4]https://gitlab.freedesktop.org/drm/firmware/-/tree/amd-ipu-staging/amdnpu

Changes since v4:
- Fix lockdep errors
- Use __u* structure for struct aie_error

One nit, when you send the next version would you please either To: or Cc: me on the entire series?  I only get pieces in my inbox which is mildly annoying on my end.
Sure.

Looks like we are getting close here.  One procedural question I have, do you have commit permissions to drm-misc?
No, I do not have commit permissions yet.

I applied the series to drm-misc-next and tried to build.  Got the following errors -

Could you share the build command line? So I can reproduce and verify my fix.

I used "make M=drivers/accel/amdxdna" and did not reproduce the error with drm-misc-next. It looks build robot did not complain with the patch neither.

$ git branch
* drm-misc-next
$ make M=drivers/accel/amdxdna
  CC [M]  drivers/accel/amdxdna/aie2_ctx.o
  CC [M]  drivers/accel/amdxdna/aie2_error.o
  CC [M]  drivers/accel/amdxdna/aie2_message.o
  CC [M]  drivers/accel/amdxdna/aie2_pci.o
  CC [M]  drivers/accel/amdxdna/aie2_psp.o
  CC [M]  drivers/accel/amdxdna/aie2_smu.o
  CC [M]  drivers/accel/amdxdna/aie2_solver.o
  CC [M]  drivers/accel/amdxdna/amdxdna_ctx.o
  CC [M]  drivers/accel/amdxdna/amdxdna_gem.o
  CC [M]  drivers/accel/amdxdna/amdxdna_mailbox.o
  CC [M]  drivers/accel/amdxdna/amdxdna_mailbox_helper.o
  CC [M]  drivers/accel/amdxdna/amdxdna_pci_drv.o
  CC [M]  drivers/accel/amdxdna/amdxdna_sysfs.o
  CC [M]  drivers/accel/amdxdna/npu1_regs.o
  CC [M]  drivers/accel/amdxdna/npu2_regs.o
  CC [M]  drivers/accel/amdxdna/npu4_regs.o
  CC [M]  drivers/accel/amdxdna/npu5_regs.o
  LD [M]  drivers/accel/amdxdna/amdxdna.o
  MODPOST drivers/accel/amdxdna/Module.symvers
  CC [M]  drivers/accel/amdxdna/amdxdna.mod.o
  CC [M]  drivers/accel/amdxdna/.module-common.o
  LD [M]  drivers/accel/amdxdna/amdxdna.ko
$


  CC [M]  drivers/accel/amdxdna/aie2_ctx.o
  CC [M]  drivers/accel/amdxdna/aie2_error.o
  CC [M]  drivers/accel/amdxdna/aie2_message.o
  CC [M]  drivers/accel/amdxdna/aie2_pci.o
  CC [M]  drivers/accel/amdxdna/aie2_psp.o
  CC [M]  drivers/accel/amdxdna/aie2_smu.o
  CC [M]  drivers/accel/amdxdna/aie2_solver.o
  CC [M]  drivers/accel/amdxdna/amdxdna_ctx.o
  CC [M]  drivers/accel/amdxdna/amdxdna_gem.o
  CC [M]  drivers/accel/amdxdna/amdxdna_mailbox.o
  CC [M]  drivers/accel/amdxdna/amdxdna_mailbox_helper.o
  CC [M]  drivers/accel/amdxdna/amdxdna_pci_drv.o
  CC [M]  drivers/accel/amdxdna/amdxdna_sysfs.o
  CC [M]  drivers/accel/amdxdna/npu1_regs.o
  CC [M]  drivers/accel/amdxdna/npu2_regs.o
  CC [M]  drivers/accel/amdxdna/npu4_regs.o
  CC [M]  drivers/accel/amdxdna/npu5_regs.o
  AR      drivers/base/firmware_loader/built-in.a
  AR      drivers/base/built-in.a
In file included from drivers/accel/amdxdna/aie2_message.c:19:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
  112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
      |                ^~~~~~~~~
In file included from drivers/accel/amdxdna/amdxdna_gem.c:15:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
  112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
      |                ^~~~~~~~~
In file included from drivers/accel/amdxdna/aie2_psp.c:11:
drivers/accel/amdxdna/aie2_psp.c: In function ‘psp_exec’:
drivers/accel/amdxdna/aie2_psp.c:62:34: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
   62 | FIELD_GET(PSP_STATUS_READY, ready),
      |                                  ^~~~~~~~~
./include/linux/iopoll.h:47:21: note: in definition of macro ‘read_poll_timeout’
   47 |                 if (cond) \
      |                     ^~~~
drivers/accel/amdxdna/aie2_psp.c:61:15: note: in expansion of macro ‘readx_poll_timeout’    61 |         ret = readx_poll_timeout(readl, PSP_REG(psp, PSP_STATUS_REG), ready,
      |               ^~~~~~~~~~~~~~~~~~
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
  121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
      |                        ^~~~~~~~~~
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
  121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
      |                        ^~~~~~~~~~
In file included from drivers/accel/amdxdna/aie2_pci.c:22:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
  112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
      |                ^~~~~~~~~
In file included from drivers/accel/amdxdna/aie2_ctx.c:18:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
  112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
      |                ^~~~~~~~~
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
  121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
      |                        ^~~~~~~~~~
In file included from drivers/accel/amdxdna/amdxdna_ctx.c:16:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
  112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
      |                ^~~~~~~~~
cc1: all warnings being treated as errors
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
  121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
      |                        ^~~~~~~~~~
drivers/accel/amdxdna/aie2_ctx.c: In function ‘aie2_hwctx_restart’:
drivers/accel/amdxdna/aie2_ctx.c:114:9: error: too few arguments to function ‘drm_sched_start’
  114 |         drm_sched_start(&hwctx->priv->sched);
      |         ^~~~~~~~~~~~~~~
In file included from ./include/trace/events/amdxdna.h:12,
                 from drivers/accel/amdxdna/aie2_ctx.c:13:
./include/drm/gpu_scheduler.h:593:6: note: declared here
  593 | void drm_sched_start(struct drm_gpu_scheduler *sched, int errno);
      |      ^~~~~~~~~~~~~~~
make[5]: *** [scripts/Makefile.build:229: drivers/accel/amdxdna/aie2_psp.o] Error 1
make[5]: *** Waiting for unfinished jobs....
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
  121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
      |                        ^~~~~~~~~~
In file included from drivers/accel/amdxdna/amdxdna_pci_drv.c:18:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
  112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
      |                ^~~~~~~~~
cc1: all warnings being treated as errors
make[5]: *** [scripts/Makefile.build:229: drivers/accel/amdxdna/aie2_ctx.o] Error 1
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
  121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
      |                        ^~~~~~~~~~
drivers/accel/amdxdna/amdxdna_mailbox.c: In function ‘xdna_mailbox_send_msg’: drivers/accel/amdxdna/amdxdna_mailbox.c:444:26: error: implicit declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]   444 |         header->sz_ver = FIELD_PREP(MSG_BODY_SZ, msg->send_size) |
      |                          ^~~~~~~~~~


You also have the following checkpatch issues -

Could you share the command you used?  I tried to use 'dim checkpatch' and it did not find out the misspelling issue.


Thanks,

Lizhi



WARNING: 'Disalbe' may be misspelled - perhaps 'Disable'?
#1646: FILE: drivers/accel/amdxdna/amdxdna_mailbox.c:553:
+       /* Disalbe an irq and wait. This might sleep. */
           ^^^^^^^

WARNING: 'splite' may be misspelled - perhaps 'split'?
#1695: FILE: drivers/accel/amdxdna/amdxdna_mailbox.h:21:
+ * The mailbox will splite the sending data in to multiple firmware message if
                     ^^^^^^

WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
#1875: FILE: drivers/accel/amdxdna/amdxdna_mailbox_helper.h:9:
+#define TX_TIMEOUT 2000 /* miliseconds */
                            ^^^^^^^^^^^

WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
#1876: FILE: drivers/accel/amdxdna/amdxdna_mailbox_helper.h:10:
+#define RX_TIMEOUT 5000 /* miliseconds */
                            ^^^^^^^^^^^

total: 0 errors, 4 warnings, 0 checks, 1916 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

0003-accel-amdxdna-Support-hardware-mailbox.patch has style problems, please review.



0007-accel-amdxdna-Add-command-execution.patch
----------------------------------------------
WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
#59: FILE: drivers/accel/amdxdna/aie2_ctx.c:27:
+#define HWCTX_MAX_TIMEOUT      60000 /* miliseconds */
                                         ^^^^^^^^^^^

WARNING: 'reverve' may be misspelled - perhaps 'reserve'?
#612: FILE: drivers/accel/amdxdna/aie2_ctx.c:779:
+               XDNA_WARN(xdna, "Failed to reverve fence, ret %d", ret);
                                           ^^^^^^^

WARNING: 'Exectuion' may be misspelled - perhaps 'Execution'?
#1899: FILE: drivers/accel/amdxdna/amdxdna_pci_drv.c:139:
+       /* Exectuion */
           ^^^^^^^^^

WARNING: 'exectuion' may be misspelled - perhaps 'execution'?
#2113: FILE: include/uapi/drm/amdxdna_accel.h:239:
+ * struct amdxdna_drm_wait_cmd - Wait exectuion command.
                                       ^^^^^^^^^

total: 0 errors, 10 warnings, 0 checks, 1983 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

0007-accel-amdxdna-Add-command-execution.patch has style problems, please review.


0008-accel-amdxdna-Add-suspend-and-resume.patch
-----------------------------------------------
WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
#163: FILE: drivers/accel/amdxdna/amdxdna_pci_drv.c:22:
+#define AMDXDNA_AUTOSUSPEND_DELAY      5000 /* miliseconds */
                                                ^^^^^^^^^^^

total: 0 errors, 1 warnings, 0 checks, 302 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

0008-accel-amdxdna-Add-suspend-and-resume.patch has style problems, please review.


-Jeff



[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux