On 8/14/24 14:53, Jeffrey Hugo wrote:
On 8/14/2024 2:24 PM, Lizhi Hou wrote:
On 8/14/24 11:46, Jeffrey Hugo wrote:
On 8/14/2024 12:16 PM, Lizhi Hou wrote:
On 8/9/24 09:11, Jeffrey Hugo wrote:
On 8/5/2024 11:39 AM, Lizhi Hou wrote:
diff --git a/drivers/accel/amdxdna/aie2_pci.c
b/drivers/accel/amdxdna/aie2_pci.c
new file mode 100644
index 000000000000..3660967c00e6
--- /dev/null
+++ b/drivers/accel/amdxdna/aie2_pci.c
@@ -0,0 +1,182 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2023-2024, Advanced Micro Devices, Inc.
+ */
+
+#include <linux/amd-iommu.h>
+#include <linux/errno.h>
+#include <linux/firmware.h>
+#include <linux/iommu.h>
You are clearly missing linux/pci.h and I suspect many more.
Other headers are indirectly included by "aie2_pci.h" underneath.
aie2_pci.h also does not directly include linux/pci.h
it is aie2_pci.h --> amdxdna_pci_drv.h --> linux/pci.h.
It compiles without any issue.
I did not doubt that it compiled. To be clear, I'm pointing out what
I believe is poor style - relying on implicit includes.
There is no reason that amdxdna_pci_drv.h needs to include linux/pci.h
(at-least in this patch). That header file doesn't use any of the
content from pci.h. If this were merged, I suspect it would be valid
for someone to post a cleanup patch that removes pci.h from
amdxdna_pci_drv.h.
The problem comes when someone does a tree wide refactor of some
header - perhaps moving a function out of pci.h into something else.
If they grep the source tree, they'll find amdxdna_pci_drv.h includes
pci.h but really doesn't use it. They likely won't see aie2_pci.c
which may break because of the refactor. Even more problematic is if
pci.h is including something that you need, and you are not including
it anywhere. The code will still compile, but maybe in the next kernel
cycle pci.h no longer includes that thing. Your code will break.
The 4 includes you have here seems entirely too little, and I'm not
clearly seeing the logic of what gets explicitly included vs what is
implicitly included. firmware.h is explicitly included, but pci.h is
not, yet it seems like you use a lot more from pci.h.
There is the include what you use project that attempts to automate
this, although I don't know how well it works with kernel code -
https://github.com/include-what-you-use/include-what-you-use
Ok. got your point and I will cleanup include.
Lizhi
+
+ ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
+ if (ret) {
+ XDNA_ERR(xdna, "Failed to set DMA mask: %d", ret);
+ goto release_fw;
+ }
+
+ nvec = pci_msix_vec_count(pdev);
This feels weird. Can your device advertise variable number of
MSI-X vectors? It only works if all of the vectors are used?
That is possible. the driver supports different hardware. And the
fw assigns vector for hardware context dynamically. So the driver
needs to allocate all vectors ahead.
So, if the device requests N MSIs, but the host is only able to
satisfy 1 (or some number less than N), the fw is completely unable
to function?
The fw may return interrupt 2 is assigned to hardware context. Then
the driver may not deal with it in this case. I think it is ok to
fail if the system has very limited resource.
Ok. I suspect you'll want to change that behavior in the future with
a fw update, but if this is how things work today, then this is how
the driver must be.