Re: [PATCH v6 05/45] drm/amd: Add a new helper for loading/validating microcode

Mario Limonciello <mario.limonciello@xxxxxxx> · Wed, 4 Jan 2023 23:37:35 -0600

On 1/4/23 23:29, Lazar, Lijo wrote:


On 1/5/2023 10:53 AM, Mario Limonciello wrote:
On 1/4/23 23:07, Lazar, Lijo wrote:


On 1/5/2023 9:12 AM, Mario Limonciello wrote:
All microcode goes through basic validation after it has been loaded. Each
IP block performs both steps (load, then validate) as part of init.

Introduce a wrapper for request_firmware and amdgpu_ucode_validate.
This wrapper will also remap any error codes from request_firmware
to -ENODEV.  This is so that early_init will fail if firmware couldn't
be loaded instead of the IP block being disabled.

Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx>
---
v5->v6:
  * Fix argument to be ** not *
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 36 +++++++++++++++++++++++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h |  3 ++
  2 files changed, 39 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
index eafcddce58d3..8ebfec12da87 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
@@ -1312,3 +1312,39 @@ void amdgpu_ucode_ip_version_decode(struct amdgpu_device *adev, int block_type,
      snprintf(ucode_prefix, len, "%s_%d_%d_%d", ip_name, maj, min, rev);
  }
+
+/*
+ * amdgpu_ucode_request - Fetch and validate amdgpu microcode
+ *
+ * @adev: amdgpu device
+ * @fw: pointer to load firmware to
+ * @fw_name: firmware to load
+ *
+ * This is a helper that will use request_firmware and amdgpu_ucode_validate
+ * to load and run basic validation on firmware. If the load fails, remap
+ * the error code to -ENODEV, so that early_init functions will fail to load.
+ */
+int amdgpu_ucode_request(struct amdgpu_device *adev, const struct firmware **fw,
+             const char *fw_name)
+{
+    int err = request_firmware(fw, fw_name, adev->dev);
+
+    if (err)
+        return -ENODEV;
+    err = amdgpu_ucode_validate(*fw);
+    if (err)
+        dev_dbg(adev->dev, "\"%s\" failed to validate\n", fw_name);
+

Missed this earlier. If validate fails, shouldn't this undo the 
request operation by calling release?

Actually, that was the original design, but there is one place in the
codebase that expects that ucode validation can fail, so evaluating the
error code and cleaning up are left outside of the helper.
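
The split described above can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions: every name prefixed `mock_` or `sketch_` is a hypothetical stand-in rather than the kernel or amdgpu API, the "empty image is invalid" rule is invented for the example, and the file names are placeholders.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct firmware (assumption). */
struct firmware {
	size_t size;
};

/* Mock loader: "missing.bin" fails to load, "corrupt.bin" loads but is empty. */
static int mock_request_firmware(const struct firmware **fw, const char *name)
{
	struct firmware *f;

	*fw = NULL;
	if (strcmp(name, "missing.bin") == 0)
		return -ENOENT;

	f = calloc(1, sizeof(*f));
	if (!f)
		return -ENOMEM;
	f->size = strcmp(name, "corrupt.bin") == 0 ? 0 : 64;
	*fw = f;
	return 0;
}

/* Mock validation: an empty image is treated as invalid (invented rule). */
static int mock_validate(const struct firmware *fw)
{
	return fw->size ? 0 : -EINVAL;
}

/* Same shape as the helper in the patch: no cleanup on validation failure. */
static int sketch_ucode_request(const struct firmware **fw, const char *name)
{
	int err = mock_request_firmware(fw, name);

	if (err)
		return -ENODEV;
	return mock_validate(*fw);
}

/* Release and clear the pointer so a second release is harmless. */
static void sketch_ucode_release(const struct firmware **fw)
{
	free((void *)*fw);	/* free(NULL) is a no-op */
	*fw = NULL;
}

/* Hypothetical strict caller: any failure is fatal, so it releases and bails. */
static int sketch_strict_init(const struct firmware **fw, const char *name)
{
	int err = sketch_ucode_request(fw, name);

	if (err)
		sketch_ucode_release(fw);
	return err;
}
```

A caller that tolerates a validation failure would call `sketch_ucode_request()` directly, inspect the error code, and decide for itself when to release; the strict caller simply releases on any error.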


I see. Does request_firmware guarantee that the fw pointer is always NULL
if it fails? Or should that be done here if the request fails? In subsequent
patches, I see clients calling release without checking what caused the
failure.

Yes, according to the docs
(https://docs.kernel.org/next/driver-api/firmware/request_firmware.html#request-firmware-api-expected-driver-use)
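
To make that concrete, here is a small userspace sketch of why an unconditional release after a failed request is harmless. It models two behaviors as assumptions: that a failed request stores NULL through the pointer, and that release ignores NULL (which, as far as I know, matches request_firmware and release_firmware in mainline). The `mock_` and `sketch_` names are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct firmware (assumption). */
struct firmware {
	size_t size;
};

/* Mock of a failing request: clears the caller's pointer before returning. */
static int mock_request_firmware_fail(const struct firmware **fw)
{
	*fw = NULL;
	return -ENOENT;
}

/* Mock release: a NULL pointer is ignored. */
static void mock_release_firmware(const struct firmware *fw)
{
	if (fw)
		free((void *)fw);
}

/* Returns 1 if the failed request cleared a deliberately stale pointer. */
static int sketch_failed_request_clears_pointer(void)
{
	static const struct firmware stale = { 0 };
	const struct firmware *fw = &stale;	/* deliberately stale */
	int err = mock_request_firmware_fail(&fw);
	int cleared = (err != 0 && fw == NULL);

	mock_release_firmware(fw);	/* safe here: fw is NULL and is ignored */
	return cleared;
}
```

With those two properties, callers can release without first checking which step failed.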


Thanks,
Lijo



+    return err;
+}
+
+/*
+ * amdgpu_ucode_release - Release firmware microcode
+ *
+ * @fw: pointer to firmware to release
+ */
+void amdgpu_ucode_release(const struct firmware **fw)
+{
+    release_firmware(*fw);
+    *fw = NULL;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h
index 552e06929229..848579d4988b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h
@@ -544,6 +544,9 @@ void amdgpu_ucode_print_sdma_hdr(const struct common_firmware_header *hdr);
  void amdgpu_ucode_print_psp_hdr(const struct common_firmware_header *hdr);
  void amdgpu_ucode_print_gpu_info_hdr(const struct common_firmware_header *hdr);
  int amdgpu_ucode_validate(const struct firmware *fw);
+int amdgpu_ucode_request(struct amdgpu_device *adev, const struct firmware **fw,
+             const char *fw_name);
+void amdgpu_ucode_release(const struct firmware **fw);
  bool amdgpu_ucode_hdr_version(union amdgpu_firmware_header *hdr,
                  uint16_t hdr_major, uint16_t hdr_minor);