RE: [PATCH] drm/amdgpu: Higher log level for missing PCIe atomics caps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 18, 2025 5:35 AM Felix Kuehling wrote:
> On 2025-03-17 15:07, Deucher, Alexander wrote:
> > [Public]
> >
> >> -----Original Message-----
> >> From: Daisuke Matsuda <matsuda-daisuke@xxxxxxxxxxx>
> >> Sent: Thursday, March 13, 2025 9:18 PM
> >> To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx; Deucher,
> >> Alexander <Alexander.Deucher@xxxxxxx>; Koenig, Christian
> >> <Christian.Koenig@xxxxxxx>
> >> Cc: airlied@xxxxxxxxx; simona@xxxxxxxx; Daisuke Matsuda <matsuda-
> >> daisuke@xxxxxxxxxxx>
> >> Subject: [PATCH] drm/amdgpu: Higher log level for missing PCIe atomics caps
> >>
> >> Currently, ROCm requires CPUs that support PCIe atomics. The message is more
> >> urgent for GPGPU users, meaning basic functionalities of ROCm are not available
> >> on the node.
> >>
> > + Felix
> >
> > My understanding is that PCIe atomics are not strictly required, but there are some features that are not available
> without them.  Warning seems a bit overkill and potentially confusing to users that have an existing system that
> otherwise works fine.
> 
> I could see either argument. The GPU is capable of PCIe atomics, but the system configuration (chipset, PCIe switch, etc.)
> is preventing them from working, which has an impact on some application-visible functionality. A naive application that
> depends on atomics and is not carefully written to provide fallbacks or fail gracefully in case atomics are unavailable,
> would fail silently or produce incorrect results. So maybe that justifies a warning message.

The Linux kernel commonly uses "warn level" messages to alert users when an incompatibility restricts functionality, and this approach aligns with that practice.
To address the concern about the message being too general, how about making it more specific like this:
  dev_warn(adev->dev, "PCIe atomic ops are not supported.  ROCm performance and functionality may be limited.\n");
This is more direct and focused on the impact to ROCm users.

Regards,
Daisuke

> 
> Regards,
>   Felix
> 
> 
> >
> > Alex
> >
> >
> >> Signed-off-by: Daisuke Matsuda <matsuda-daisuke@xxxxxxxxxxx>
> >> ---
> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 4 ++--
> >> drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h | 2 +-
> >>  2 files changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> index 018dfccd771b..faeef136e272 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> @@ -4374,7 +4374,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >>                       return r;
> >>       }
> >>
> >> -     /* enable PCIE atomic ops */
> >> +     /* enable PCIe atomic ops */
> >>       if (amdgpu_sriov_vf(adev)) {
> >>               if (adev->virt.fw_reserve.p_pf2vf)
> >>                       adev->have_atomics_support = ((struct
> >> amd_sriov_msg_pf2vf_info *) @@ -4395,7 +4395,7 @@ int
> >> amdgpu_device_init(struct amdgpu_device *adev,
> >>       }
> >>
> >>       if (!adev->have_atomics_support)
> >> -             dev_info(adev->dev, "PCIE atomic ops is not supported\n");
> >> +             dev_warn(adev->dev, "PCIe atomic ops are not supported\n");
> >>
> >>       /* doorbell bar mapping and doorbell index init*/
> >>       amdgpu_doorbell_init(adev);
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
> >> b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
> >> index b4f9c2f4e92c..c52605a07597 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
> >> @@ -240,7 +240,7 @@ struct amd_sriov_msg_pf2vf_info {
> >>       } mm_bw_management[AMD_SRIOV_MSG_RESERVE_VCN_INST];
> >>       /* UUID info */
> >>       struct amd_sriov_msg_uuid_info uuid_info;
> >> -     /* PCIE atomic ops support flag */
> >> +     /* PCIe atomic ops support flag */
> >>       uint32_t pcie_atomic_ops_support_flags;
> >>       /* Portion of GPU memory occupied by VF.  MAX value is 65535, but set to
> >> uint32_t to maintain alignment with reserved size */
> >>       uint32_t gpu_capacity;
> >> --
> >> 2.39.1




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux