Re: [PATCH 1/2] drm: Add GPU reset sysfs event

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 3/8/2022 5:55 PM, Andrey Grodzovsky wrote:
You can read on their side here - https://www.phoronix.com/scan.php?page=news_item&px=AMD-STB-Linux-5.17 and see their patch. THey don't have as clean interface as we do to retrieve the buffer and currently it's hard-coded for debugfs dump but it looks like pretty straight forward to expose their buffer to external
client like amdgpu.

Noted, thanks for the pointer.
- Shashank

Andrey

On 2022-03-08 11:46, Sharma, Shashank wrote:
I have a very limited understanding of PMC driver and its interfaces, so I would just go ahead and rely on Andrey's judgement/recommendation on this :)

- Shashank

On 3/8/2022 5:39 PM, Andrey Grodzovsky wrote:
As long as PMC driver provides clear interface to retrieve the info there should be no issue to call either amdgpu interface or PMC interface using IS_APU (or something alike in the code)
We probably should add a wrapper function around this logic in amdgpu.

Andrey

On 2022-03-08 11:36, Lazar, Lijo wrote:

[AMD Official Use Only]


+Mario

I guess that means the functionality needs to be present in amdgpu for APUs also. Presently, this is taken care by PMC driver for APUs.

Thanks,
Lijo
------------------------------------------------------------------------ *From:* amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> on behalf of Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>
*Sent:* Tuesday, March 8, 2022 9:55:03 PM
*To:* Shashank Sharma <contactshashanksharma@xxxxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx> *Cc:* Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Somalapuram, Amaranath <Amaranath.Somalapuram@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>; Sharma, Shashank <Shashank.Sharma@xxxxxxx>
*Subject:* Re: [PATCH 1/2] drm: Add GPU reset sysfs event

On 2022-03-07 11:26, Shashank Sharma wrote:
> From: Shashank Sharma <shashank.sharma@xxxxxxx>
>
> This patch adds a new sysfs event, which will indicate
> the userland about a GPU reset, and can also provide
> some information like:
> - which PID was involved in the GPU reset
> - what was the GPU status (using flags)
>
> This patch also introduces the first flag of the flags
> bitmap, which can be appended as and when required.


I am reminding again about another important piece of info which you can add
here and that is Smart Trace Buffer dump [1]. The buffer size is HW
specific but
from what I see there is no problem to just amend it as part of envp[]
initialization.
bellow.

The interface to get the buffer is smu_stb_collect_info and usage can be
seen from
frebugfs interface in smu_stb_debugfs_open

[1] - https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.spinics.net%2Flists%2Famd-gfx%2Fmsg70751.html&amp;data=04%7C01%7Clijo.lazar%40amd.com%7C80bc3f07e2d0441d44a108da012036dc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637823535167679490%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=53l7KlTf%2BICKkZkLVwFh6nRTjkAh%2FDpOat5DRoyKIx0%3D&amp;reserved=0 <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.spinics.net%2Flists%2Famd-gfx%2Fmsg70751.html&amp;data=04%7C01%7Clijo.lazar%40amd.com%7C80bc3f07e2d0441d44a108da012036dc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637823535167679490%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=53l7KlTf%2BICKkZkLVwFh6nRTjkAh%2FDpOat5DRoyKIx0%3D&amp;reserved=0>

Andrey


>
> Cc: Alexandar Deucher <alexander.deucher@xxxxxxx>
> Cc: Christian Koenig <christian.koenig@xxxxxxx>
> Signed-off-by: Shashank Sharma <shashank.sharma@xxxxxxx>
> ---
>   drivers/gpu/drm/drm_sysfs.c | 24 ++++++++++++++++++++++++
>   include/drm/drm_sysfs.h     |  3 +++
>   2 files changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
> index 430e00b16eec..52a015161431 100644
> --- a/drivers/gpu/drm/drm_sysfs.c
> +++ b/drivers/gpu/drm/drm_sysfs.c
> @@ -409,6 +409,30 @@ void drm_sysfs_hotplug_event(struct drm_device *dev)
>   }
>   EXPORT_SYMBOL(drm_sysfs_hotplug_event);
>
> +/**
> + * drm_sysfs_reset_event - generate a DRM uevent to indicate GPU reset
> + * @dev: DRM device
> + * @pid: The process ID involve with the reset
> + * @flags: Any other information about the GPU status
> + *
> + * Send a uevent for the DRM device specified by @dev. This indicates
> + * user that a GPU reset has occurred, so that the interested client
> + * can take any recovery or profiling measure, when required.
> + */
> +void drm_sysfs_reset_event(struct drm_device *dev, uint64_t pid, uint32_t flags)
> +{
> +     unsigned char pid_str[21], flags_str[15];
> +     unsigned char reset_str[] = "RESET=1";
> +     char *envp[] = { reset_str, pid_str, flags_str, NULL };
> +
> +     DRM_DEBUG("generating reset event\n");
> +
> +     snprintf(pid_str, ARRAY_SIZE(pid_str), "PID=%lu", pid);
> +     snprintf(flags_str, ARRAY_SIZE(flags_str), "FLAGS=%u", flags);
> + kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
> +}
> +EXPORT_SYMBOL(drm_sysfs_reset_event);
> +
>   /**
>    * drm_sysfs_connector_hotplug_event - generate a DRM uevent for any connector
>    * change
> diff --git a/include/drm/drm_sysfs.h b/include/drm/drm_sysfs.h
> index 6273cac44e47..63f00fe8054c 100644
> --- a/include/drm/drm_sysfs.h
> +++ b/include/drm/drm_sysfs.h
> @@ -2,6 +2,8 @@
>   #ifndef _DRM_SYSFS_H_
>   #define _DRM_SYSFS_H_
>
> +#define DRM_GPU_RESET_FLAG_VRAM_VALID (1 << 0)
> +
>   struct drm_device;
>   struct device;
>   struct drm_connector;
> @@ -11,6 +13,7 @@ int drm_class_device_register(struct device *dev);
>   void drm_class_device_unregister(struct device *dev);
>
>   void drm_sysfs_hotplug_event(struct drm_device *dev);
> +void drm_sysfs_reset_event(struct drm_device *dev, uint64_t pid, uint32_t reset_flags); >   void drm_sysfs_connector_hotplug_event(struct drm_connector *connector); >   void drm_sysfs_connector_status_event(struct drm_connector *connector,
>                                      struct drm_property *property);



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux