[AMD Official Use Only] > -----Original Message----- > From: Sharma, Shashank <Shashank.Sharma@xxxxxxx> > Sent: Tuesday, March 8, 2022 12:09 > To: Limonciello, Mario <Mario.Limonciello@xxxxxxx>; Grodzovsky, Andrey > <Andrey.Grodzovsky@xxxxxxx>; Lazar, Lijo <Lijo.Lazar@xxxxxxx>; Shashank > Sharma <contactshashanksharma@xxxxxxxxx>; amd- > gfx@xxxxxxxxxxxxxxxxxxxxx; S-k, Shyam-sundar <Shyam-sundar.S-k@xxxxxxx>; > Goswami, Sanket <Sanket.Goswami@xxxxxxx> > Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Somalapuram, > Amaranath <Amaranath.Somalapuram@xxxxxxx>; Koenig, Christian > <Christian.Koenig@xxxxxxx> > Subject: Re: [PATCH 1/2] drm: Add GPU reset sysfs event > > Hey Mario, > > On 3/8/2022 6:27 PM, Limonciello, Mario wrote: > > On 3/8/2022 10:57, Sharma, Shashank wrote: > >> > >> > >> On 3/8/2022 5:55 PM, Andrey Grodzovsky wrote: > >>> You can read on their side here - > >>> https://www.phoronix.com/scan.php?page=news_item&px=AMD-STB- > Linux-5.17 and > >>> see their patch. THey don't have as clean > >>> interface as we do to retrieve the buffer and currently it's > >>> hard-coded for debugfs dump but it looks like pretty straight forward > >>> to expose their buffer to external > >>> client like amdgpu. > > >> > >> Noted, thanks for the pointer. > >> - Shashank > > > > Yeah I think this is a great idea for APU, but need to keep in mind > > amd-pmc is only activated if APU is configured for s0i3. So in the > > IS_APU check you will need to test for set for s0i3 and return an error > > code if not set. > > > > For APU it's currently fetched on demand from debugfs. If you can > > "easily" export a symbol I say go for it and include a patch in your > > series. > > Yep, if the info is available via debugfs, I don't see a reason why > can't we fetch that directly from its lower layers. May I know the name > of this debugfs entry ? It's called "stb_read" from debugfs. > > - Shashank > > If not my suggestion is to stub this out and let Shyam and > > Sanket fill in the stub after they can sort out the exporting it to > > another driver. > > > >>> > >>> Andrey > >>> > >>> On 2022-03-08 11:46, Sharma, Shashank wrote: > >>>> I have a very limited understanding of PMC driver and its > >>>> interfaces, so I would just go ahead and rely on Andrey's > >>>> judgement/recommendation on this :) > >>>> > >>>> - Shashank > >>>> > >>>> On 3/8/2022 5:39 PM, Andrey Grodzovsky wrote: > >>>>> As long as PMC driver provides clear interface to retrieve the info > >>>>> there should be no issue to call either amdgpu interface or PMC > >>>>> interface using IS_APU (or something alike in the code) > >>>>> We probably should add a wrapper function around this logic in amdgpu. > >>>>> > >>>>> Andrey > >>>>> > >>>>> On 2022-03-08 11:36, Lazar, Lijo wrote: > >>>>>> > >>>>>> [AMD Official Use Only] > >>>>>> > >>>>>> > >>>>>> +Mario > >>>>>> > >>>>>> I guess that means the functionality needs to be present in amdgpu > >>>>>> for APUs also. Presently, this is taken care by PMC driver for APUs. > >>>>>> > >>>>>> Thanks, > >>>>>> Lijo > >>>>>> ------------------------------------------------------------------------ > >>>>>> > >>>>>> *From:* amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> on behalf > >>>>>> of Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx> > >>>>>> *Sent:* Tuesday, March 8, 2022 9:55:03 PM > >>>>>> *To:* Shashank Sharma <contactshashanksharma@xxxxxxxxx>; > >>>>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx> > >>>>>> *Cc:* Deucher, Alexander <Alexander.Deucher@xxxxxxx>; > Somalapuram, > >>>>>> Amaranath <Amaranath.Somalapuram@xxxxxxx>; Koenig, Christian > >>>>>> <Christian.Koenig@xxxxxxx>; Sharma, Shashank > >>>>>> <Shashank.Sharma@xxxxxxx> > >>>>>> *Subject:* Re: [PATCH 1/2] drm: Add GPU reset sysfs event > >>>>>> > >>>>>> On 2022-03-07 11:26, Shashank Sharma wrote: > >>>>>> > From: Shashank Sharma <shashank.sharma@xxxxxxx> > >>>>>> > > >>>>>> > This patch adds a new sysfs event, which will indicate > >>>>>> > the userland about a GPU reset, and can also provide > >>>>>> > some information like: > >>>>>> > - which PID was involved in the GPU reset > >>>>>> > - what was the GPU status (using flags) > >>>>>> > > >>>>>> > This patch also introduces the first flag of the flags > >>>>>> > bitmap, which can be appended as and when required. > >>>>>> > >>>>>> > >>>>>> I am reminding again about another important piece of info which > >>>>>> you can add > >>>>>> here and that is Smart Trace Buffer dump [1]. The buffer size is HW > >>>>>> specific but > >>>>>> from what I see there is no problem to just amend it as part of > >>>>>> envp[] > >>>>>> initialization. > >>>>>> bellow. > >>>>>> > >>>>>> The interface to get the buffer is smu_stb_collect_info and usage > >>>>>> can be > >>>>>> seen from > >>>>>> frebugfs interface in smu_stb_debugfs_open > >>>>>> > >>>>>> [1] - > >>>>>> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.sp > inics.net%2Flists%2Famd- > gfx%2Fmsg70751.html&data=04%7C01%7Clijo.lazar%40amd.com%7C80bc > 3f07e2d0441d44a108da012036dc%7C3dd8961fe4884e608e11a82d994e183d%7 > C0%7C0%7C637823535167679490%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC > 4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&a > mp;sdata=53l7KlTf%2BICKkZkLVwFh6nRTjkAh%2FDpOat5DRoyKIx0%3D&re > served=0 > >>>>>> > <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.s > pinics.net%2Flists%2Famd- > gfx%2Fmsg70751.html&data=04%7C01%7Clijo.lazar%40amd.com%7C80bc > 3f07e2d0441d44a108da012036dc%7C3dd8961fe4884e608e11a82d994e183d%7 > C0%7C0%7C637823535167679490%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC > 4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&a > mp;sdata=53l7KlTf%2BICKkZkLVwFh6nRTjkAh%2FDpOat5DRoyKIx0%3D&re > served=0> > >>>>>> > >>>>>> > >>>>>> Andrey > >>>>>> > >>>>>> > >>>>>> > > >>>>>> > Cc: Alexandar Deucher <alexander.deucher@xxxxxxx> > >>>>>> > Cc: Christian Koenig <christian.koenig@xxxxxxx> > >>>>>> > Signed-off-by: Shashank Sharma <shashank.sharma@xxxxxxx> > >>>>>> > --- > >>>>>> > drivers/gpu/drm/drm_sysfs.c | 24 ++++++++++++++++++++++++ > >>>>>> > include/drm/drm_sysfs.h | 3 +++ > >>>>>> > 2 files changed, 27 insertions(+) > >>>>>> > > >>>>>> > diff --git a/drivers/gpu/drm/drm_sysfs.c > >>>>>> b/drivers/gpu/drm/drm_sysfs.c > >>>>>> > index 430e00b16eec..52a015161431 100644 > >>>>>> > --- a/drivers/gpu/drm/drm_sysfs.c > >>>>>> > +++ b/drivers/gpu/drm/drm_sysfs.c > >>>>>> > @@ -409,6 +409,30 @@ void drm_sysfs_hotplug_event(struct > >>>>>> drm_device *dev) > >>>>>> > } > >>>>>> > EXPORT_SYMBOL(drm_sysfs_hotplug_event); > >>>>>> > > >>>>>> > +/** > >>>>>> > + * drm_sysfs_reset_event - generate a DRM uevent to indicate > >>>>>> GPU reset > >>>>>> > + * @dev: DRM device > >>>>>> > + * @pid: The process ID involve with the reset > >>>>>> > + * @flags: Any other information about the GPU status > >>>>>> > + * > >>>>>> > + * Send a uevent for the DRM device specified by @dev. This > >>>>>> indicates > >>>>>> > + * user that a GPU reset has occurred, so that the interested > >>>>>> client > >>>>>> > + * can take any recovery or profiling measure, when required. > >>>>>> > + */ > >>>>>> > +void drm_sysfs_reset_event(struct drm_device *dev, uint64_t > >>>>>> pid, uint32_t flags) > >>>>>> > +{ > >>>>>> > + unsigned char pid_str[21], flags_str[15]; > >>>>>> > + unsigned char reset_str[] = "RESET=1"; > >>>>>> > + char *envp[] = { reset_str, pid_str, flags_str, NULL }; > >>>>>> > + > >>>>>> > + DRM_DEBUG("generating reset event\n"); > >>>>>> > + > >>>>>> > + snprintf(pid_str, ARRAY_SIZE(pid_str), "PID=%lu", pid); > >>>>>> > + snprintf(flags_str, ARRAY_SIZE(flags_str), "FLAGS=%u", > >>>>>> flags); > >>>>>> > + kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, > envp); > >>>>>> > +} > >>>>>> > +EXPORT_SYMBOL(drm_sysfs_reset_event); > >>>>>> > + > >>>>>> > /** > >>>>>> > * drm_sysfs_connector_hotplug_event - generate a DRM uevent > >>>>>> for any connector > >>>>>> > * change > >>>>>> > diff --git a/include/drm/drm_sysfs.h b/include/drm/drm_sysfs.h > >>>>>> > index 6273cac44e47..63f00fe8054c 100644 > >>>>>> > --- a/include/drm/drm_sysfs.h > >>>>>> > +++ b/include/drm/drm_sysfs.h > >>>>>> > @@ -2,6 +2,8 @@ > >>>>>> > #ifndef _DRM_SYSFS_H_ > >>>>>> > #define _DRM_SYSFS_H_ > >>>>>> > > >>>>>> > +#define DRM_GPU_RESET_FLAG_VRAM_VALID (1 << 0) > >>>>>> > + > >>>>>> > struct drm_device; > >>>>>> > struct device; > >>>>>> > struct drm_connector; > >>>>>> > @@ -11,6 +13,7 @@ int drm_class_device_register(struct device > >>>>>> *dev); > >>>>>> > void drm_class_device_unregister(struct device *dev); > >>>>>> > > >>>>>> > void drm_sysfs_hotplug_event(struct drm_device *dev); > >>>>>> > +void drm_sysfs_reset_event(struct drm_device *dev, uint64_t > >>>>>> pid, uint32_t reset_flags); > >>>>>> > void drm_sysfs_connector_hotplug_event(struct drm_connector > >>>>>> *connector); > >>>>>> > void drm_sysfs_connector_status_event(struct drm_connector > >>>>>> *connector, > >>>>>> > struct drm_property > >>>>>> *property); > >