Am 08.09.21 um 11:29 schrieb Lazar, Lijo:
On 9/8/2021 2:32 PM, Yu, Lang wrote:
[AMD Official Use Only]
-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
Sent: Wednesday, September 8, 2021 4:55 PM
To: Yu, Lang <Lang.Yu@xxxxxxx>; Christian König
<ckoenig.leichtzumerken@xxxxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Huang, Ray
<Ray.Huang@xxxxxxx>; Tian Tao <tiantao6@xxxxxxxxxxxxx>
Subject: Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings
On 9/8/2021 1:14 PM, Yu, Lang wrote:
[AMD Official Use Only]
-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
Sent: Wednesday, September 8, 2021 3:36 PM
To: Christian König <ckoenig.leichtzumerken@xxxxxxxxx>; Yu, Lang
<Lang.Yu@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Huang, Ray
<Ray.Huang@xxxxxxx>; Tian Tao <tiantao6@xxxxxxxxxxxxx>
Subject: Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at
warnings
On 9/8/2021 12:07 PM, Christian König wrote:
Am 08.09.21 um 07:56 schrieb Lang Yu:
sysfs_emit and sysfs_emit_at requrie a page boundary aligned buf
address. Make them happy!
Warning Log:
[ 492.545174] invalid sysfs_emit_at: buf:00000000f19bdfde at:0 [
492.546416] WARNING: CPU: 7 PID: 1304 at fs/sysfs/file.c:765
sysfs_emit_at+0x4a/0xa0
[ 492.654805] Call Trace:
[ 492.655353] ? smu_cmn_get_metrics_table+0x40/0x50 [amdgpu] [
492.656780] vangogh_print_clk_levels+0x369/0x410 [amdgpu] [
492.658245] vangogh_common_print_clk_levels+0x77/0x80 [amdgpu] [
492.659733] ? preempt_schedule_common+0x18/0x30 [ 492.660713]
smu_print_ppclk_levels+0x65/0x90 [amdgpu] [ 492.662107]
amdgpu_get_pp_od_clk_voltage+0x13d/0x190 [amdgpu] [ 492.663620]
dev_attr_show+0x1d/0x40
Mhm, that at least partially doesn't looks like the right
approach to me.
Why do we have string printing and sysfs code in the hardware
version specific backend in the first place?
This is a callback meant for printing ASIC specific information to
sysfs node. The buffer passed in sysfs read is passed as it is to
the callback API.
That stuff needs to be implemented for each hardware generation and
is now cluttered with sysfs buffer offset calculations.
Looks like the warning happened because of this usage.
size = amdgpu_dpm_print_clock_levels(adev,
OD_SCLK, buf);
size += amdgpu_dpm_print_clock_levels(adev,
OD_MCLK,
buf+size);
size += amdgpu_dpm_print_clock_levels(adev,
OD_VDDC_CURVE, buf+size);
size += amdgpu_dpm_print_clock_levels(adev,
OD_VDDGFX_OFFSET, buf+size);
size += amdgpu_dpm_print_clock_levels(adev,
OD_RANGE,
buf+size);
size += amdgpu_dpm_print_clock_levels(adev,
OD_CCLK,
buf+size);
[Yu, Lang]
Yes. So it is fine we just fix the caller
amdgpu_get_pp_od_clk_voltage like
following:
static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
struct device_attribute *attr,
char *buf)
{
struct drm_device *ddev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(ddev);
ssize_t size, offset;
int ret, i;
char temp_buf[512];
char clock_type[] = {OD_SCLK, OD_MCLK, OD_VDDC_CURVE,
OD_VDDGFX_OFFSET, OD_RANGE, OD_CCLK};
if (amdgpu_in_reset(adev))
return -EPERM;
if (adev->in_suspend && !adev->in_runpm)
return -EPERM;
ret = pm_runtime_get_sync(ddev->dev);
if (ret < 0) {
pm_runtime_put_autosuspend(ddev->dev);
return ret;
}
offset = 0;
if (adev->powerplay.pp_funcs->print_clock_levels) {
for (i = 0; i < ARRAY_SIZE(clock_type); i++) {
size = amdgpu_dpm_print_clock_levels(adev,
clock_type[i], buf);
if (offset + size > PAGE_SIZE)
break;
memcpy(temp_buf + offset, buf, size);
offset += size;
}
memcpy(buf, temp_buf, offset);
size = offset;
} else {
size = sysfs_emit(buf, "\n");
}
pm_runtime_mark_last_busy(ddev->dev);
pm_runtime_put_autosuspend(ddev->dev);
return size;
}
Prefer to avoid any extra stack or heap usage for buffer. Maybe
another arg to
pass offset along with buf?
[Yu, Lang]
Actually, the buf address contains the offset(offset_in_page(buf)) .
Though it's not a problem based on codeflow, static analysis tools
might complain.
Or we just rollback to sprintf/snprintf.
snprintf with (PAGE_SIZE-size) may be simpler. I think Darren took the
effort to convert these, he may have some other ideas.
This is not what I meant. See from the design point of view the
print_clock_levels() callback is the bad idea to begin with.
What we should have instead is a callback which returns the clock as a
value which is then printed in the amdgpu_get_pp_od_clk_voltage() function.
This avoids passing around the buffer and remaining size everywhere and
also guarantees that the sysfs have unified printing over all hardware
generations.
Regards,
Christian.