Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Well that's complete nonsense, in that case we should rather go back to the s*printf functionality.

The problem is that we are not using the sysfs_emit/sysfs_emit_at as intended, e.g. something like this:

offs = sysfs_emit(page, "first...);
offs += sysfs_emit_at(page, offs, "second....);
offs += sysfs_emit_at(page, offs, "third....);
...

This is usually done inside the same function and not spread around in the driver.

Regards,
Christian.

Am 08.09.21 um 12:55 schrieb Yu, Lang:
[AMD Official Use Only]

Or try to send a patch to remove page boundary aligned limitation. Any considerations? Thanks.

int sysfs_emit(char *buf, const char *fmt, ...)
  {
         va_list args;
-       int len;
+       int len, offset;

-       if (WARN(!buf || offset_in_page(buf),
+       offset = offset_in_page(buf);
+
+       if (WARN(!buf,
                  "invalid sysfs_emit: buf:%p\n", buf))
                 return 0;

         va_start(args, fmt);
-       len = vscnprintf(buf, PAGE_SIZE, fmt, args);
+       len = vscnprintf(buf, PAGE_SIZE - offset, fmt, args);
         va_end(args);

         return len;
@@ -760,14 +762,16 @@ EXPORT_SYMBOL_GPL(sysfs_emit);
  int sysfs_emit_at(char *buf, int at, const char *fmt, ...)
  {
         va_list args;
-       int len;
+       int len, offset;
+
+       offset = offset_in_page(buf);

-       if (WARN(!buf || offset_in_page(buf) || at < 0 || at >= PAGE_SIZE,
+       if (WARN(!buf || at < 0 || at + offset >= PAGE_SIZE,
                  "invalid sysfs_emit_at: buf:%p at:%d\n", buf, at))
                 return 0;

         va_start(args, fmt);
-       len = vscnprintf(buf + at, PAGE_SIZE - at, fmt, args);
+       len = vscnprintf(buf + at, PAGE_SIZE - at - offset, fmt, args);
         va_end(args);

         return len;

Regards,
Lang

-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
Sent: Wednesday, September 8, 2021 6:22 PM
To: Christian König <ckoenig.leichtzumerken@xxxxxxxxx>; Yu, Lang
<Lang.Yu@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Huang, Ray
<Ray.Huang@xxxxxxx>; Tian Tao <tiantao6@xxxxxxxxxxxxx>; Powell, Darren
<Darren.Powell@xxxxxxx>
Subject: Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings



On 9/8/2021 3:08 PM, Christian König wrote:
Am 08.09.21 um 11:29 schrieb Lazar, Lijo:

On 9/8/2021 2:32 PM, Yu, Lang wrote:
[AMD Official Use Only]



-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
Sent: Wednesday, September 8, 2021 4:55 PM
To: Yu, Lang <Lang.Yu@xxxxxxx>; Christian König
<ckoenig.leichtzumerken@xxxxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Huang, Ray
<Ray.Huang@xxxxxxx>; Tian Tao <tiantao6@xxxxxxxxxxxxx>
Subject: Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at
warnings



On 9/8/2021 1:14 PM, Yu, Lang wrote:
[AMD Official Use Only]



-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
Sent: Wednesday, September 8, 2021 3:36 PM
To: Christian König <ckoenig.leichtzumerken@xxxxxxxxx>; Yu, Lang
<Lang.Yu@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Huang, Ray
<Ray.Huang@xxxxxxx>; Tian Tao <tiantao6@xxxxxxxxxxxxx>
Subject: Re: [PATCH] drm/amdgpu: fix sysfs_emit/sysfs_emit_at
warnings



On 9/8/2021 12:07 PM, Christian König wrote:
Am 08.09.21 um 07:56 schrieb Lang Yu:
sysfs_emit and sysfs_emit_at requrie a page boundary aligned
buf address. Make them happy!

Warning Log:
[  492.545174] invalid sysfs_emit_at: buf:00000000f19bdfde at:0
[ 492.546416] WARNING: CPU: 7 PID: 1304 at fs/sysfs/file.c:765
sysfs_emit_at+0x4a/0xa0
[  492.654805] Call Trace:
[  492.655353]  ? smu_cmn_get_metrics_table+0x40/0x50 [amdgpu]
[ 492.656780]  vangogh_print_clk_levels+0x369/0x410 [amdgpu] [
492.658245]  vangogh_common_print_clk_levels+0x77/0x80 [amdgpu]
[ 492.659733]  ? preempt_schedule_common+0x18/0x30 [
492.660713]
smu_print_ppclk_levels+0x65/0x90 [amdgpu] [ 492.662107]
amdgpu_get_pp_od_clk_voltage+0x13d/0x190 [amdgpu] [ 492.663620]
dev_attr_show+0x1d/0x40
Mhm, that at least partially doesn't looks like the right
approach to me.

Why do we have string printing and sysfs code in the hardware
version specific backend in the first place?

This is a callback meant for printing ASIC specific information
to sysfs node. The buffer passed in sysfs read is passed as it is
to the callback API.

That stuff needs to be implemented for each hardware generation
and is now cluttered with sysfs buffer offset calculations.

Looks like the warning happened because of this usage.

                   size = amdgpu_dpm_print_clock_levels(adev,
OD_SCLK, buf);
                   size += amdgpu_dpm_print_clock_levels(adev,
OD_MCLK,
buf+size);
                   size += amdgpu_dpm_print_clock_levels(adev,
OD_VDDC_CURVE, buf+size);
                   size += amdgpu_dpm_print_clock_levels(adev,
OD_VDDGFX_OFFSET, buf+size);
                   size += amdgpu_dpm_print_clock_levels(adev,
OD_RANGE,
buf+size);
                   size += amdgpu_dpm_print_clock_levels(adev,
OD_CCLK,
buf+size);


[Yu, Lang]
Yes. So it is fine we just fix the caller
amdgpu_get_pp_od_clk_voltage like
following:
static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
         struct device_attribute *attr,
         char *buf)
{
     struct drm_device *ddev = dev_get_drvdata(dev);
     struct amdgpu_device *adev = drm_to_adev(ddev);
     ssize_t size, offset;
     int ret, i;
     char temp_buf[512];
     char clock_type[] = {OD_SCLK, OD_MCLK, OD_VDDC_CURVE,
                          OD_VDDGFX_OFFSET, OD_RANGE, OD_CCLK};

     if (amdgpu_in_reset(adev))
         return -EPERM;
     if (adev->in_suspend && !adev->in_runpm)
         return -EPERM;

     ret = pm_runtime_get_sync(ddev->dev);
     if (ret < 0) {
         pm_runtime_put_autosuspend(ddev->dev);
         return ret;
     }

     offset = 0;

     if (adev->powerplay.pp_funcs->print_clock_levels) {
         for (i = 0; i < ARRAY_SIZE(clock_type); i++) {
             size = amdgpu_dpm_print_clock_levels(adev,
clock_type[i], buf);
             if (offset + size > PAGE_SIZE)
                 break;
             memcpy(temp_buf + offset, buf, size);
             offset += size;
         }
         memcpy(buf, temp_buf, offset);
         size = offset;
     } else {
         size = sysfs_emit(buf, "\n");
     }
     pm_runtime_mark_last_busy(ddev->dev);
     pm_runtime_put_autosuspend(ddev->dev);

     return size;
}

Prefer to avoid any extra stack or heap usage for buffer. Maybe
another arg to pass offset along with buf?

[Yu, Lang]
Actually, the buf address contains the offset(offset_in_page(buf)) .
Though it's not a problem based on codeflow, static analysis tools
might complain.

Or we just rollback to sprintf/snprintf.

snprintf with (PAGE_SIZE-size) may be simpler. I think Darren took
the effort to convert these, he may have some other ideas.
This is not what I meant. See from the design point of view the
print_clock_levels() callback is the bad idea to begin with.

What we should have instead is a callback which returns the clock as a
value which is then printed in the amdgpu_get_pp_od_clk_voltage() function.

This avoids passing around the buffer and remaining size everywhere
and also guarantees that the sysfs have unified printing over all
hardware generations.

The scenario is one node used for multiple parameters - OD_SCLK, OD_CCLK,
OD_VDDGFX_OFFSET etc.(mostly to avoid cluttering sysfs with lots of nodes). On
top of it, the parameters supported (for ex: CCLK is not valid on dGPUs),  the
number of levels supported etc. vary across ASICs. There has to be multiple calls
or the call needs to return multiple values for a single parameter (for ex: up to 4,
8 or 16 levels of GFXCLK depending on ASIC).

I don't know the history of the callback, mostly it was considered more efficient
to print it directly rather than fetch and print. Alex/Evan may know the details.

Thanks,
Lijo

Regards,
Christian.





[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux