RE: [PATCH] drm/amdgpu: support gpu recovery tests on compute rings


 



How about amdgpu.lockup_timeout=non-compute-jobs[, gfx, sdma, decode, encode][: compute-jobs] ?

This will not break backward compatibility.
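A rough sketch of how this syntax might be parsed, assuming the first value keeps today's meaning, the optional comma-separated values map to gfx, sdma, decode and encode in that order, and the optional value after the colon applies to compute jobs. The function name and the `None` result for an unspecified compute timeout are illustrative assumptions, not driver behavior.

```python
# Hypothetical parser for the format proposed above; not amdgpu driver code.
NON_COMPUTE_ENGINES = ("gfx", "sdma", "decode", "encode")


def parse_timeout_with_compute(value):
    """Split '<non-compute>[,<gfx>,<sdma>,<decode>,<encode>][:<compute>]'.

    Returns (per_engine, compute_ms). compute_ms is None when the
    ':<compute>' suffix is absent, because its default is not stated here.
    """
    head, sep, tail = value.partition(":")
    compute_ms = int(tail) if sep else None

    fields = [f.strip() for f in head.split(",")]
    if len(fields) > 1 + len(NON_COMPUTE_ENGINES):
        raise ValueError("too many non-compute fields")

    base_ms = int(fields[0])  # same meaning as today's single value
    per_engine = {engine: base_ms for engine in NON_COMPUTE_ENGINES}
    for engine, field in zip(NON_COMPUTE_ENGINES, fields[1:]):
        if field:  # an empty field inherits the base value
            per_engine[engine] = int(field)
    return per_engine, compute_ms
```

With this sketch, a plain `10000` still yields 10000 ms for every non-compute engine, which is the backward-compatible case.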

 

Also, I'm not sure how to map “decode” and “encode” to the uvd/vce/vcn rings, since there are many rings related to these IPs (uvd, uvd_enc, vce, vcn_dec, vcn_enc, vcn_jpeg).

Maybe we should use the IP names (uvd, vce or vcn) instead of “decode/encode”?

 

Regards,

Evan

From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Deucher, Alexander
Sent: April 26, 2019 22:24
To: Michel Dänzer <michel@xxxxxxxxxxx>; Quan, Evan <Evan.Quan@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>
Cc: Xu, Feifei <Feifei.Xu@xxxxxxx>; Cui, Flora <Flora.Cui@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Subject: Re: [PATCH] drm/amdgpu: support gpu recovery tests on compute rings

 

How about an interface to change the timeout on a per engine (gfx, compute, dma, etc.) basis?

amdgpu.lockup_timeout=<global>[,<gfx>,<compute>,<sdma>,<decode>,<encode>]

If only one parameter is given, we change the timeout globally. If more are given, the per-engine values override the global one. We could also add a sysfs interface to change it on the fly.
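A minimal sketch of how such a value could be interpreted, assuming the comma-separated order shown above and assuming that an omitted or empty per-engine field falls back to the global value. The function name `parse_lockup_timeout` and the engine keys are illustrative only and are not taken from the driver.

```python
# Hypothetical parser for the proposed format; not amdgpu driver code.
ENGINES = ("gfx", "compute", "sdma", "decode", "encode")


def parse_lockup_timeout(value):
    """Return a dict mapping each engine to its timeout in ms.

    Assumed format: <global>[,<gfx>,<compute>,<sdma>,<decode>,<encode>]
    A missing or empty per-engine field inherits the global value.
    """
    fields = [f.strip() for f in value.split(",")]
    if len(fields) > 1 + len(ENGINES):
        raise ValueError("too many fields in lockup_timeout")
    if not fields[0]:
        raise ValueError("the global timeout is required")

    global_ms = int(fields[0])
    timeouts = {engine: global_ms for engine in ENGINES}
    for engine, field in zip(ENGINES, fields[1:]):
        if field:
            timeouts[engine] = int(field)
    return timeouts
```

Under these assumptions, `parse_lockup_timeout("10000")` applies 10000 ms to every engine, while `parse_lockup_timeout("10000,5000,60000")` overrides only gfx and compute.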

 

Alex


From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> on behalf of Michel Dänzer <michel@xxxxxxxxxxx>
Sent: Friday, April 26, 2019 4:35 AM
To: Quan, Evan; Koenig, Christian
Cc: Xu, Feifei; Cui, Flora; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Subject: Re: [PATCH] drm/amdgpu: support gpu recovery tests on compute rings

 

On 2019-04-26 10:20 a.m., Quan, Evan wrote:
> My concern is there is already one module parameter "lockup_timeout".
> parm:           lockup_timeout:GPU lockup timeout in ms > 0 (default 10000) (int)
>
> Adding one more "timeout" seems redundant.
> And then the description of "lockup_timeout" (which reads as if it applies to all jobs) would not match its real effect (it affects only non-compute jobs).
>
> A better way is to rename "lockup_timeout" to "non-compute lockup_timeout". But I do not think we can change existing module parameter. Right?

Right. Also, there are already too many amdgpu module parameters, we
should try to remove some rather than adding new ones for every little
thing that could be tweaked. :)

One possibility might be to optionally allow passing multiple values to
lockup_timeout, e.g.

 amdgpu.lockup_timeout=10000,0

The first value would need to have the same meaning as now for backwards
compatibility.


--
Earthling Michel Dänzer               |              https://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
