Looking at radeon_uvd source code and AFAIK it's currently disable
since commit dca5086a90c9ec64f4e0de801a659508202b0640What do you think?
On Fri, Aug 15, 2014 at 12:32 PM, Grigori Goronzy <greg@xxxxxxxxxxxx> wrote:
On 15.08.2014 17:26, Alex Deucher wrote:Is it? In 3.17-wip, dynamic UVD state selection (according to active
> On Fri, Aug 15, 2014 at 11:20 AM, Grigori Goronzy <greg@xxxxxxxxxxxx> wrote:
>> On 15.08.2014 16:11, Christian König wrote:
>>> Hi Marco,
>>>
>>> the problem with an CS ioctl flag is that we sometimes don't know how
>>> much SCLK/MCLK boost is needed, for example when we do post processing
>>> in the player using OpenGL and UVD decoding with VDPAU. In this case
>>> VDPAU don't has the slightest idea how high SCLK/MCLK must be and so
>>> can't give that info to the kernel either.
>>>
>>
>> Maybe it's an acceptable workaround to simply disable dynamic UVD state
>> selection in case the UVD states only have a single power level. That
>> will avoid the performance issues on affected systems, while still
>> allowing dynamic UVD states on systems that have a saner DPM table
>> setup. I think it is mosly older systems that suffer from this.
>>
>
> That is exactly what we do now.
>
streams) is still completely disabled. It will always use the generic
UVD state. In fact wasn't it reverted again because of the performance
issues on some systems?
Grigori
> Alex
>
>
>> Best regards
>> Grigori
>>
>>> Regards,
>>> Christian.
>>>
>>> Am 15.08.2014 um 15:21 schrieb Marco Benatto:
>>>> Hey all,
>>>>
>>>> I also had a talk with Alex yesterday about post-processing issues
>>>> when using dynamic UVD profiles and a chamge on CS ioctl
>>>> including a flag to let user mode driver tell to the kernel which
>>>> performance requirement it wants for post processing. A commom
>>>> point for both discussion is to stablish the default values for these
>>>> profiles, but probably this ioctl change would be more impacting/complex
>>>> to implement than a sysfs entry.
>>>>
>>>> If a sysfs entry is anough for now I can handle the code to create it
>>>> and, with your help, the code to setup the UVD profile requested
>>>> through it.
>>>>
>>>> Is there any suggestion?
>>>>
>>>> Thanks all for your help,
>>>>
>>>>
>>>> On Fri, Aug 15, 2014 at 5:48 AM, Christian König
>>>> <christian.koenig@xxxxxxx <mailto:christian.koenig@xxxxxxx>> wrote:
>>>>
>>>> Hi guys,
>>>>
>>>> to make a long story short every time I watch a movie my laptop
>>>> start to heat up because we always select the standard UVD power
>>>> profile without actually measuring if that is necessary.
>>>>
>>>> Marco came up with a patch that seems to reliable measure the fps
>>>> send down to the kernel and so together with knowing the frame
>>>> size of the video should allow us to select the right UVD power
>>>> profile.
>>>>
>>>> The problem is that Alex (unnoticed by me) completely disabled
>>>> selecting the UVD profiles because of some issues with advanced
>>>> post processing discussed on IRC. The problem seems to be that the
>>>> lower UVD profiles have a to low SCLK/MCLK to handle the 3D load
>>>> that comes with scaling, deinterlacing etc...
>>>>
>>>> I unfortunately don't have time for it, cause this only affects
>>>> the hardware generations R600-SI and not the newest one CIK. So
>>>> could you guys stick together and come up with a solution?
>>>> Something like a sysfs entry that let's us select the minimum UVD
>>>> power level allowed?
>>>>
>>>> I think Marco is happy to come up with a patch, we just need to
>>>> know what's really needed and what should be the default values.
>>>> I'm happy to review everything that comes out of it, just don't
>>>> have time to do it myself.
>>>>
>>>> Happy discussion and thanks in advance,
>>>> Christian.
>>>>
>>>> Am 12.08.2014 um 15:05 schrieb Alex Deucher:
>>>>
>>>> On Tue, Aug 12, 2014 at 6:00 AM, Christian König
>>>> <deathsimple@xxxxxxxxxxx <mailto:deathsimple@xxxxxxxxxxx>> wrote:
>>>>
>>>> Am 11.08.2014 um 16:52 schrieb Alex Deucher:
>>>>
>>>> On Mon, Aug 11, 2014 at 5:08 AM, Christian König
>>>> <deathsimple@xxxxxxxxxxx
>>>> <mailto:deathsimple@xxxxxxxxxxx>> wrote:
>>>>
>>>> Am 07.08.2014 um 21:43 schrieb Alex Deucher:
>>>>
>>>> On Thu, Aug 7, 2014 at 11:32 AM, Christian König
>>>> <deathsimple@xxxxxxxxxxx
>>>> <mailto:deathsimple@xxxxxxxxxxx>> wrote:
>>>>
>>>> Am 07.08.2014 um 16:32 schrieb Alex Deucher:
>>>>
>>>> On Thu, Aug 7, 2014 at 7:33 AM,
>>>> Christian König
>>>> <deathsimple@xxxxxxxxxxx
>>>> <mailto:deathsimple@xxxxxxxxxxx>>
>>>> wrote:
>>>>
>>>> From: Marco A Benatto
>>>> <marco.antonio.780@xxxxxxxxx
>>>> <mailto:marco.antonio.780@xxxxxxxxx>>
>>>>
>>>> Adding a Frames Per Second
>>>> estimation logic on UVD handles
>>>> when it has being used. This
>>>> estimation is per handle basis
>>>> and will help on DPM profile
>>>> calculation.
>>>>
>>>> v2 (chk): fix timestamp type, move
>>>> functions around and
>>>> cleanup code a bit.
>>>>
>>>> Will this really help much? I thought
>>>> the problem was mainly due to
>>>> sclk and mclk for post processing.
>>>>
>>>>
>>>> It should at least handle the UVD side for
>>>> upclocking when you get a
>>>> lot
>>>> of
>>>> streams / fps. And at on my NI the patch
>>>> seems to do exactly that.
>>>>
>>>> Switching sclk and mclk for post
>>>> processing is a different task, and I
>>>> actually have no idea what to do with them.
>>>>
>>>> At this point we always choose the plain UVD
>>>> state anyway so this
>>>> patch would only take effect if we re-enabled
>>>> the dynamic UVD state
>>>> selection.
>>>>
>>>>
>>>> Hui? I thought we already re-enabled the dynamic
>>>> UVD state selection, but
>>>> double checking this I found it disabled again.
>>>>
>>>> What was the problem with that? Looks like I
>>>> somehow missed the
>>>> discussion
>>>> around it.
>>>>
>>>> We did, but after doing so a number of people
>>>> complained about a
>>>> regression on IRC because when apps like xmbc enabled
>>>> post processing,
>>>> performance went down.
>>>>
>>>>
>>>> That's strange, from my experience the different UVD
>>>> performance states only
>>>> affect UVDs dclk/vclk, not sclk/mclk. I need to get the
>>>> DPM dumps to
>>>> confirms this.
>>>>
>>>> The sclks and mclks are usually different as well, especially
>>>> on APUs.
>>>> I can send you some examples.
>>>>
>>>> You not off hand remember who complained on IRC? Finding
>>>> something in the
>>>> IRC logs is like searching for a needle in a haystack.
>>>>
>>>> I don't remember off hand. I think zgreg was involved in some
>>>> of the
>>>> discussions.
>>>>
>>>> Alex
>>>>
>>>> Thanks,
>>>> Christian.
>>>>
>>>>
>>>> Alex
>>>>
>>>>
>>>> Christian.
>>>>
>>>>
>>>> For the post processing, we probably need a
>>>> hint we can
>>>> pass to the driver in the CS ioctl to denote
>>>> what state we need.
>>>> Although if we did that, this could would
>>>> largely be moot. That said,
>>>> newer asics support dynamic UVD clocks so we
>>>> really only need
>>>> something like that for older asics and I
>>>> guess VCE.
>>>>
>>>> Alex
>>>>
>>>> Christian.
>>>>
>>>>
>>>> Alex
>>>>
>>>> Signed-off-by: Marco A Benatto
>>>> <marco.antonio.780@xxxxxxxxx
>>>> <mailto:marco.antonio.780@xxxxxxxxx>>
>>>> Signed-off-by: Christian König
>>>> <christian.koenig@xxxxxxx
>>>> <mailto:christian.koenig@xxxxxxx>>
>>>> ---
>>>>
>>>> drivers/gpu/drm/radeon/radeon.h
>>>> | 10 ++++++
>>>>
>>>> drivers/gpu/drm/radeon/radeon_uvd.c
>>>> | 64
>>>> +++++++++++++++++++++++++++++++++----
>>>> 2 files changed, 68
>>>> insertions(+), 6 deletions(-)
>>>>
>>>> diff --git
>>>> a/drivers/gpu/drm/radeon/radeon.h
>>>> b/drivers/gpu/drm/radeon/radeon.h
>>>> index 9e1732e..e92f6cb 100644
>>>> --- a/drivers/gpu/drm/radeon/radeon.h
>>>> +++ b/drivers/gpu/drm/radeon/radeon.h
>>>> @@ -1617,6 +1617,15 @@ int
>>>> radeon_pm_get_type_index(struct
>>>> radeon_device
>>>> *rdev,
>>>> #define
>>>> RADEON_UVD_STACK_SIZE (1024*1024)
>>>> #define RADEON_UVD_HEAP_SIZE
>>>> (1024*1024)
>>>>
>>>> +#define RADEON_UVD_FPS_EVENTS_MAX 8
>>>> +#define RADEON_UVD_DEFAULT_FPS 60
>>>> +
>>>> +struct radeon_uvd_fps {
>>>> + uint64_t timestamp;
>>>> + uint8_t event_index;
>>>> + uint8_t
>>>> events[RADEON_UVD_FPS_EVENTS_MAX];
>>>> +};
>>>> +
>>>> struct radeon_uvd {
>>>> struct radeon_bo
>>>> *vcpu_bo;
>>>> void
>>>> *cpu_addr;
>>>> @@ -1626,6 +1635,7 @@ struct
>>>> radeon_uvd {
>>>> struct drm_file
>>>> *filp[RADEON_MAX_UVD_HANDLES];
>>>> unsigned
>>>> img_size[RADEON_MAX_UVD_HANDLES];
>>>> struct delayed_work
>>>> idle_work;
>>>> + struct radeon_uvd_fps
>>>> fps_info[RADEON_MAX_UVD_HANDLES];
>>>> };
>>>>
>>>> int radeon_uvd_init(struct
>>>> radeon_device *rdev);
>>>> diff --git
>>>> a/drivers/gpu/drm/radeon/radeon_uvd.c
>>>> b/drivers/gpu/drm/radeon/radeon_uvd.c
>>>> index 6bf55ec..ef5667a 100644
>>>> ---
>>>> a/drivers/gpu/drm/radeon/radeon_uvd.c
>>>> +++
>>>> b/drivers/gpu/drm/radeon/radeon_uvd.c
>>>> @@ -237,6 +237,51 @@ void
>>>> radeon_uvd_force_into_uvd_segment(struct
>>>> radeon_bo *rbo)
>>>> rbo->placement.lpfn =
>>>> (256 * 1024 * 1024) >> PAGE_SHIFT;
>>>> }
>>>>
>>>> +static void
>>>> radeon_uvd_fps_clear_events(struct
>>>> radeon_device *rdev,
>>>> int
>>>> idx)
>>>> +{
>>>> + struct radeon_uvd_fps *fps
>>>> = &rdev->uvd.fps_info[idx];
>>>> + unsigned i;
>>>> +
>>>> + fps->timestamp = jiffies_64;
>>>> + fps->event_index = 0;
>>>> + for (i = 0; i <
>>>> RADEON_UVD_FPS_EVENTS_MAX; i++)
>>>> + fps->events[i] = 0;
>>>> +}
>>>> +
>>>> +static void
>>>> radeon_uvd_fps_note_event(struct
>>>> radeon_device *rdev,
>>>> int
>>>> idx)
>>>> +{
>>>> + struct radeon_uvd_fps *fps
>>>> = &rdev->uvd.fps_info[idx];
>>>> + uint64_t timestamp =
>>>> jiffies_64;
>>>> + unsigned rate = 0;
>>>> +
>>>> + uint8_t index =
>>>> fps->event_index++;
>>>> + fps->event_index %=
>>>> RADEON_UVD_FPS_EVENTS_MAX;
>>>> +
>>>> + rate = div64_u64(HZ,
>>>> max(timestamp - fps->timestamp,
>>>> 1ULL));
>>>> +
>>>> + fps->timestamp = timestamp;
>>>> + fps->events[index] =
>>>> min(rate, 120u);
>>>> +}
>>>> +
>>>> +static unsigned
>>>> radeon_uvd_estimate_fps(struct
>>>> radeon_device *rdev,
>>>> int
>>>> idx)
>>>> +{
>>>> + struct radeon_uvd_fps *fps
>>>> = &rdev->uvd.fps_info[idx];
>>>> + unsigned i, valid = 0,
>>>> count = 0;
>>>> +
>>>> + for (i = 0; i <
>>>> RADEON_UVD_FPS_EVENTS_MAX; i++) {
>>>> + /* We should
>>>> ignore zero values */
>>>> + if (fps->events[i]
>>>> != 0) {
>>>> + count +=
>>>> fps->events[i];
>>>> + valid++;
>>>> + }
>>>> + }
>>>> +
>>>> + if (valid > 0)
>>>> + return count / valid;
>>>> + else
>>>> + return
>>>> RADEON_UVD_DEFAULT_FPS;
>>>> +}
>>>> +
>>>> void
>>>> radeon_uvd_free_handles(struct
>>>> radeon_device *rdev, struct
>>>> drm_file *filp)
>>>> {
>>>> int i, r;
>>>> @@ -419,8 +464,10 @@ static int
>>>> radeon_uvd_cs_msg(struct
>>>> radeon_cs_parser
>>>> *p, struct radeon_bo *bo,
>>>>
>>>> /* create or decode,
>>>> validate the handle */
>>>> for (i = 0; i <
>>>> RADEON_MAX_UVD_HANDLES; ++i) {
>>>> - if
>>>> (atomic_read(&p->rdev->uvd.handles[i])
>>>> == handle)
>>>> + if
>>>> (atomic_read(&p->rdev->uvd.handles[i])
>>>> == handle)
>>>> {
>>>> +
>>>> radeon_uvd_fps_note_event(p->rdev, i);
>>>> return 0;
>>>> + }
>>>> }
>>>>
>>>> /* handle not found
>>>> try to alloc a new one */
>>>> @@ -428,6 +475,7 @@ static int
>>>> radeon_uvd_cs_msg(struct
>>>> radeon_cs_parser
>>>> *p, struct radeon_bo *bo,
>>>> if
>>>> (!atomic_cmpxchg(&p->rdev->uvd.handles[i],
>>>> 0,
>>>> handle)) {
>>>>
>>>> p->rdev->uvd.filp[i] = p->filp;
>>>>
>>>> p->rdev->uvd.img_size[i] = img_size;
>>>> +
>>>> radeon_uvd_fps_clear_events(p->rdev,
>>>> i);
>>>> return 0;
>>>> }
>>>> }
>>>> @@ -763,7 +811,7 @@ int
>>>> radeon_uvd_get_destroy_msg(struct
>>>> radeon_device
>>>> *rdev, int ring,
>>>> static void
>>>> radeon_uvd_count_handles(struct
>>>> radeon_device *rdev,
>>>>
>>>> unsigned *sd, unsigned *hd)
>>>> {
>>>> - unsigned i;
>>>> + unsigned i, fps_rate = 0;
>>>>
>>>> *sd = 0;
>>>> *hd = 0;
>>>> @@ -772,10 +820,13 @@ static void
>>>> radeon_uvd_count_handles(struct
>>>> radeon_device *rdev,
>>>> if
>>>> (!atomic_read(&rdev->uvd.handles[i]))
>>>> continue;
>>>>
>>>> - if
>>>> (rdev->uvd.img_size[i] >= 720*576)
>>>> - ++(*hd);
>>>> - else
>>>> - ++(*sd);
>>>> + fps_rate =
>>>> radeon_uvd_estimate_fps(rdev, i);
>>>> +
>>>> + if
>>>> (rdev->uvd.img_size[i] >= 720*576) {
>>>> + (*hd) +=
>>>> fps_rate > 30 ? 1 : 2;
>>>> + } else {
>>>> + (*sd) +=
>>>> fps_rate > 30 ? 1 : 2;
>>>> + }
>>>> }
>>>> }
>>>>
>>>> @@ -805,6 +856,7 @@ void
>>>> radeon_uvd_note_usage(struct
>>>> radeon_device
>>>> *rdev)
>>>> set_clocks &=
>>>> schedule_delayed_work(&rdev->uvd.idle_work,
>>>>
>>>> msecs_to_jiffies(UVD_IDLE_TIMEOUT_MS));
>>>>
>>>> +
>>>> if
>>>> ((rdev->pm.pm_method ==
>>>> PM_METHOD_DPM) &&
>>>> rdev->pm.dpm_enabled) {
>>>> unsigned hd =
>>>> 0, sd = 0;
>>>>
>>>> radeon_uvd_count_handles(rdev,
>>>> &sd, &hd);
>>>> --
>>>> 1.9.1
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Marco Antonio Benatto
>>>> Linux user ID:#506236
>>>
>>
>>
--
Marco Antonio Benatto
Linux user ID: #506236
_______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel