On Tue, Aug 12, 2014 at 6:00 AM, Christian König <deathsimple@xxxxxxxxxxx> wrote: > Am 11.08.2014 um 16:52 schrieb Alex Deucher: > >> On Mon, Aug 11, 2014 at 5:08 AM, Christian König >> <deathsimple@xxxxxxxxxxx> wrote: >>> >>> Am 07.08.2014 um 21:43 schrieb Alex Deucher: >>> >>>> On Thu, Aug 7, 2014 at 11:32 AM, Christian König >>>> <deathsimple@xxxxxxxxxxx> wrote: >>>>> >>>>> Am 07.08.2014 um 16:32 schrieb Alex Deucher: >>>>> >>>>>> On Thu, Aug 7, 2014 at 7:33 AM, Christian König >>>>>> <deathsimple@xxxxxxxxxxx> >>>>>> wrote: >>>>>>> >>>>>>> From: Marco A Benatto <marco.antonio.780@xxxxxxxxx> >>>>>>> >>>>>>> Adding a Frames Per Second estimation logic on UVD handles >>>>>>> when it has being used. This estimation is per handle basis >>>>>>> and will help on DPM profile calculation. >>>>>>> >>>>>>> v2 (chk): fix timestamp type, move functions around and >>>>>>> cleanup code a bit. >>>>>> >>>>>> Will this really help much? I thought the problem was mainly due to >>>>>> sclk and mclk for post processing. >>>>> >>>>> >>>>> It should at least handle the UVD side for upclocking when you get a >>>>> lot >>>>> of >>>>> streams / fps. And at on my NI the patch seems to do exactly that. >>>>> >>>>> Switching sclk and mclk for post processing is a different task, and I >>>>> actually have no idea what to do with them. >>>> >>>> At this point we always choose the plain UVD state anyway so this >>>> patch would only take effect if we re-enabled the dynamic UVD state >>>> selection. >>> >>> >>> Hui? I thought we already re-enabled the dynamic UVD state selection, but >>> double checking this I found it disabled again. >>> >>> What was the problem with that? Looks like I somehow missed the >>> discussion >>> around it. >> >> We did, but after doing so a number of people complained about a >> regression on IRC because when apps like xmbc enabled post processing, >> performance went down. > > > That's strange, from my experience the different UVD performance states only > affect UVDs dclk/vclk, not sclk/mclk. I need to get the DPM dumps to > confirms this. > The sclks and mclks are usually different as well, especially on APUs. I can send you some examples. > You not off hand remember who complained on IRC? Finding something in the > IRC logs is like searching for a needle in a haystack. I don't remember off hand. I think zgreg was involved in some of the discussions. Alex > > Thanks, > Christian. > > >> >> Alex >> >> >>> Christian. >>> >>> >>>> For the post processing, we probably need a hint we can >>>> pass to the driver in the CS ioctl to denote what state we need. >>>> Although if we did that, this could would largely be moot. That said, >>>> newer asics support dynamic UVD clocks so we really only need >>>> something like that for older asics and I guess VCE. >>>> >>>> Alex >>>> >>>>> Christian. >>>>> >>>>> >>>>>> Alex >>>>>> >>>>>>> Signed-off-by: Marco A Benatto <marco.antonio.780@xxxxxxxxx> >>>>>>> Signed-off-by: Christian König <christian.koenig@xxxxxxx> >>>>>>> --- >>>>>>> drivers/gpu/drm/radeon/radeon.h | 10 ++++++ >>>>>>> drivers/gpu/drm/radeon/radeon_uvd.c | 64 >>>>>>> +++++++++++++++++++++++++++++++++---- >>>>>>> 2 files changed, 68 insertions(+), 6 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/radeon/radeon.h >>>>>>> b/drivers/gpu/drm/radeon/radeon.h >>>>>>> index 9e1732e..e92f6cb 100644 >>>>>>> --- a/drivers/gpu/drm/radeon/radeon.h >>>>>>> +++ b/drivers/gpu/drm/radeon/radeon.h >>>>>>> @@ -1617,6 +1617,15 @@ int radeon_pm_get_type_index(struct >>>>>>> radeon_device >>>>>>> *rdev, >>>>>>> #define RADEON_UVD_STACK_SIZE (1024*1024) >>>>>>> #define RADEON_UVD_HEAP_SIZE (1024*1024) >>>>>>> >>>>>>> +#define RADEON_UVD_FPS_EVENTS_MAX 8 >>>>>>> +#define RADEON_UVD_DEFAULT_FPS 60 >>>>>>> + >>>>>>> +struct radeon_uvd_fps { >>>>>>> + uint64_t timestamp; >>>>>>> + uint8_t event_index; >>>>>>> + uint8_t events[RADEON_UVD_FPS_EVENTS_MAX]; >>>>>>> +}; >>>>>>> + >>>>>>> struct radeon_uvd { >>>>>>> struct radeon_bo *vcpu_bo; >>>>>>> void *cpu_addr; >>>>>>> @@ -1626,6 +1635,7 @@ struct radeon_uvd { >>>>>>> struct drm_file *filp[RADEON_MAX_UVD_HANDLES]; >>>>>>> unsigned img_size[RADEON_MAX_UVD_HANDLES]; >>>>>>> struct delayed_work idle_work; >>>>>>> + struct radeon_uvd_fps fps_info[RADEON_MAX_UVD_HANDLES]; >>>>>>> }; >>>>>>> >>>>>>> int radeon_uvd_init(struct radeon_device *rdev); >>>>>>> diff --git a/drivers/gpu/drm/radeon/radeon_uvd.c >>>>>>> b/drivers/gpu/drm/radeon/radeon_uvd.c >>>>>>> index 6bf55ec..ef5667a 100644 >>>>>>> --- a/drivers/gpu/drm/radeon/radeon_uvd.c >>>>>>> +++ b/drivers/gpu/drm/radeon/radeon_uvd.c >>>>>>> @@ -237,6 +237,51 @@ void radeon_uvd_force_into_uvd_segment(struct >>>>>>> radeon_bo *rbo) >>>>>>> rbo->placement.lpfn = (256 * 1024 * 1024) >> PAGE_SHIFT; >>>>>>> } >>>>>>> >>>>>>> +static void radeon_uvd_fps_clear_events(struct radeon_device *rdev, >>>>>>> int >>>>>>> idx) >>>>>>> +{ >>>>>>> + struct radeon_uvd_fps *fps = &rdev->uvd.fps_info[idx]; >>>>>>> + unsigned i; >>>>>>> + >>>>>>> + fps->timestamp = jiffies_64; >>>>>>> + fps->event_index = 0; >>>>>>> + for (i = 0; i < RADEON_UVD_FPS_EVENTS_MAX; i++) >>>>>>> + fps->events[i] = 0; >>>>>>> +} >>>>>>> + >>>>>>> +static void radeon_uvd_fps_note_event(struct radeon_device *rdev, >>>>>>> int >>>>>>> idx) >>>>>>> +{ >>>>>>> + struct radeon_uvd_fps *fps = &rdev->uvd.fps_info[idx]; >>>>>>> + uint64_t timestamp = jiffies_64; >>>>>>> + unsigned rate = 0; >>>>>>> + >>>>>>> + uint8_t index = fps->event_index++; >>>>>>> + fps->event_index %= RADEON_UVD_FPS_EVENTS_MAX; >>>>>>> + >>>>>>> + rate = div64_u64(HZ, max(timestamp - fps->timestamp, 1ULL)); >>>>>>> + >>>>>>> + fps->timestamp = timestamp; >>>>>>> + fps->events[index] = min(rate, 120u); >>>>>>> +} >>>>>>> + >>>>>>> +static unsigned radeon_uvd_estimate_fps(struct radeon_device *rdev, >>>>>>> int >>>>>>> idx) >>>>>>> +{ >>>>>>> + struct radeon_uvd_fps *fps = &rdev->uvd.fps_info[idx]; >>>>>>> + unsigned i, valid = 0, count = 0; >>>>>>> + >>>>>>> + for (i = 0; i < RADEON_UVD_FPS_EVENTS_MAX; i++) { >>>>>>> + /* We should ignore zero values */ >>>>>>> + if (fps->events[i] != 0) { >>>>>>> + count += fps->events[i]; >>>>>>> + valid++; >>>>>>> + } >>>>>>> + } >>>>>>> + >>>>>>> + if (valid > 0) >>>>>>> + return count / valid; >>>>>>> + else >>>>>>> + return RADEON_UVD_DEFAULT_FPS; >>>>>>> +} >>>>>>> + >>>>>>> void radeon_uvd_free_handles(struct radeon_device *rdev, struct >>>>>>> drm_file *filp) >>>>>>> { >>>>>>> int i, r; >>>>>>> @@ -419,8 +464,10 @@ static int radeon_uvd_cs_msg(struct >>>>>>> radeon_cs_parser >>>>>>> *p, struct radeon_bo *bo, >>>>>>> >>>>>>> /* create or decode, validate the handle */ >>>>>>> for (i = 0; i < RADEON_MAX_UVD_HANDLES; ++i) { >>>>>>> - if (atomic_read(&p->rdev->uvd.handles[i]) == handle) >>>>>>> + if (atomic_read(&p->rdev->uvd.handles[i]) == handle) >>>>>>> { >>>>>>> + radeon_uvd_fps_note_event(p->rdev, i); >>>>>>> return 0; >>>>>>> + } >>>>>>> } >>>>>>> >>>>>>> /* handle not found try to alloc a new one */ >>>>>>> @@ -428,6 +475,7 @@ static int radeon_uvd_cs_msg(struct >>>>>>> radeon_cs_parser >>>>>>> *p, struct radeon_bo *bo, >>>>>>> if (!atomic_cmpxchg(&p->rdev->uvd.handles[i], 0, >>>>>>> handle)) { >>>>>>> p->rdev->uvd.filp[i] = p->filp; >>>>>>> p->rdev->uvd.img_size[i] = img_size; >>>>>>> + radeon_uvd_fps_clear_events(p->rdev, i); >>>>>>> return 0; >>>>>>> } >>>>>>> } >>>>>>> @@ -763,7 +811,7 @@ int radeon_uvd_get_destroy_msg(struct >>>>>>> radeon_device >>>>>>> *rdev, int ring, >>>>>>> static void radeon_uvd_count_handles(struct radeon_device *rdev, >>>>>>> unsigned *sd, unsigned *hd) >>>>>>> { >>>>>>> - unsigned i; >>>>>>> + unsigned i, fps_rate = 0; >>>>>>> >>>>>>> *sd = 0; >>>>>>> *hd = 0; >>>>>>> @@ -772,10 +820,13 @@ static void radeon_uvd_count_handles(struct >>>>>>> radeon_device *rdev, >>>>>>> if (!atomic_read(&rdev->uvd.handles[i])) >>>>>>> continue; >>>>>>> >>>>>>> - if (rdev->uvd.img_size[i] >= 720*576) >>>>>>> - ++(*hd); >>>>>>> - else >>>>>>> - ++(*sd); >>>>>>> + fps_rate = radeon_uvd_estimate_fps(rdev, i); >>>>>>> + >>>>>>> + if (rdev->uvd.img_size[i] >= 720*576) { >>>>>>> + (*hd) += fps_rate > 30 ? 1 : 2; >>>>>>> + } else { >>>>>>> + (*sd) += fps_rate > 30 ? 1 : 2; >>>>>>> + } >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> @@ -805,6 +856,7 @@ void radeon_uvd_note_usage(struct radeon_device >>>>>>> *rdev) >>>>>>> set_clocks &= schedule_delayed_work(&rdev->uvd.idle_work, >>>>>>> >>>>>>> msecs_to_jiffies(UVD_IDLE_TIMEOUT_MS)); >>>>>>> >>>>>>> + >>>>>>> if ((rdev->pm.pm_method == PM_METHOD_DPM) && >>>>>>> rdev->pm.dpm_enabled) { >>>>>>> unsigned hd = 0, sd = 0; >>>>>>> radeon_uvd_count_handles(rdev, &sd, &hd); >>>>>>> -- >>>>>>> 1.9.1 >>>>>>> > _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel