Re: [PATCH] drm/radeon: Adding UVD handle basis fps estimation v2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 15.08.2014 um 17:32 schrieb Grigori Goronzy:
On 15.08.2014 17:26, Alex Deucher wrote:
On Fri, Aug 15, 2014 at 11:20 AM, Grigori Goronzy <greg@xxxxxxxxxxxx> wrote:
On 15.08.2014 16:11, Christian König wrote:
Hi Marco,

the problem with an CS ioctl flag is that we sometimes don't know how
much SCLK/MCLK boost is needed, for example when we do post processing
in the player using OpenGL and UVD decoding with VDPAU. In this case
VDPAU don't has the slightest idea how high SCLK/MCLK must be and so
can't give that info to the kernel either.

Maybe it's an acceptable workaround to simply disable dynamic UVD state
selection in case the UVD states only have a single power level. That
will avoid the performance issues on affected systems, while still
allowing dynamic UVD states on systems that have a saner DPM table
setup. I think it is mosly older systems that suffer from this.

That is exactly what we do now.

Is it? In 3.17-wip, dynamic UVD state selection (according to active
streams) is still completely disabled. It will always use the generic
UVD state. In fact wasn't it reverted again because of the performance
issues on some systems?

This is the performance table of my laptop (at least the interesting parts), which I think is a typical example of the problem:

[    4.106772] == power state 1 ==
[    4.106774]     ui class: performance
[    4.106776]     internal class: none
[    4.106780]     uvd    vclk: 0 dclk: 0
[    4.106782]         power level 0    sclk: 20000 vddc_index: 2
[    4.106784]         power level 1    sclk: 50000 vddc_index: 2
[    4.106805] == power state 3 ==
[    4.106807]     ui class: none
[    4.106808]     internal class: uvd
[    4.106813]     uvd    vclk: 55000 dclk: 40000
[    4.106816]         power level 0    sclk: 50000 vddc_index: 2
[    4.106818]         power level 1    sclk: 50000 vddc_index: 2
[    4.106820]     status:
[    4.106822] == power state 4 ==
[    4.106823]     ui class: battery
[    4.106825]     internal class: uvd_hd
[    4.106831]     uvd    vclk: 40000 dclk: 30000
[    4.106833]         power level 0    sclk: 38000 vddc_index: 1
[    4.106835]         power level 1    sclk: 38000 vddc_index: 1
[    4.106839] == power state 5 ==
[    4.106841]     ui class: battery
[    4.106843]     internal class: uvd_sd
[    4.106848]     uvd    vclk: 40000 dclk: 30000
[    4.106850]         power level 0    sclk: 38000 vddc_index: 2
[    4.106853]         power level 1    sclk: 38000 vddc_index: 2

As you can see we currently always select the performance level uvd, which results in selecting the maximum sclk/dclk and vclk. Unfortunately neither uvd, uvd_sd nor uvd_hd allows the hardware to switch the sclk once selected (it's a hardware limitation of older uvd blocks).

So for all cases where this is interesting you actually always have only a single power level to choose from.

Christian.


Grigori

Alex


Best regards
Grigori

Regards,
Christian.

Am 15.08.2014 um 15:21 schrieb Marco Benatto:
Hey all,

I also had a talk with Alex yesterday about post-processing issues
when using dynamic UVD profiles and a chamge on CS ioctl
including a flag to let user mode driver tell to the kernel which
performance requirement it wants for post processing. A commom
point for both discussion is to stablish the default values for these
profiles, but probably this ioctl change would be more impacting/complex
to implement than a sysfs entry.

If a sysfs entry is anough for now I can handle the code to create it
and, with your help, the code to setup the UVD profile requested
through it.

Is there any suggestion?

Thanks all for your help,


On Fri, Aug 15, 2014 at 5:48 AM, Christian König
<christian.koenig@xxxxxxx <mailto:christian.koenig@xxxxxxx>> wrote:

     Hi guys,

     to make a long story short every time I watch a movie my laptop
     start to heat up because we always select the standard UVD power
     profile without actually measuring if that is necessary.

     Marco came up with a patch that seems to reliable measure the fps
     send down to the kernel and so together with knowing the frame
     size of the video should allow us to select the right UVD power
     profile.

     The problem is that Alex (unnoticed by me) completely disabled
     selecting the UVD profiles because of some issues with advanced
     post processing discussed on IRC. The problem seems to be that the
     lower UVD profiles have a to low SCLK/MCLK to handle the 3D load
     that comes with scaling, deinterlacing etc...

     I unfortunately don't have time for it, cause this only affects
     the hardware generations R600-SI and not the newest one CIK. So
     could you guys stick together and come up with a solution?
     Something like a sysfs entry that let's us select the minimum UVD
     power level allowed?

     I think Marco is happy to come up with a patch, we just need to
     know what's really needed and what should be the default values.
     I'm happy to review everything that comes out of it, just don't
     have time to do it myself.

     Happy discussion and thanks in advance,
     Christian.

     Am 12.08.2014 um 15:05 schrieb Alex Deucher:

         On Tue, Aug 12, 2014 at 6:00 AM, Christian König
         <deathsimple@xxxxxxxxxxx <mailto:deathsimple@xxxxxxxxxxx>> wrote:

             Am 11.08.2014 um 16:52 schrieb Alex Deucher:

                 On Mon, Aug 11, 2014 at 5:08 AM, Christian König
                 <deathsimple@xxxxxxxxxxx
                 <mailto:deathsimple@xxxxxxxxxxx>> wrote:

                     Am 07.08.2014 um 21:43 schrieb Alex Deucher:

                         On Thu, Aug 7, 2014 at 11:32 AM, Christian König
                         <deathsimple@xxxxxxxxxxx
                         <mailto:deathsimple@xxxxxxxxxxx>> wrote:

                             Am 07.08.2014 um 16:32 schrieb Alex Deucher:

                                 On Thu, Aug 7, 2014 at 7:33 AM,
                                 Christian König
                                 <deathsimple@xxxxxxxxxxx
                                 <mailto:deathsimple@xxxxxxxxxxx>>
                                 wrote:

                                     From: Marco A Benatto
                                     <marco.antonio.780@xxxxxxxxx
                                     <mailto:marco.antonio.780@xxxxxxxxx>>

                                     Adding a Frames Per Second
                                     estimation logic on UVD handles
                                     when it has being used. This
                                     estimation is per handle basis
                                     and will help on DPM profile
                                     calculation.

                                     v2 (chk): fix timestamp type, move
                                     functions around and
                                                   cleanup code a bit.

                                 Will this really help much?  I thought
                                 the problem was mainly due to
                                 sclk and mclk for post processing.


                             It should at least handle the UVD side for
                             upclocking when you get a
                             lot
                             of
                             streams / fps. And at on my NI the patch
                             seems to do exactly that.

                             Switching sclk and mclk for post
                             processing is a different task, and I
                             actually have no idea what to do with them.

                         At this point we always choose the plain UVD
                         state anyway so this
                         patch would only take effect if we re-enabled
                         the dynamic UVD state
                         selection.


                     Hui? I thought we already re-enabled the dynamic
                     UVD state selection, but
                     double checking this I found it disabled again.

                     What was the problem with that? Looks like I
                     somehow missed the
                     discussion
                     around it.

                 We did, but after doing so a number of people
                 complained about a
                 regression on IRC because when apps like xmbc enabled
                 post processing,
                 performance went down.


             That's strange, from my experience the different UVD
             performance states only
             affect UVDs dclk/vclk, not sclk/mclk. I need to get the
             DPM dumps to
             confirms this.

         The sclks and mclks are usually different as well, especially
         on APUs.
         I can send you some examples.

             You not off hand remember who complained on IRC? Finding
             something in the
             IRC logs is like searching for a needle in a haystack.

         I don't remember off hand.  I think zgreg was involved in some
         of the
         discussions.

         Alex

             Thanks,
             Christian.


                 Alex


                     Christian.


                         For the post processing, we probably need a
                         hint we can
                         pass to the driver in the CS ioctl to denote
                         what state we need.
                         Although if we did that, this could would
                         largely be moot.  That said,
                         newer asics support dynamic UVD clocks so we
                         really only need
                         something like that for older asics and I
                         guess VCE.

                         Alex

                             Christian.


                                 Alex

                                     Signed-off-by: Marco A Benatto
                                     <marco.antonio.780@xxxxxxxxx
                                     <mailto:marco.antonio.780@xxxxxxxxx>>
                                     Signed-off-by: Christian König
                                     <christian.koenig@xxxxxxx
                                     <mailto:christian.koenig@xxxxxxx>>
                                     ---

                                      drivers/gpu/drm/radeon/radeon.h
                                        | 10 ++++++

                                      drivers/gpu/drm/radeon/radeon_uvd.c
                                     | 64
                                     +++++++++++++++++++++++++++++++++----
                                          2 files changed, 68
                                     insertions(+), 6 deletions(-)

                                     diff --git
                                     a/drivers/gpu/drm/radeon/radeon.h
                                     b/drivers/gpu/drm/radeon/radeon.h
                                     index 9e1732e..e92f6cb 100644
                                     --- a/drivers/gpu/drm/radeon/radeon.h
                                     +++ b/drivers/gpu/drm/radeon/radeon.h
                                     @@ -1617,6 +1617,15 @@ int
                                     radeon_pm_get_type_index(struct
                                     radeon_device
                                     *rdev,
                                          #define
                                     RADEON_UVD_STACK_SIZE  (1024*1024)
                                          #define RADEON_UVD_HEAP_SIZE
                                      (1024*1024)

                                     +#define RADEON_UVD_FPS_EVENTS_MAX 8
                                     +#define RADEON_UVD_DEFAULT_FPS 60
                                     +
                                     +struct radeon_uvd_fps {
                                     +       uint64_t        timestamp;
                                     +       uint8_t         event_index;
                                     +       uint8_t
                                      events[RADEON_UVD_FPS_EVENTS_MAX];
                                     +};
                                     +
                                          struct radeon_uvd {
                                                 struct radeon_bo
                                       *vcpu_bo;
                                                 void
                                       *cpu_addr;
                                     @@ -1626,6 +1635,7 @@ struct
                                     radeon_uvd {
                                                 struct drm_file
                                      *filp[RADEON_MAX_UVD_HANDLES];
                                                 unsigned
                                       img_size[RADEON_MAX_UVD_HANDLES];
                                                 struct delayed_work
                                      idle_work;
                                     +       struct radeon_uvd_fps
                                      fps_info[RADEON_MAX_UVD_HANDLES];
                                          };

                                          int radeon_uvd_init(struct
                                     radeon_device *rdev);
                                     diff --git
                                     a/drivers/gpu/drm/radeon/radeon_uvd.c
                                     b/drivers/gpu/drm/radeon/radeon_uvd.c
                                     index 6bf55ec..ef5667a 100644
                                     ---
                                     a/drivers/gpu/drm/radeon/radeon_uvd.c
                                     +++
                                     b/drivers/gpu/drm/radeon/radeon_uvd.c
                                     @@ -237,6 +237,51 @@ void
                                     radeon_uvd_force_into_uvd_segment(struct
                                     radeon_bo *rbo)
                                                 rbo->placement.lpfn =
                                     (256 * 1024 * 1024) >> PAGE_SHIFT;
                                          }

                                     +static void
                                     radeon_uvd_fps_clear_events(struct
                                     radeon_device *rdev,
                                     int
                                     idx)
                                     +{
                                     +       struct radeon_uvd_fps *fps
                                     = &rdev->uvd.fps_info[idx];
                                     +       unsigned i;
                                     +
                                     +       fps->timestamp = jiffies_64;
                                     +       fps->event_index = 0;
                                     +       for (i = 0; i <
                                     RADEON_UVD_FPS_EVENTS_MAX; i++)
                                     +               fps->events[i] = 0;
                                     +}
                                     +
                                     +static void
                                     radeon_uvd_fps_note_event(struct
                                     radeon_device *rdev,
                                     int
                                     idx)
                                     +{
                                     +       struct radeon_uvd_fps *fps
                                     = &rdev->uvd.fps_info[idx];
                                     +       uint64_t timestamp =
                                     jiffies_64;
                                     +       unsigned rate = 0;
                                     +
                                     +       uint8_t index =
                                     fps->event_index++;
                                     +       fps->event_index %=
                                     RADEON_UVD_FPS_EVENTS_MAX;
                                     +
                                     +       rate = div64_u64(HZ,
                                     max(timestamp - fps->timestamp,
                                     1ULL));
                                     +
                                     +       fps->timestamp = timestamp;
                                     +       fps->events[index] =
                                     min(rate, 120u);
                                     +}
                                     +
                                     +static unsigned
                                     radeon_uvd_estimate_fps(struct
                                     radeon_device *rdev,
                                     int
                                     idx)
                                     +{
                                     +       struct radeon_uvd_fps *fps
                                     = &rdev->uvd.fps_info[idx];
                                     +       unsigned i, valid = 0,
                                     count = 0;
                                     +
                                     +       for (i = 0; i <
                                     RADEON_UVD_FPS_EVENTS_MAX; i++) {
                                     +               /* We should
                                     ignore zero values */
                                     +               if (fps->events[i]
                                     != 0) {
                                     +                       count +=
                                     fps->events[i];
                                     +                       valid++;
                                     +               }
                                     +       }
                                     +
                                     +       if (valid > 0)
                                     +               return count / valid;
                                     +       else
                                     +               return
                                     RADEON_UVD_DEFAULT_FPS;
                                     +}
                                     +
                                          void
                                     radeon_uvd_free_handles(struct
                                     radeon_device *rdev, struct
                                     drm_file *filp)
                                          {
                                                 int i, r;
                                     @@ -419,8 +464,10 @@ static int
                                     radeon_uvd_cs_msg(struct
                                     radeon_cs_parser
                                     *p, struct radeon_bo *bo,

                                                 /* create or decode,
                                     validate the handle */
                                                 for (i = 0; i <
                                     RADEON_MAX_UVD_HANDLES; ++i) {
                                     -               if
                                     (atomic_read(&p->rdev->uvd.handles[i])
                                     == handle)
                                     +               if
                                     (atomic_read(&p->rdev->uvd.handles[i])
                                     == handle)
                                     {
                                     +
                                      radeon_uvd_fps_note_event(p->rdev, i);
                                                                 return 0;
                                     +               }
                                                 }

                                                 /* handle not found
                                     try to alloc a new one */
                                     @@ -428,6 +475,7 @@ static int
                                     radeon_uvd_cs_msg(struct
                                     radeon_cs_parser
                                     *p, struct radeon_bo *bo,
                                                         if
                                     (!atomic_cmpxchg(&p->rdev->uvd.handles[i],
                                     0,
                                     handle)) {

                                     p->rdev->uvd.filp[i] = p->filp;

                                     p->rdev->uvd.img_size[i] = img_size;
                                     +
                                      radeon_uvd_fps_clear_events(p->rdev,
                                     i);
                                                                 return 0;
                                                         }
                                                 }
                                     @@ -763,7 +811,7 @@ int
                                     radeon_uvd_get_destroy_msg(struct
                                     radeon_device
                                     *rdev, int ring,
                                          static void
                                     radeon_uvd_count_handles(struct
                                     radeon_device *rdev,

                                            unsigned *sd, unsigned *hd)
                                          {
                                     -       unsigned i;
                                     +       unsigned i, fps_rate = 0;

                                                 *sd = 0;
                                                 *hd = 0;
                                     @@ -772,10 +820,13 @@ static void
                                     radeon_uvd_count_handles(struct
                                     radeon_device *rdev,
                                                         if
                                     (!atomic_read(&rdev->uvd.handles[i]))
                                                                 continue;

                                     -               if
                                     (rdev->uvd.img_size[i] >= 720*576)
                                     -                       ++(*hd);
                                     -               else
                                     -                       ++(*sd);
                                     +               fps_rate =
                                     radeon_uvd_estimate_fps(rdev, i);
                                     +
                                     +               if
                                     (rdev->uvd.img_size[i] >= 720*576) {
                                     +                       (*hd) +=
                                     fps_rate > 30 ? 1 : 2;
                                     +               } else {
                                     +                       (*sd) +=
                                     fps_rate > 30 ? 1 : 2;
                                     +               }
                                                 }
                                          }

                                     @@ -805,6 +856,7 @@ void
                                     radeon_uvd_note_usage(struct
                                     radeon_device
                                     *rdev)
                                                 set_clocks &=
                                     schedule_delayed_work(&rdev->uvd.idle_work,

                                     msecs_to_jiffies(UVD_IDLE_TIMEOUT_MS));

                                     +
                                                 if
                                     ((rdev->pm.pm_method ==
                                     PM_METHOD_DPM) &&
                                     rdev->pm.dpm_enabled) {
                                                         unsigned hd =
                                     0, sd = 0;

                                     radeon_uvd_count_handles(rdev,
                                     &sd, &hd);
                                     --
                                     1.9.1





--
Marco Antonio Benatto
Linux user ID:#506236



_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel





[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux