Re: [Intel-gfx] [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices


Quoting Kuehling, Felix (2018-12-03 22:55:16)
> 
> On 2018-11-28 4:14 a.m., Joonas Lahtinen wrote:
> > Quoting Ho, Kenny (2018-11-27 17:41:17)
> >> On Tue, Nov 27, 2018 at 4:46 AM Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> wrote:
> >>> I think a more abstract property "% of GPU (processing power)" might
> >>> be a more universal approach. One can then implement that through
> >>> subdividing the resources or timeslicing them, depending on the GPU
> >>> topology.
> >>>
> >>> Leasing 1/8th, 1/4th or 1/2 of the GPU would probably be the most
> >>> applicable to cloud provider use cases, too. At least that's what I
> >>> see done for the CPUs today.
> >> I think there are opportunities to slice the gpu in more than one way
> >> (similar to the way it is done for cpu).  We can potentially frame
> >> resources as continuous or discrete.  Percentage definitely fits well
> >> for continuous measurements such as time/time slices, but I think there
> >> are places for discrete units such as core counts as well.
> > I think the ask in return to the early series from Intel was to agree
> > on the variables that could be common to all of the DRM subsystem.
> >
> > So we can only choose the lowest common denominator, right?
> >
> > Any core count out of total core count should translate nicely into a
> > fraction, so what would be the problem with percentage amounts?
> How would you handle overcommitment with a percentage? That is, more
> than 100% of the GPU cores assigned to cgroups. Which cgroups end up
> sharing cores would be up to chance.

I see your point. With time-slicing, you really can't overcommit. So I would
assume that there would have to be a second level of detail provided for
overcommitting (and deciding which cgroups are to share GPU cores).

> If we allow specifying a set of GPU cores, we can be more specific in
> assigning and sharing resources between cgroups.

As Matt outlined in the other reply to this thread, we don't really have
the concept of GPU cores. We do have the command streamers, but the
granularity is a bit low.

In your architecture, does it matter which specific cores are shared, or
is it just a question of which specific cgroups would share some cores
in case of overcommit?

If we tack on the priority in addition to the percentage, you could make
a choice to share cores only at an identical priority level. That'd
mean that in the case of overcommit, you'd aim to keep as many high
priority levels free of overcommit as possible and then for lower
priority cgroups you'd start overcommitting.

Would that even partially address the concern?
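
To make the percentage-plus-priority idea a bit more concrete, here is a
rough sketch in Python. It is not taken from any driver or from the patch
series; the function name plan_shares, the (name, percent, priority) tuples
and the rule "higher number means higher priority" are all assumptions made
up for illustration. It walks the priority levels from highest to lowest,
grants each level its requested percentage while capacity remains, and marks
a level as overcommitted once its cgroups have to share what is left.

```python
from collections import defaultdict


def plan_shares(groups, total_percent=100):
    """Hypothetical planner: groups is a list of (name, percent, priority).

    Higher priority values are served first (an assumption for this sketch).
    Returns a dict keyed by priority describing what each level was granted.
    """
    # Bucket the cgroups by priority so sharing only happens within a level.
    by_prio = defaultdict(list)
    for name, percent, prio in groups:
        by_prio[prio].append((name, percent))

    remaining = total_percent
    plan = {}
    for prio in sorted(by_prio, reverse=True):
        requested = sum(percent for _, percent in by_prio[prio])
        # A level never receives more than what is still unallocated.
        granted = min(requested, remaining)
        plan[prio] = {
            "groups": [name for name, _ in by_prio[prio]],
            "requested": requested,
            "granted": granted,
            # Overcommitted: these cgroups must share the granted portion.
            "overcommitted": requested > granted,
        }
        remaining -= granted
    return plan


if __name__ == "__main__":
    example = [
        ("render", 50, 2),
        ("compute", 25, 2),
        ("batch-a", 25, 1),
        ("batch-b", 25, 1),
    ]
    for prio, entry in plan_shares(example).items():
        print(prio, entry)
```

In this example the two priority-2 cgroups fit without sharing, while the
two priority-1 cgroups request 50% in total and end up sharing the
remaining 25%. How the sharing inside a level would actually be enforced
(timeslicing, core partitioning) is left open here.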

Regards, Joonas

> 
> Regards,
>   Felix
> 
> 
> >
> > Regards, Joonas
> >
> >> Regards,
> >> Kenny
> >>
> >>> That combined with the "GPU memory usable" property should be a good
> >>> starting point to start subdividing the GPU resources for multiple
> >>> users.
> >>>
> >>> Regards, Joonas
> >>>
> >>>> Your feedback is highly appreciated.
> >>>>
> >>>> Best Regards,
> >>>> Harish
> >>>>
> >>>>
> >>>>
> >>>> From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> on behalf of Tejun Heo <tj@xxxxxxxxxx>
> >>>> Sent: Tuesday, November 20, 2018 5:30 PM
> >>>> To: Ho, Kenny
> >>>> Cc: cgroups@xxxxxxxxxxxxxxx; intel-gfx@xxxxxxxxxxxxxxxxxxxxx; y2kenny@xxxxxxxxx; amd-gfx@xxxxxxxxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx
> >>>> Subject: Re: [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices
> >>>>
> >>>>
> >>>> Hello,
> >>>>
> >>>> On Tue, Nov 20, 2018 at 10:21:14PM +0000, Ho, Kenny wrote:
> >>>>> By this reply, are you suggesting that vendor specific resources
> >>>>> will never be acceptable to be managed under cgroup?  Let's say a user
> >>>> I wouldn't say never, but whatever gets included as a cgroup
> >>>> controller should have clearly defined resource abstractions and the
> >>>> control schemes around them including support for delegation.  AFAICS,
> >>>> gpu side still seems to have a long way to go (and it's not clear
> >>>> whether that's somewhere it will or needs to end up).
> >>>>
> >>>>> want to have similar functionality as what cgroup is offering but to
> >>>>> manage vendor specific resources, what would you suggest as a
> >>>>> solution?  When you say keeping vendor specific resource regulation
> >>>>> inside drm or specific drivers, do you mean we should replicate the
> >>>>> cgroup infrastructure there or do you mean either drm or specific
> >>>>> driver should query existing hierarchy (such as device or perhaps
> >>>>> cpu) for the process organization information?
> >>>>>
> >>>>> To put the questions in more concrete terms, let's say a user wants to
> >>>>> expose certain part of a gpu to a particular cgroup similar to the
> >>>>> way selective cpu cores are exposed to a cgroup via cpuset, how
> >>>>> should we go about enabling such functionality?
> >>>> Do what the intel driver or bpf is doing?  It's not difficult to hook
> >>>> into cgroup for identification purposes.
> >>>>
> >>>> Thanks.
> >>>>
> >>>> --
> >>>> tejun
> >>>> _______________________________________________
> >>>> amd-gfx mailing list
> >>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> >>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >>>>
> >>>> _______________________________________________
> >>>> Intel-gfx mailing list
> >>>> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> >>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


