Am 16.05.19 um 14:28 schrieb Daniel Vetter: > [CAUTION: External Email] > > On Thu, May 16, 2019 at 09:25:31AM +0200, Christian König wrote: >> Am 16.05.19 um 09:16 schrieb Koenig, Christian: >>> Am 16.05.19 um 04:29 schrieb Kenny Ho: >>>> [CAUTION: External Email] >>>> >>>> On Wed, May 15, 2019 at 5:26 PM Welty, Brian <brian.welty@xxxxxxxxx> wrote: >>>>> On 5/9/2019 2:04 PM, Kenny Ho wrote: >>>>>> There are four control file types, >>>>>> stats (ro) - display current measured values for a resource >>>>>> max (rw) - limits for a resource >>>>>> default (ro, root cgroup only) - default values for a resource >>>>>> help (ro, root cgroup only) - help string for a resource >>>>>> >>>>>> Each file is multi-lined with one entry/line per drm device. >>>>> Multi-line is correct for multiple devices, but I believe you need >>>>> to use a KEY to denote device for both your set and get routines. >>>>> I didn't see your set functions reading a key, or the get functions >>>>> printing the key in output. >>>>> cgroups-v2 conventions mention using KEY of major:minor, but I think >>>>> you can use drm_minor as key? >>>> Given this controller is specific to the drm kernel subsystem which >>>> uses minor to identify drm device, >>> Wait a second, using the DRM minor is a good idea in the first place. >> Well that should have read "is not a good idea".. > What else should we use? Well what does for example udev uses to identify a device? >> Christian. >> >>> I have a test system with a Vega10 and a Vega20. Which device gets which >>> minor is not stable, but rather defined by the scan order of the PCIe bus. >>> >>> Normally the scan order is always the same, but adding or removing >>> devices or delaying things just a little bit during init is enough to >>> change this. >>> >>> We need something like the Linux sysfs location or similar to have a >>> stable implementation. > You can go from sysfs location to drm class directory (in sysfs) and back. > That means if you care you need to walk sysfs yourself a bit, but using > the drm minor isn't a blocker itself. Yeah, agreed that userspace could do this. But I think if there is an of hand alternative we should use this instead. > One downside with the drm minor is that it's pretty good nonsense once you > have more than 64 gpus though, due to how we space render and legacy nodes > in the minor ids :-) Ok, another good reason to at least not use the minor=linenum approach. Christian. > -Daniel >>> Regards, >>> Christian. >>> >>>> I don't see a need to complicate >>>> the interfaces more by having major and a key. As you can see in the >>>> examples below, the drm device minor corresponds to the line number. >>>> I am not sure how strict cgroup upstream is about the convention but I >>>> am hoping there are flexibility here to allow for what I have >>>> implemented. There are a couple of other things I have done that is >>>> not described in the convention: 1) inclusion of read-only *.help file >>>> at the root cgroup, 2) use read-only (which I can potentially make rw) >>>> *.default file instead of having a default entries (since the default >>>> can be different for different devices) inside the control files (this >>>> way, the resetting of cgroup values for all the drm devices, can be >>>> done by a simple 'cp'.) >>>> >>>>>> Usage examples: >>>>>> // set limit for card1 to 1GB >>>>>> sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max >>>>>> >>>>>> // set limit for card0 to 512MB >>>>>> sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max >>>>>> /** @file drm_gem.c >>>>>> @@ -154,6 +156,9 @@ void drm_gem_private_object_init(struct drm_device *dev, >>>>>> obj->handle_count = 0; >>>>>> obj->size = size; >>>>>> drm_vma_node_reset(&obj->vma_node); >>>>>> + >>>>>> + obj->drmcgrp = get_drmcgrp(current); >>>>>> + drmcgrp_chg_bo_alloc(obj->drmcgrp, dev, size); >>>>> Why do the charging here? >>>>> There is no backing store yet for the buffer, so this is really tracking something akin to allowed virtual memory for GEM objects? >>>>> Is this really useful for an administrator to control? >>>>> Isn't the resource we want to control actually the physical backing store? >>>> That's correct. This is just the first level of control since the >>>> backing store can be backed by different type of memory. I am in the >>>> process of adding at least two more resources. Stay tuned. I am >>>> doing the charge here to enforce the idea of "creator is deemed owner" >>>> at a place where the code is shared by all (the init function.) >>>> >>>>>> + while (i <= max_minor && limits != NULL) { >>>>>> + sval = strsep(&limits, "\n"); >>>>>> + rc = kstrtoll(sval, 0, &val); >>>>> Input should be "KEY VALUE", so KEY will determine device to apply this to. >>>>> Also, per cgroups-v2 documentation of limits, I believe need to parse and handle the special "max" input value. >>>>> >>>>> parse_resources() in rdma controller is example for both of above. >>>> Please see my previous reply for the rationale of my hope to not need >>>> a key. I can certainly add handling of "max" and "default". >>>> >>>> >>>>>> +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev, >>>>>> + size_t size) >>>>> Shouldn't this return an error and be implemented with same semantics as the >>>>> try_charge() functions of other controllers? >>>>> Below will allow stats_total_allocated to overrun limits_total_allocated. >>>> This is because I am charging the buffer at the init of the buffer >>>> which does not fail so the "try" (drmcgrp_bo_can_allocate) is separate >>>> and placed earlier and nearer other condition where gem object >>>> allocation may fail. In other words, there are multiple possibilities >>>> for which gem allocation may fail (cgroup limit being one of them) and >>>> satisfying cgroup limit does not mean a charge is needed. I can >>>> certainly combine the two functions to have an additional try_charge >>>> semantic as well if that is really needed. >>>> >>>> Regards, >>>> Kenny >>> _______________________________________________ >>> amd-gfx mailing list >>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >> _______________________________________________ >> dri-devel mailing list >> dri-devel@xxxxxxxxxxxxxxxxxxxxx >> https://lists.freedesktop.org/mailman/listinfo/dri-devel > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch