Hi Tejun,

Thanks for looking into this. I can definitely help where I can, and I
am sure other experts will jump in if I start misrepresenting reality :)
(as Daniel has already done.)

Regarding your points, my understanding is that there isn't really a TTM
vs GEM situation anymore (there is an lwn.net article about that, but it
is more than a decade old.) I believe GEM is the common interface at
this point, and more and more features are being refactored into it. For
example, AMD's driver uses TTM internally but things are exposed via the
GEM interface.

This GEM resource is actually the single number resource you just
referred to. A GEM buffer (the drm.buffer.* resources) can be backed by
VRAM, system memory, or another type of memory. The finer-grained
control is the drm.memory.* resources, which still need more discussion.
(Some of the functionality in TTM is being refactored into the GEM
level. I have seen some patches that make TTM a subclass of GEM.)

This RFC can be grouped into 3 areas, and they are fairly independent so
they can be reviewed separately: high-level device memory control
(buffer.*), fine-grained memory control and bandwidth (memory.*), and
compute resources (lgpu.*).

I think the memory.* resources are the most controversial part, but I
think they are still needed. Perhaps an analogy may help. For a system,
we have CPUs and memory, and memory can be backed by RAM or swap. For a
GPU, each device can have LGPUs and buffers, and the buffers can be
backed by VRAM, system RAM, or even swap.

As for setting the right amount, I think that's where the profiling
aspect of the *.stats comes in. And while one can't necessarily buy more
VRAM, it is still a useful knob to adjust if the intention is to pack
more work into a GPU device with predictable performance.
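
To make the layering above a bit more concrete, here is a rough sketch
of how I picture a single-number buffer limit coexisting with optional
per-backing limits. This is only a toy model: the class, field names and
backing labels ("vram", "system") are made up for illustration and are
not the interface from the RFC patches.

```python
from dataclasses import dataclass, field
from typing import Dict

MiB = 1 << 20


@dataclass
class GpuCgroup:
    """Toy model of one cgroup's GPU buffer accounting (hypothetical names)."""

    buffer_max: int  # single-number cap on all buffers, in bytes
    # Optional finer-grained caps per backing type, e.g. {"vram": ...}
    memory_max: Dict[str, int] = field(default_factory=dict)
    # Bytes currently charged, keyed by backing type
    usage: Dict[str, int] = field(default_factory=dict)

    def total(self) -> int:
        """Bytes charged across all backing types."""
        return sum(self.usage.values())

    def try_charge(self, backing: str, size: int) -> bool:
        """Charge `size` bytes to `backing`; refuse if any cap would be exceeded."""
        # The overall buffer cap applies no matter what backs the buffer.
        if self.total() + size > self.buffer_max:
            return False
        # The per-backing cap applies only if one was configured.
        cap = self.memory_max.get(backing)
        if cap is not None and self.usage.get(backing, 0) + size > cap:
            return False
        self.usage[backing] = self.usage.get(backing, 0) + size
        return True


if __name__ == "__main__":
    cg = GpuCgroup(buffer_max=512 * MiB, memory_max={"vram": 256 * MiB})
    print(cg.try_charge("vram", 200 * MiB))    # fits under both caps
    print(cg.try_charge("vram", 100 * MiB))    # would exceed the vram cap
    print(cg.try_charge("system", 100 * MiB))  # no system cap configured
    print(cg.try_charge("system", 300 * MiB))  # would exceed the overall cap
```

With only buffer_max set, this degenerates to the single number case;
the per-backing caps are an extra knob on top rather than a replacement.
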

This research on various GPU workloads may be of interest:

A Taxonomy of GPGPU Performance Scaling
http://www.computermachines.org/joe/posters/iiswc2015_taxonomy.pdf
http://www.computermachines.org/joe/publications/pdfs/iiswc2015_taxonomy.pdf

(summary: a GPU workload can be memory bound or compute bound, so it's
possible to pack different workloads together to improve utilization.)

Regards,
Kenny

On Tue, Sep 3, 2019 at 2:50 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
>
> Hello, Daniel.
>
> On Tue, Sep 03, 2019 at 09:55:50AM +0200, Daniel Vetter wrote:
> > > * While breaking up and applying control to different types of
> > >   internal objects may seem attractive to folks who work day in and
> > >   day out with the subsystem, they aren't all that useful to users and
> > >   the siloed controls are likely to make the whole mechanism a lot
> > >   less useful. We had the same problem with cgroup1 memcg - putting
> > >   control of different uses of memory under separate knobs. It made
> > >   the whole thing pretty useless. e.g. if you constrain all knobs
> > >   tight enough to control the overall usage, overall utilization
> > >   suffers, but if you don't, you really don't have control over actual
> > >   usage. For memcg, what has to be allocated and controlled is
> > >   physical memory, no matter how they're used. It's not like you can
> > >   go buy more "socket" memory. At least from the looks of it, I'm
> > >   afraid gpu controller is repeating the same mistakes.
> >
> > We do have quite a pile of different memories and ranges, so I don't
> > think we're doing the same mistake here. But it is maybe a bit too
>
> I see. One thing which caught my eyes was the system memory control.
> Shouldn't that be controlled by memcg? Is there something special
> about system memory used by gpus?
>
> > complicated, and exposes stuff that most users really don't care about.
>
> Could be from me not knowing much about gpus but definitely looks too
> complex to me. I don't see how users would be able to allocate vram,
> system memory and GART with reasonable accuracy. memcg on cgroup2
> deals with just single number and that's already plenty challenging.
>
> Thanks.
>
> --
> tejun