On 27/07/2023 21:58, Alex Deucher wrote:
> We have a number of customers using these stats, but the issue that
> keeps coming up is the CPU overhead to gather them, particularly on
> systems with hundreds of processes using the GPU. Has anyone given
> any thought to having a single interface to get this information for
> the entire GPU in one place?
Could I have a framed "told you so" certificate please? :D
Well, at least it depends on how much CPU overhead your users would be
happy to eliminate and how much they could live with. So maybe no need
for that certificate just yet.
I have been raising the issue of the multiplicative cost of walking
"total number of processes" x "total number of file descriptors" on a
system since the inception of fdinfo.
So for that issue the idea was to perhaps expose a list of pids with DRM
fds open somewhere, maybe sysfs.
That would eliminate walking _all_ processes and trying to parse each
of their file descriptors.
But it would still require walking all file descriptors belonging to
processes with DRM fds open.
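To make the difference concrete, here is a rough userspace sketch of both
walks. It is only an illustration under stated assumptions: the function
names are made up for this mail, the optional pid list stands in for the
hypothetical "list of pids with DRM fds open" (no such kernel interface
exists today), and the only thing taken from the existing ABI is that DRM
fdinfo files carry "drm-" prefixed keys.

```python
import os


def read_drm_fdinfo(path):
    """Return the drm-* key/value pairs from one fdinfo file ({} if none)."""
    stats = {}
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as f:
            for line in f:
                key, sep, value = line.partition(":")
                if sep and key.startswith("drm-"):
                    stats[key] = value.strip()
    except OSError:
        # The process or fd went away, or we are not allowed to read it.
        pass
    return stats


def scan_drm_clients(proc_root="/proc", pids=None):
    """Collect DRM fdinfo for the given pids, or for every pid if None.

    `pids` models the hypothetical kernel-provided list of processes with
    DRM fds open. Returns (clients, files_read) where clients maps
    (pid, fd) to the parsed drm-* keys and files_read counts the fdinfo
    files that had to be opened.
    """
    if pids is None:
        # Naive walk: every numeric directory under /proc is a process.
        pids = [int(name) for name in os.listdir(proc_root) if name.isdigit()]

    clients = {}
    files_read = 0
    for pid in pids:
        fdinfo_dir = os.path.join(proc_root, str(pid), "fdinfo")
        try:
            fds = os.listdir(fdinfo_dir)
        except OSError:
            continue  # process exited or access denied
        for fd in fds:
            if not fd.isdigit():
                continue
            # Every fd of the process is still opened and parsed, DRM or not.
            files_read += 1
            stats = read_drm_fdinfo(os.path.join(fdinfo_dir, fd))
            if stats:
                clients[(pid, int(fd))] = stats
    return clients, files_read
```

Calling scan_drm_clients() with no arguments does the full "processes x
fds" walk, while passing pids=[...] skips the non-DRM processes but still
opens every fd of the remaining ones, which is the leftover cost mentioned
above.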
If that wouldn't be enough of a saving for your users then no, I am not
aware that it was discussed. That is assuming you were suggesting
something like "read all fdinfo for all clients" in one blob. Also in
sysfs? I think it would be doable by walking the dev->filelist and
invoking drm_show_fdinfo() on each entry.
Out of curiosity, are they using the fdinfo parsing code from IGT or
something of their own?
Regards,
Tvrtko