Thank you for your comments. On 22/07/21 8:52 pm, Eric W. Biederman wrote:
As stated I think this idea is a non-starter.

There is a real problem: there are applications that have a legitimate need to know what cpu resources are available for them to use, and we don't have a good interface for them to request that information. I think MESOS solved this by passing a MAX_CPUS environment variable, and at least the JVM was modified to use that variable. That said, as situations can be a bit more dynamic and fluid, having something where an application can look and see what resources are available from its view of the world seems reasonable. AKA we need something so applications can stop conflating the physical cpu resources that exist with the cpu resources an application is allowed to use. This might be as simple as implementing a /proc/self/cpus_available file.

Without the will to go through and find the existing open source applications that care, and update them so that they will use the new interface, I don't think anything will really happen.
From a process-granular point of view I believe a /proc/self approach solves this problem at the root. However, as you have stated, applications will now have to look at another interface for the correct information, and that could be a challenge.
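To make the application side concrete, a consumer of such an interface might look roughly like the sketch below. To be clear, /proc/self/cpus_available does not exist today; the file name and the single-integer format are assumptions made purely for illustration, and the fallback path is what applications can portably do right now:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Hypothetical: prefer a per-process view if the kernel provides one,
 * otherwise fall back to counting the CPUs in our affinity mask. */
static int usable_cpus(void)
{
	FILE *f = fopen("/proc/self/cpus_available", "r");
	cpu_set_t set;
	int n;

	if (f) {
		if (fscanf(f, "%d", &n) == 1 && n > 0) {
			fclose(f);
			return n;
		}
		fclose(f);
	}

	if (sched_getaffinity(0, sizeof(set), &set) == 0)
		return CPU_COUNT(&set);

	return 1;
}

Whatever the final format ends up being, the point is that a single open/read of a /proc/self file is something runtimes like the JVM could adopt with very little churn.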
The problem I see with changing existing interfaces that describe the hardware is that the definition becomes unclear and so different applications can legitimately expect different things, and it would become impossible to implement what is needed correctly.
In our experimentation and survey we found that container applications restricted by a cgroup - whether through cpuset or through period/quota - benefited from coherent information. That also matches my understanding of how tools like LXCFS are used in userspace. Would you happen to know of any applications that expect the full hardware/topology view even though they themselves are restricted in their usage?
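For the period/quota case in particular, the stitching an application (or LXCFS on its behalf) has to do today looks roughly like the sketch below - assuming cgroup v2 with the container's own subtree mounted at /sys/fs/cgroup, which is a common container setup but not something an application can rely on in general:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Best-effort guess at usable CPUs: start from the affinity mask
 * (reflects cpuset restrictions), then clamp by the cpu.max
 * bandwidth limit, whose format is "<quota> <period>" or "max <period>". */
static int effective_cpus(void)
{
	cpu_set_t set;
	long quota, period;
	int cpus = 1;
	FILE *f;

	if (sched_getaffinity(0, sizeof(set), &set) == 0)
		cpus = CPU_COUNT(&set);

	f = fopen("/sys/fs/cgroup/cpu.max", "r");
	if (f) {
		if (fscanf(f, "%ld %ld", &quota, &period) == 2 &&
		    quota > 0 && period > 0) {
			long limit = (quota + period - 1) / period;
			if (limit < cpus)
				cpus = (int)limit;
		}
		fclose(f);
	}
	return cpus;
}

None of this is a stable application-facing contract (the mount point, the hierarchy version, and the rounding policy are all guesses the application has to make), which is exactly the incoherence the experiments were trying to quantify.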
The problem I see with using cgroup interfaces is that they are not targeted at end user applications but rather at the problem of controlling access to a resource. Using them to report what is available again gets you into the multiple-master problem, especially as cgroups may not be the only thing in the system controlling access to your resource.
I agree: cgroup is a control interface and should not be used for presenting information, and cgroups may not be the only thing in the system controlling access to the resources. This is where the idea for a different interface really stemmed from - although there are mechanisms to restrict and control usage, there is no interface that presents that information coherently to userspace.
So I really think the only good solution that people won't mind is to go through the applications, figure out what information is legitimately needed from an application perspective, and build an interface tailored for applications to get that information. Then applications can be updated to use the new interface, and as the implementation of the system changes, the implementation in the kernel can be updated to keep the applications working.
I concur with this approach of building an application-first interface. My current frame of reference for the problems comes from tools like LXCFS, which are built around the existing interfaces to present information, and the experiments were designed to quantify those shortcomings. We could definitely use some help in understanding the shortcomings of the current interfaces from people who use these applications.

-- Pratik
Eric