On Thu, Nov 14, 2019 at 1:38 AM Christian Brauner <christian.brauner@xxxxxxxxxx> wrote:
> On Wed, Nov 13, 2019 at 11:02:12AM +0100, Arnd Bergmann wrote:
> > On Tue, Nov 12, 2019 at 10:09 PM Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:
> > >
> > > On Fri, Nov 08, 2019 at 10:12:10PM +0100, Arnd Bergmann wrote:
> >
> > > > ---
> > > > Question: should we also rename 'struct rusage' into 'struct __kernel_rusage'
> > > > here, to make them completely unambiguous?
> > > >
> > > The patch looks ok to me. I must confess I looked into rusage long ago
> > > so __kernel_timespec type used in uapi made me nervous at first,
> > > but then I found that we've this type defined in time_types.h uapi
> > > so userspace should be safe. I also like the idea of __kernel_rusage
> > > but definitely on top of the series.
> >
> > There are clearly too many time types at the moment, but I'm in the
> > process of throwing out the ones we no longer need now.
> >
> > I do have a number of patches implementing other variants for the syscall,
> > and I suppose that if we end up adding __kernel_rusage, that would
> > have to go with a set of syscalls using 64-bit seconds/nanoseconds
> > rather than the old 32/64 microseconds. I don't know what other
> > changes remain that anyone would want from sys_waitid() now that
> > it does support pidfd.
> >
> > If there is still a need for a new waitid() replacement, that should take
> > that new __kernel_rusage I think, but until then I hope we are fine
> > with today's getrusage+waitid based on the current struct rusage.
>
> Note, that glibc does _not_ expose the rusage argument, i.e. most of
> userspace is unaware that waitid() does allow you to get rusage
> information. So users first need to know that waitid() has an rusage
> argument and then need to call the waitid() syscall directly.

On architectures that don't have a wait4 syscall (riscv32 for now),
glibc uses waitid to implement wait4 and wait3.

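For anyone who wants to try the raw syscall mentioned above, here is a
minimal sketch. It assumes Linux on a 64-bit architecture, where the
kernel's struct rusage layout matches the one in <sys/resource.h>; on
32-bit targets the layouts may differ. The helper names waitid_rusage()
and run_child_and_wait() are made up for this example and are not glibc
or kernel interfaces.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: invoke the waitid() syscall directly so that the
 * fifth argument (struct rusage *), which the glibc wrapper does not
 * expose, can be passed.
 */
static int waitid_rusage(idtype_t idtype, id_t id, siginfo_t *info,
                         int options, struct rusage *ru)
{
    return (int)syscall(SYS_waitid, idtype, id, info, options, ru);
}

/*
 * Hypothetical demo: fork a child that burns some CPU time and exits
 * with 'status', then reap it with the raw syscall. On success returns
 * 0 and fills in *exit_code and *ru; returns -1 on any failure.
 */
static int run_child_and_wait(int status, int *exit_code, struct rusage *ru)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: do a little work so that ru_utime is not trivially zero. */
        volatile unsigned long x = 0;
        for (unsigned long i = 0; i < 20000000UL; i++)
            x += i;
        _exit(status);
    }

    siginfo_t info = {0};
    if (waitid_rusage(P_PID, (id_t)pid, &info, WEXITED, ru) < 0)
        return -1;

    /* Only a normal exit carries an exit code in si_status. */
    if (info.si_code != CLD_EXITED)
        return -1;

    *exit_code = info.si_status;
    return 0;
}
```

Calling run_child_and_wait(7, &code, &ru) from main() should leave 7 in
code and the child's resource usage in ru (ru.ru_utime, ru.ru_stime,
ru.ru_maxrss and so on), still with microsecond-based timevals.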
> > BSD has wait6() to return separate rusage structures for 'self' and
> > 'children', but I could not find any application (using the freebsd
> > sources and debian code search) that actually uses that information,
> > so there might not be any demand for that.
>
> Speaking specifically for Linux now, I think that rusage does not
> actually expose the information most relevant users are interested in.
> On Linux nowadays it is _way_ more interesting to retrieve stats
> relative to the cgroup the task lived in etc.
> Doing a git grep -i rusage in the systemd source code shows that rusage
> is used _nowhere_. And I consider an init system to be the most likely
> candidate to be interested in rusage.

I looked at a couple of implementations of time(1). This is one example
of a tool that sometimes uses wait3(), though other implementations just
call getrusage() in the parent process before the fork/exec. None of
them actually seems to report better than millisecond resolution, so
there is no strict reason to do a timespec replacement for these.

      Arnd