On Mon, Nov 27, 2017 at 7:49 PM, Eric W. Biederman
<ebiederm@xxxxxxxxxxxx> wrote:
> Paul Eggert <eggert@xxxxxxxxxxx> writes:
>
>> On 11/27/2017 09:00 AM, Arnd Bergmann wrote:
>>> b) Extend the approach taken by the x32 ABI, and use the 64-bit
>>> native structure layout for rusage on all architectures with new
>>> system calls that is otherwise compatible. A possible problem here
>>> is that we end up with incompatible definitions of rusage between
>>> /usr/include/linux/resource.h and /usr/include/bits/resource.h
>>>
>>> c) Change the definition of struct rusage to be independent of
>>> time_t. This is the easiest change, as it does not involve new system
>>> call entry points, but it has the risk of introducing compile-time
>>> incompatibilities with user space sources that rely on the type
>>> of ru_utime and ru_stime.
>>>
>>> I'm picking approach c) for its simplicity, but I'd like to hear from
>>> others whether they would prefer a different approach.
>>
>> (c) would break programs like GNU Emacs, which copy ru_utime and ru_stime
>> members into struct timeval variables.

Right. I think I originally had the workaround in mind of having glibc
convert between its own structure and the kernel structure, but then
ended up not including that in the text above. I was going back and
forth on whether it would be needed or not.

>> All in all, (b) sounds like it would be better for programs using glibc, as it's
>> more compatible with what POSIX apps expect. Though I'm not sure what problems
>> are meant by "possible ... incompatible definitions"; perhaps you could
>> elaborate.

I meant that you might have an application that includes
linux/resource.h instead of sys/resource.h but calls the glibc
function, or one that includes sys/resource.h and invokes the system
call directly.

> getrusage is posix and I believe the use of struct timeval is posix as
> well.
>
> So getrusage(3) is the libc definition, and that definition must use struct
> timeval or the implementation will be non-conforming and it won't be
> just emacs we need to worry about.
>
> The practical question is what do we provide to userspace so that it can
> implement a conforming getrusage?
>
> A 32bit time_t based struct timeval is good for durations up to 136 years
> or so. Which strongly suggests the range is large enough, except for
> some crazy massively multi-threaded application. And anything off the
> charts cpu hungry at this point I expect will be 64bit.
>
> It is possible to get a 128 way system with one thread on each core and
> consume 100% of the core for a bit over a year to max out getrusage. So
> I do think in the long run we care about increasing the size of time_t
> here. Last I checked applications doing things like that were 64bit in
> the year 2000.

Agreed, this was also a calculation I did.

> Given that userspace is going to be seeing the larger struct rusage in
> any event my inclination for long term maintainability would be to
> introduce the new syscall and have the current one called oldgetrusage
> on 32bit architectures. Then we won't have to worry about what weird
> things glibc will do when translating the data, and we can handle
> applications with crazy (but possible) runtimes. Which inclines me to
> (b) as well.

This would actually be the same thing we do for most other syscalls.
Regarding the naming, the current entry point would become
compat_sys_getrusage() and share the implementation between native
32-bit mode and compat mode on 64-bit architectures, while
sys_getrusage becomes the function that deals with the 64-bit layout,
and would have the same binary format on both 32-bit and 64-bit native
ABIs.

Unfortunately, this opens a new question, as the structure is currently
defined by glibc as:

/* Structure which says how much of each resource has been used.  */

/* The purpose of all the unions is to have the kernel-compatible layout
   while keeping the API type as 'long int', and among machines where
   __syscall_slong_t is not 'long int', this only does the right thing
   for little-endian ones, like x32.  */
struct rusage
  {
    /* Total amount of user time used.  */
    struct timeval ru_utime;
    /* Total amount of system time used.  */
    struct timeval ru_stime;
    /* Maximum resident set size (in kilobytes).  */
    __extension__ union
      {
        long int ru_maxrss;
        __syscall_slong_t __ru_maxrss_word;
      };
    /* Amount of sharing of text segment memory
       with other processes (kilobyte-seconds).  */
    /* Maximum resident set size (in kilobytes).  */
    __extension__ union
      {
        long int ru_ixrss;
        __syscall_slong_t __ru_ixrss_word;
      };
    ...
  };

Here, I guess we have to replace __syscall_slong_t with an
'rusage' specific type that has the same length as time_t, but is
independent of __syscall_slong_t, which is still 32-bit for most
32-bit architectures.

How would we do the big-endian version of that though?

One argument for using c) plus the emulation in glibc is that glibc
has to do emulation anyway, to allow running user space with 64-bit
time_t on older kernels that don't have the new getrusage system
call.

> As for (a) does anyone have a need for process accounting at nsec
> granularity? Unless we can get that for free that just seems like
> overpromising and a waste to have so much fine granularity.

The kernel does everything in nanoseconds, so we always spend a few
cycles (a lot of cycles on some of the very low-end architectures) on
dividing it by 1000. Moving the division operation to user space is
essentially free, and using the nanoseconds instead of microseconds
might be slightly cheaper. I don't think anyone really needs it though.

      Arnd
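
To illustrate what moving the division to user space could look like, here is a minimal sketch. It assumes the kernel handed back a CPU-time total as a single 64-bit nanosecond count, which is only an assumption for this example and not an existing interface. The helper name ns_to_timeval_sketch() is hypothetical and is not a kernel or glibc function.

```c
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical helper (not an existing kernel or glibc interface):
 * split a nanosecond total into the seconds/microseconds pair that
 * struct timeval expects. Sub-microsecond precision is truncated. */
static struct timeval ns_to_timeval_sketch(uint64_t ns)
{
    struct timeval tv;

    /* Whole seconds. */
    tv.tv_sec = (time_t)(ns / 1000000000ULL);
    /* Remaining nanoseconds, divided by 1000 to get microseconds. */
    tv.tv_usec = (suseconds_t)((ns % 1000000000ULL) / 1000);
    return tv;
}
```

A libc wrapper could apply something like this to both ru_utime and ru_stime before filling in the POSIX-visible struct rusage. Whether the range of tv_sec is sufficient still depends on the width of time_t in that user space.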