On Mon, Mar 15, 2010 at 07:43:02AM +0000, tip-bot for KOSAKI Motohiro wrote: > Commit-ID: cd3d8031eb4311e516329aee03c79a08333141f1 > Gitweb: http://git.kernel.org/tip/cd3d8031eb4311e516329aee03c79a08333141f1 > Author: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> > AuthorDate: Fri, 12 Mar 2010 16:15:36 +0900 > Committer: Ingo Molnar <mingo@xxxxxxx> > CommitDate: Mon, 15 Mar 2010 08:28:44 +0100 ... > IOW we hope they are not annoyed by this issue ... The change looks ok but I can't reproduce the problem. I'm running on a distro kernel that has NR_CPUS=4096. Glibc has also has a definition of __CPU_SETSIZE (I assume this change was made by the distro but am not certain): sched.c ... #if defined _SCHED_H && !defined __cpu_set_t_defined # define __cpu_set_t_defined /* Size definition for CPU sets. */ # define __CPU_SETSIZE 4096 # define __NCPUBITS (8 * sizeof (__cpu_mask)) Your test program runs ok: % strace t ... sched_getaffinity(0, 512, { ffff, 0, 0, 0, 0, 0, 0, 0 }) = 64 Also note that we've run on IA64 systems with NR_CPUS=4096 for several years w/o hitting any problems. Bottom line. I don't think the change will affect us. > > sched: sched_getaffinity(): Allow less than NR_CPUS length > > [ Note, this commit changes the syscall ABI for > 1024 CPUs systems. ] > > Recently, some distro decided to use NR_CPUS=4096 for mysterious reasons. > Unfortunately, glibc sched interface has the following definition: > > # define __CPU_SETSIZE 1024 > # define __NCPUBITS (8 * sizeof (__cpu_mask)) > typedef unsigned long int __cpu_mask; > typedef struct > { > __cpu_mask __bits[__CPU_SETSIZE / __NCPUBITS]; > } cpu_set_t; > > It mean, if NR_CPUS is bigger than 1024, cpu_set_t makes an > ABI issue ... > > More recently, Sharyathi Nagesh reported following test program makes > misterious syscall failure: > > ----------------------------------------------------------------------- > #define _GNU_SOURCE > #include<stdio.h> > #include<errno.h> > #include<sched.h> > > int main() > { > cpu_set_t set; > if (sched_getaffinity(0, sizeof(cpu_set_t), &set) < 0) > printf("\n Call is failing with:%d", errno); > } > ----------------------------------------------------------------------- > > Because the kernel assumes len argument of sched_getaffinity() is bigger > than NR_CPUS. But now it is not correct. > > Now we are faced with the following annoying dilemma, due to > the limitations of the glibc interface built in years ago: > > (1) if we change glibc's __CPU_SETSIZE definition, we lost > binary compatibility of _all_ application. > > (2) if we don't change it, we also lost binary compatibility of > Sharyathi's use case. > > Then, I would propse to change the rule of the len argument of > sched_getaffinity(). > > Old: > len should be bigger than NR_CPUS > New: > len should be bigger than maximum possible cpu id > > This creates the following behavior: > > (A) In the real 4096 cpus machine, the above test program still > return -EINVAL. > > (B) NR_CPUS=4096 but the machine have less than 1024 cpus (almost > all machines in the world), the above can run successfully. > > Fortunatelly, BIG SGI machine is mainly used for HPC use case. It means > they can rebuild their programs. > > IOW we hope they are not annoyed by this issue ... > > Reported-by: Sharyathi Nagesh <sharyath@xxxxxxxxxx> > Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> > Acked-by: Ulrich Drepper <drepper@xxxxxxxxxx> > Acked-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx> > Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > Cc: Jack Steiner <steiner@xxxxxxx> > Cc: Russ Anderson <rja@xxxxxxx> > Cc: Mike Travis <travis@xxxxxxx> > LKML-Reference: <20100312161316.9520.A69D9226@xxxxxxxxxxxxxx> > Signed-off-by: Ingo Molnar <mingo@xxxxxxx> > --- > kernel/sched.c | 10 +++++++--- > 1 files changed, 7 insertions(+), 3 deletions(-) > > diff --git a/kernel/sched.c b/kernel/sched.c > index 9ab3cd7..6eaef3d 100644 > --- a/kernel/sched.c > +++ b/kernel/sched.c > @@ -4902,7 +4902,9 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len, > int ret; > cpumask_var_t mask; > > - if (len < cpumask_size()) > + if (len < nr_cpu_ids) > + return -EINVAL; > + if (len & (sizeof(unsigned long)-1)) > return -EINVAL; > > if (!alloc_cpumask_var(&mask, GFP_KERNEL)) > @@ -4910,10 +4912,12 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len, > > ret = sched_getaffinity(pid, mask); > if (ret == 0) { > - if (copy_to_user(user_mask_ptr, mask, cpumask_size())) > + int retlen = min(len, cpumask_size()); > + > + if (copy_to_user(user_mask_ptr, mask, retlen)) > ret = -EFAULT; > else > - ret = cpumask_size(); > + ret = retlen; > } > free_cpumask_var(mask); > -- To unsubscribe from this list: send the line "unsubscribe linux-tip-commits" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html
![]() |