We've hit an issue where our 32-bit applications, when running on a 64-bit kernel, using poll() and passing in a value of -1 for the timeout return after ~49 days (2^32 msec). Instead of waiting indefinitely as it is stated they should. Reproducing the issue is trivial. I've instrumented the kernel and found we are hitting the case where poll() believes we've passed in a positive number and thus creates a timespec, etc. Currently poll() is defined in userspace as: int poll(struct pollfd *ufds, nfds_t nfds, int timeout); but in the kernel timeout is of type long. I can think of a few ways to solve this. One, which is the patch I've attached, is to change the type of timeout to int in the kernel. I'm not certain the ramifications this may have since it's changing a syscall's arguments which may be a big no-no :) Another way I am proposing is by bounds checking. Currently we do the following: if (timeout_msecs >= 0) { to = &end_time; poll_select_set_timeout(to, timeout_msecs / MSEC_PER_SEC, NSEC_PER_MSEC * (timeout_msecs % MSEC_PER_SEC)); } We could add an upper bound on timeout_msecs to say < 0xffffffff. I'm not sure if either is acceptable though. Josh
commit 7f2c4ee58b66f51b5e2c79fc42db84ca4d6c0a18 Author: Josh Hunt <johunt@xxxxxxxxxx> Date: Mon Feb 6 15:21:04 2012 -0800 poll() in 32-bit applications does not handle timeout of -1 properly on 64-bit kernels We have observed our 32-bit applications running on 64-bit kernels which pass a value of -1 to poll()'s timeout argument do not wait infinitely as it should. Instead it winds up waiting ~49 days which is approx the # of msecs in 2^32 (or 32-bit negative 1.) With this patch both 32 and 64 bit applications now support handling of -1 as an argument for timeout. Signed-off-by: Josh Hunt <johunt@xxxxxxxxxx> diff --git a/fs/select.c b/fs/select.c index d33418f..e782258 100644 --- a/fs/select.c +++ b/fs/select.c @@ -912,7 +912,7 @@ static long do_restart_poll(struct restart_block *restart_block) } SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds, - long, timeout_msecs) + int, timeout_msecs) { struct timespec end_time, *to = NULL; int ret; diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index 515669f..8ec1153 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -624,7 +624,7 @@ asmlinkage long sys_socketpair(int, int, int, int __user *); asmlinkage long sys_socketcall(int call, unsigned long __user *args); asmlinkage long sys_listen(int, int); asmlinkage long sys_poll(struct pollfd __user *ufds, unsigned int nfds, - long timeout); + int timeout); asmlinkage long sys_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct timeval __user *tvp); asmlinkage long sys_old_select(struct sel_arg_struct __user *arg);