On Tue, 2013-06-25 at 23:41 +0200, Sedat Dilek wrote: > On Tue, Jun 25, 2013 at 10:33 PM, Davidlohr Bueso > <davidlohr.bueso@xxxxxx> wrote: > > On Tue, 2013-06-25 at 18:10 +0200, Sedat Dilek wrote: > > [...] > > > >> I did some more testing with Linux-Testing-Project (release: > >> ltp-full-20130503) and next-20130624 (Monday) which has still the > >> issue, here. > >> > >> If I revert the mentioned two commits from my local > >> revert-ipc-next20130624-5089fd1c6a6a-ab9efc2d0db5 GIT repo, everything > >> is fine. > >> > >> I have tested the LTP ***IPC*** and ***SYSCALLS*** testcases. > >> > >> root# ./runltp -f ipc > >> > >> root# ./runltp -f syscalls > > > > These are nice test cases! > > > > So I was able to reproduce the issue with LTP and manually running > > msgctl08. We seemed to be racing at find_msg(), so take to q_perm lock > > before calling it. The following changes fixes the issue and passes all > > 'runltp -f syscall' tests, could you give it a try? > > > > Cool, that fixes the issues here. > > Building with fakeroot & make deb-pkg is now OK, again. > > The syscalls/msgctl08 test-case ran successfully! Andrew, could you pick this one up? I've made the patch on top of 3.10.0-rc7-next-20130625 Thanks. Davidlohr 8<--------------------------------- From: Davidlohr Bueso <davidlohr.bueso@xxxxxx> Subject: [PATCH] ipc,msq: fix race in msgrcv(2) Sedat reported the following issue when building the latest linux-next: Building via 'make deb-pkg' with fakeroot fails here like this: make: *** [deb-pkg] Terminated /usr/bin/fakeroot: line 181: 2386 Terminated FAKEROOTKEY=$FAKEROOTKEY LD_LIBRARY_PATH="$PATHS" LD_PRELOAD="$LIB" "$@" semop(1): encountered an error: Identifier removed semop(2): encountered an error: Invalid argument semop(1): encountered an error: Identifier removed semop(1): encountered an error: Identifier removed semop(1): encountered an error: Invalid argument semop(1): encountered an error: Invalid argument semop(1): encountered an error: Invalid argument The issue was caused by a race in find_msg(), so acquire the q_perm.lock before calling the function. This also broke some LTP test cases: <<<test_start>>> tag=msgctl08 stime=1372174954 cmdline="msgctl08" contacts="" analysis=exit <<<test_output>>> msgctl08 0 TWARN : Verify error in child 0, *buf = 28, val = 27, size = 8 msgctl08 1 TFAIL : in child 0 read # = 73,key = 127 msgctl08 0 TWARN : Verify error in child 3, *buf = ffffff8a, val = ffffff89, size = 52 msgctl08 1 TFAIL : in child 3 read # = 157,key = 189 msgctl08 0 TWARN : Verify error in child 2, *buf = ffffff87, val = ffffff86, size = 71 msgctl08 1 TFAIL : in child 2 read # = 15954,key = 3e86 msgctl08 0 TWARN : Verify error in child 12, *buf = ffffffa9, val = ffffffa8, size = 22 msgctl08 1 TFAIL : in child 12 read # = 12904,key = 32a8 msgctl08 0 TWARN : Verify error in child 13, *buf = 36, val = 35, size = 27 ... Also update a comment referring to ipc_lock_by_ptr(), which has already been deleted and no longer applies to this context. Reported-and-tested-by: Sedat Dilek <sedat.dilek@xxxxxxxxx> Signed-off-by: Davidlohr Bueso <davidlohr.bueso@xxxxxx> --- ipc/msg.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/ipc/msg.c b/ipc/msg.c index a1cf70e..bd60d7e 100644 --- a/ipc/msg.c +++ b/ipc/msg.c @@ -895,6 +895,7 @@ long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgfl if (ipcperms(ns, &msq->q_perm, S_IRUGO)) goto out_unlock1; + ipc_lock_object(&msq->q_perm); msg = find_msg(msq, &msgtyp, mode); if (!IS_ERR(msg)) { /* @@ -903,7 +904,7 @@ long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgfl */ if ((bufsz < msg->m_ts) && !(msgflg & MSG_NOERROR)) { msg = ERR_PTR(-E2BIG); - goto out_unlock1; + goto out_unlock0; } /* * If we are copying, then do not unlink message and do @@ -911,10 +912,9 @@ long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgfl */ if (msgflg & MSG_COPY) { msg = copy_msg(msg, copy); - goto out_unlock1; + goto out_unlock0; } - ipc_lock_object(&msq->q_perm); list_del(&msg->m_list); msq->q_qnum--; msq->q_rtime = get_seconds(); @@ -930,10 +930,9 @@ long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgfl /* No message waiting. Wait for a message */ if (msgflg & IPC_NOWAIT) { msg = ERR_PTR(-ENOMSG); - goto out_unlock1; + goto out_unlock0; } - ipc_lock_object(&msq->q_perm); list_add_tail(&msr_d.r_list, &msq->q_receivers); msr_d.r_tsk = current; msr_d.r_msgtype = msgtyp; @@ -957,7 +956,7 @@ long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgfl * Prior to destruction, expunge_all(-EIRDM) changes r_msg. * Thus if r_msg is -EAGAIN, then the queue not yet destroyed. * rcu_read_lock() prevents preemption between reading r_msg - * and the spin_lock() inside ipc_lock_by_ptr(). + * and acquiring the q_perm.lock in ipc_lock_object(). */ rcu_read_lock(); -- 1.7.11.7 -- To unsubscribe from this list: send the line "unsubscribe linux-next" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html