Re: ipc-msg broken again on 3.11-rc7?

Hi,

[forgot to cc everyone, thus I'll summarize some mails...]
On 09/02/2013 06:58 AM, Vineet Gupta wrote:
> On 08/31/2013 11:20 PM, Linus Torvalds wrote:
>> Vineet, actual patch for what Davidlohr suggests attached. Can you try it?
>>
>>               Linus
> Apologies for late in getting back to this - I was away from my computer for a bit.
>
> Unfortunately, with a quick test, this patch doesn't help.
> FWIW, this is latest mainline (.config attached).
>
> Let me know what diagnostics I can add to help with this.

msgctl08 is a bulk message send/receive test. I had to look at it once before; back then the cause was broken hardware:
https://lkml.org/lkml/2008/6/12/365
That can be ruled out here, because the test works with 3.10.

msgctl08 uses pairs of threads: one thread does msgsnd(), the other one msgrcv(). There is no synchronization between them, i.e. msgsnd() can race ahead until the kernel buffer is full, followed by a block of msgrcv() calls, or the two can alternate as msgsnd()/msgrcv() pairs. No special features are used: each pair of threads has its own message queue, and all messages have type=1.
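
For anyone who wants to reproduce the pattern outside of LTP, the access pattern described above can be sketched roughly as follows. This is not the msgctl08 source: run_pair() and struct demo_msg are names made up for illustration, the sketch uses two processes created with fork() instead of whatever msgctl08 does internally, and the message size of 64 bytes is arbitrary. It only shows one sender and one receiver sharing a private queue, with type=1 and no synchronization.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical message layout for this sketch; mtype is always 1. */
struct demo_msg {
	long mtype;
	char mtext[64];
};

/*
 * Hypothetical helper: one sender/receiver pair on its own queue.
 * Returns the number of messages the receiver got, or -1 on setup error.
 */
int run_pair(int nmsgs)
{
	int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
	if (qid < 0)
		return -1;

	pid_t pid = fork();
	if (pid < 0) {
		msgctl(qid, IPC_RMID, NULL);
		return -1;
	}

	if (pid == 0) {
		/* Sender: blocks in msgsnd() whenever the queue is full. */
		struct demo_msg m = { .mtype = 1 };
		for (int i = 0; i < nmsgs; i++) {
			snprintf(m.mtext, sizeof(m.mtext), "msg %d", i);
			if (msgsnd(qid, &m, sizeof(m.mtext), 0) < 0) {
				/* Remove the queue so the receiver does not wait forever. */
				msgctl(qid, IPC_RMID, NULL);
				_exit(1);
			}
		}
		_exit(0);
	}

	/* Receiver: blocks in msgrcv() whenever the queue is empty. */
	struct demo_msg m;
	int received = 0;
	while (received < nmsgs) {
		if (msgrcv(qid, &m, sizeof(m.mtext), 1, 0) < 0)
			break;
		received++;
	}

	int status;
	waitpid(pid, &status, 0);
	msgctl(qid, IPC_RMID, NULL);
	return received;
}
```

Calling run_pair(1000) should return 1000 on a working kernel. Depending on scheduling, the sender either fills the queue before the receiver runs or the two alternate, which are the two behaviours described above.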

Vineet ran strace: right up to the signal that kills msgctl08, the trace shows only msgsnd()/msgrcv() calls.
Vineet:
a) could you run strace again tomorrow, with '-ttt' as an additional option? I can't see where exactly it hangs.
b) Could you check that it is not just a performance regression?
    Does ./msgctl08 1000 16 hang, too?

In ipc/msg.c, I haven't seen any obvious reason why it should hang.
The only race I spotted so far is this one:
        for (;;) {
                struct msg_sender s;

                err = -EACCES;
                if (ipcperms(ns, &msq->q_perm, S_IWUGO))
                        goto out_unlock1;

                err = security_msg_queue_msgsnd(msq, msg, msgflg);
                if (err)
                        goto out_unlock1;

                if (msgsz + msq->q_cbytes <= msq->q_qbytes &&
                                1 + msq->q_qnum <= msq->q_qbytes) {
                        break;
                }

[snip]
        if (!pipelined_send(msq, msg)) {
                /* no one is waiting for this message, enqueue it */
                list_add_tail(&msg->m_list, &msq->q_messages);
                msq->q_cbytes += msgsz;
                msq->q_qnum++;
                atomic_add(msgsz, &ns->msg_bytes);

The access to msq->q_cbytes is not protected. Thus two parallel msgsnd() calls on the same queue could both pass the check and succeed, even if the two messages together bring the queue length above the limit. But that can't explain why 3.11-rc7 hangs: as explained above, msgctl08 uses one queue for each thread pair, so there is never more than one sender per queue.
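
To make the interleaving concrete, here is a small single-threaded sketch that replays it step by step. It is not kernel code: struct demo_queue, has_room(), enqueue() and the two replay functions are hypothetical names, and only the byte-limit half of the check is modelled. Both senders run the check before either one updates q_cbytes, so both succeed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the queue fields used by the check. */
struct demo_queue {
	size_t q_cbytes;	/* bytes currently queued */
	size_t q_qbytes;	/* byte limit of the queue */
	size_t q_qnum;		/* messages currently queued */
};

/* Same shape as the byte-limit test in the msgsnd() loop. */
static bool has_room(const struct demo_queue *q, size_t msgsz)
{
	return msgsz + q->q_cbytes <= q->q_qbytes;
}

static void enqueue(struct demo_queue *q, size_t msgsz)
{
	q->q_cbytes += msgsz;
	q->q_qnum++;
}

/* Racy order: A checks, B checks, A enqueues, B enqueues. */
size_t racy_interleaving(size_t limit, size_t used, size_t msgsz)
{
	struct demo_queue q = { .q_cbytes = used, .q_qbytes = limit, .q_qnum = 0 };
	bool a_ok = has_room(&q, msgsz);
	bool b_ok = has_room(&q, msgsz);	/* still sees the old q_cbytes */

	if (a_ok)
		enqueue(&q, msgsz);
	if (b_ok)
		enqueue(&q, msgsz);
	return q.q_cbytes;
}

/* Serialized order: each sender checks and enqueues as one step. */
size_t serialized(size_t limit, size_t used, size_t msgsz)
{
	struct demo_queue q = { .q_cbytes = used, .q_qbytes = limit, .q_qnum = 0 };

	if (has_room(&q, msgsz))
		enqueue(&q, msgsz);
	if (has_room(&q, msgsz))
		enqueue(&q, msgsz);
	return q.q_cbytes;
}
```

With a limit of 100 bytes, 40 bytes already queued and two 50-byte messages, racy_interleaving(100, 40, 50) ends at 140 bytes, above the limit, while serialized(100, 40, 50) stops at 90 because the second check fails.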

--
    Manfred
