Hi. I've got a problem where in the cx18 driver, the wake_up() call doesn't seem to be waking up a process or work queue thread that tried to wait on a wait queue. While doing some investigation, I found I was unable to understand the conditions under which try_to_wake_up() will return success or not. Can anyone give short summary on the conditions under which it does not? FYI: $ uname -a Linux palomino.walls.org 2.6.27.9-159.fc10.x86_64 #1 SMP Tue Dec 16 14:47:52 EST 2008 x86_64 x86_64 x86_64 GNU/Linux Also, just to provide some context on the larger problem, here's what's supposed to happen in the cx18 driver: 1. A work queue thread or read() call needs to send a command to the CX23418 using the cx18_api_call() function 2. It fills out a mailbox with a command for the CX23418 3. It prepares to wait, just in case a wait is needed 4. A SW1 interrupt is sent to the CX23418 telling it a mailbox is ready 5. The ack filed in the mailbox, a PCI MMIO location, is checked to see if the mailbox was ack'ed already 6. If not, schedule_timeout() for up to 10 msecs (a near eternity...) 7. Clean up the wait and move on In response to the SW1 interrupt in step 4 above, the CX23418 processes the command mailbox, fills out the "ack" field in the mailbox with a good value, and sends a SW2 ack interrupt back. The cx18_irq_hander() calls wake_up() in response to this. However, I am running across occasions where the maximum timeout has expired, and the mailbox has been ack'ed. It's as if the wake_up() didn't work. Here's a sample from the log: Mar 31 23:36:56 palomino kernel: cx18-0: irq: sending interrupt SW1: 8 to send CX18_CPU_DE_SET_MDL Mar 31 23:36:56 palomino kernel: cx18-0: api: scheduling to wait for ack of CX18_CPU_DE_SET_MDL: req 51267 ack 51266, pid 21092, state 2 Mar 31 23:36:56 palomino kernel: cx18-0: irq: received interrupts SW1: 0 SW2: 8 HW2: 0 Mar 31 23:36:56 palomino kernel: cx18-0: irq: Wake up initiated on pid 21092 in state 2 Mar 31 23:36:56 palomino kernel: cx18-0: irq: Wake up succeeded on pid 21092, state 2 -> 0 Mar 31 23:36:56 palomino kernel: cx18-0: api: done wait for ack of CX18_CPU_DE_SET_MDL: req 51267 ack 51267, current pid 21092, current state 0, state 0 Mar 31 23:36:56 palomino kernel: cx18-0: warning: failed to be awakened upon RPU acknowledgment sending CX18_CPU_DE_SET_MDL; timed out waiting 10 msecs A wait of 10 msecs is not impossible I guess. However, they should be much less frequent that what I am seeing in my logs, given that this is an ATSC video stream capture. Here's the highlights from a patched cx18 driver that I have in place to try and see what's going on: static int cx18_autoremove_wake_function(wait_queue_t *w, unsigned mode, int sync, void *key) { struct cx18 *cx = key; struct task_struct *t = w->private; long old_state; int ret; old_state = t->state; CX18_DEBUG_HI_IRQ("Wake up initiated on pid %d in state %lx\n", task_pid_nr(t), old_state); ret = default_wake_function(w, mode, sync, NULL); if (ret) { CX18_DEBUG_HI_IRQ("Wake up succeeded on pid %d, " "state %lx -> %lx\n", task_pid_nr(t), old_state, t->state); list_del_init(&w->task_list); } else { CX18_DEBUG_WARN("Wake up failed on pid %d, " "state %lx -> %lx\n", task_pid_nr(t), old_state, t->state); } return ret; } static int cx18_api_call(struct cx18 *cx, u32 cmd, int args, u32 data[]) { u32 state, irq, req, ack, err; struct cx18_mailbox __iomem *mb; wait_queue_head_t *waitq; struct mutex *mb_lock; long int timeout, ret; int i; wait_queue_t w; ... /* Setup pointers and values we need */ waitq = &cx->mb_cpu_waitq; mb_lock = &cx->epu2cpu_mb_lock; irq = IRQ_EPU_TO_CPU; mb = &cx->scb->epu2cpu_mb; ... /* Acquire the mutex for the mailbox, and put a request in it */ /* blah blah blah */ ... timeout = msecs_to_jiffies(10); CX18_DEBUG_HI_IRQ("sending interrupt SW1: %x to send %s\n", irq, info->name); ret = timeout; /* setup for a wait */ init_wait(&w); w.func = cx18_autoremove_wake_function; prepare_to_wait(waitq, &w, TASK_UNINTERRUPTIBLE); /* Tell the device our request is in it's mailbox */ cx18_write_reg_expect(cx, irq, SW1_INT_SET, irq, irq); /* Did it ack our request already? */ ack = cx18_readl(cx, &mb->ack); if (req != ack) { CX18_DEBUG_HI_API("scheduling to wait for ack of %s: req %u " "ack %u, pid %u, state %lx\n", info->name, req, ack, task_pid_nr(w.private), ((struct task_struct *)(w.private))->state); /* Go to sleep. Wake up when the ack IRQ comes in or timeout */ ret = schedule_timeout(timeout); ack = cx18_readl(cx, &mb->ack); CX18_DEBUG_HI_API("done wait for ack of %s: req %u " "ack %u, current pid %u, current state %lx, " "state %lx\n", info->name, req, ack, task_pid_nr(current), current->state, ((struct task_struct *)(w.private))->state); } finish_wait(waitq, &w); if (req != ack) { /* Release mutex, Log stuff */ return -EINVAL; } if (ret == 0) CX18_DEBUG_WARN("failed to be awakened upon RPU acknowledgment " "sending %s; timed out waiting %d msecs\n", info->name, jiffies_to_msecs(timeout)); ... /* Unlock mutex, do other stuff */ return 0; } /* A simplified version of the IRQ handler for this ML post */ irqreturn_t cx18_irq_handler(int irq, void *dev_id) { struct cx18 *cx = (struct cx18 *)dev_id; u32 sw1, sw2, hw2; ... sw2 = cx18_read_reg(cx, SW2_INT_STATUS) & cx->sw2_irq_mask; ... if (sw2) cx18_write_reg_expect(cx, sw2, SW2_INT_STATUS, ~sw2, sw2); ... if (sw1 || sw2 || hw2) CX18_DEBUG_HI_IRQ("received interrupts " "SW1: %x SW2: %x HW2: %x\n", sw1, sw2, hw2); ... /* Did we get a mailbox ack? */ if (sw2 & IRQ_CPU_TO_EPU_ACK) __wake_up(&cx->mb_cpu_waitq, TASK_NORMAL, 1, cx); ... return (sw1 || sw2 || hw2) ? IRQ_HANDLED : IRQ_NONE; } Regards, Andy -- To unsubscribe from this list: send the line "unsubscribe linux-media" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html