Query on Kernel Timers and Process Context Code

Hareesh Nagarajan <hareesh.nagarajan@xxxxxxxxx> · Tue, 7 Feb 2006 01:17:43 -0600

Hi,

I have this very basic doubt on the prevention of races between kernel
timers and process context code. Suppose, I have this shared
datastructure, that can be accessed from process context as well as
from interrupt (timer - so softirq context), my query is this: How
must I go ahead with the locking. Here are my thoughts.

The shared datastructure is a list pk_task_share structures:
struct pk_task_share {
        struct list_head list;
        unsigned int status;
        struct pk_share *share;
        struct timer_list timer;
};

The list and a global variable pk_global are protected by a spin_lock
called pk_lock. (Not the real case, but it serves our purpose)

So in process context code (syscall) I do this

long sys_foo(...) {
    ...
    spin_lock_bh(&pk_lock);
    if(pk_global & SYS_FOO_INTERRUPTED) {
        do_something
        spin_unlock_bh(&pk_lock);
        return -EINTR;
    }
    pk_global |= SYS_FOO_IN_PROGRESS;
    do_something
    spin_unlock_bh(&pk_lock);
    ...
}

The timer associated with pk_task_share:
    ...
    task_share->timer.function = &pk_timer_expiry;
    ...
calls pk_timer_expiry.

void pk_timer_expiry(unsigned long ...) {
    ...
    spin_lock(&pk_lock);
    if(pk_global & SYS_FOO_IN_PROGRESS) {
        pk_global |= SYS_FOO_INTERRUPTED;
        spin_unlock(&pk_lock);
        return;
    }
    else {
        modify the pk_task_share list
    }
    spin_unlock(&pk_lock);
}

Does this sort of logic seem racy on SMP machines, or is it safe? I
don't think I'd need to use an atomic type for pk_global.

The other question that I had was this: If the CPU were interrupted by
the timer interrupt prior to the spin_lock_bh(&pk_lock) call in
sys_foo, then would pk_timer_expiry execute completely and then the
execution resumes in sys_foo, or could we be midway in the execution
of pk_timer_expiry when sys_foo resumes execution?

This question stems from what Rusty is trying to tell us here:
``Sooner or later, this will crash on SMP, because a timer can have
just gone off before the spin_lock_bh(), and it will only get the lock
after we spin_unlock_bh(), and then try to free the element (which has
already been freed!)."

URL:
http://www.kernel.org/pub/linux/kernel/people/rusty/kernel-locking/x450.html

Does he mean the timer went off, but even before the CPU could start
executing the expiry related code, the process context code, gets hold
of spin_lock_bh?

Thanks,

Hareesh

--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive:       http://mail.nl.linux.org/kernelnewbies/
FAQ:           http://kernelnewbies.org/faq/