On 09/10, Paul E. McKenney wrote:
>
> Work in progress, not for inclusion.

Impressive work! A couple of random newbie's questions...

> --- linux-2.6.22-b-fixbarriers/include/linux/rcupdate.h	2007-07-19 14:02:36.000000000 -0700
> +++ linux-2.6.22-c-preemptrcu/include/linux/rcupdate.h	2007-08-22 15:21:06.000000000 -0700
>
>  extern void rcu_check_callbacks(int cpu, int user);

Also superfluously declared in rcuclassic.h and in rcupreempt.h.

> --- linux-2.6.22-b-fixbarriers/include/linux/rcupreempt.h	1969-12-31 16:00:00.000000000 -0800
> +++ linux-2.6.22-c-preemptrcu/include/linux/rcupreempt.h	2007-08-22 15:21:06.000000000 -0700
> +
> +#define __rcu_read_lock_nesting()	(current->rcu_read_lock_nesting)

Unused?

> diff -urpNa -X dontdiff linux-2.6.22-b-fixbarriers/kernel/rcupreempt.c linux-2.6.22-c-preemptrcu/kernel/rcupreempt.c
> --- linux-2.6.22-b-fixbarriers/kernel/rcupreempt.c	1969-12-31 16:00:00.000000000 -0800
> +++ linux-2.6.22-c-preemptrcu/kernel/rcupreempt.c	2007-08-22 15:35:19.000000000 -0700
>
> +static DEFINE_PER_CPU(struct rcu_data, rcu_data);
> +static struct rcu_ctrlblk rcu_ctrlblk = {
> +	.fliplock = SPIN_LOCK_UNLOCKED,
> +	.completed = 0,
> +};
> static DEFINE_PER_CPU(int [2], rcu_flipctr) = { 0, 0 };

Just curious, any reason why rcu_flipctr can't live in rcu_data? Similarly,
rcu_try_flip_state could be a member of rcu_ctrlblk, it is even protected by
rcu_ctrlblk.fliplock.
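IOW, something like the layout below. This is only a rough sketch to show
what I mean; the other members and the exact types are my guesses, not
taken from the patch:

	struct rcu_ctrlblk {
		spinlock_t	fliplock;
		long		completed;
		enum rcu_try_flip_states rcu_try_flip_state;	/* protected by ->fliplock */
	};

	struct rcu_data {
		spinlock_t	lock;
		long		completed;	/* per-CPU snapshot of rcu_ctrlblk.completed */
		int		flipctr[2];	/* replaces the separate rcu_flipctr array */
		struct rcu_head	*donelist;
		/* ... */
	};

	static struct rcu_ctrlblk rcu_ctrlblk = {
		.fliplock = SPIN_LOCK_UNLOCKED,
		.completed = 0,
		.rcu_try_flip_state = rcu_try_flip_idle_state,
	};
	static DEFINE_PER_CPU(struct rcu_data, rcu_data);

Then the readers would do __get_cpu_var(rcu_data).flipctr[idx]++ instead of
using the separate per-CPU array.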
Isn't DEFINE_PER_CPU_SHARED_ALIGNED better for rcu_flip_flag and rcu_mb_flag?

> +void __rcu_read_lock(void)
> +{
> +	int idx;
> +	struct task_struct *me = current;
> +	int nesting;
> +
> +	nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
> +	if (nesting != 0) {
> +
> +		/* An earlier rcu_read_lock() covers us, just count it. */
> +
> +		me->rcu_read_lock_nesting = nesting + 1;
> +
> +	} else {
> +		unsigned long oldirq;
> +
> +		/*
> +		 * Disable local interrupts to prevent the grace-period
> +		 * detection state machine from seeing us half-done.
> +		 * NMIs can still occur, of course, and might themselves
> +		 * contain rcu_read_lock().
> +		 */
> +
> +		local_irq_save(oldirq);

Could you please tell more, why do we need this cli?

It can't "protect" rcu_ctrlblk.completed, and the only change which affects
the state machine is rcu_flipctr[idx]++, so I can't understand the
"half-done" above. (Of course, we must disable preemption while changing
rcu_flipctr.)

Hmm... this was already discussed... From another message:

> Critical portions of the GP protection happen in the scheduler-clock
> interrupt, which is a hardirq.  For example, the .completed counter
> is always incremented in hardirq context, and we cannot tolerate a
> .completed increment in this code.  Allowing such an increment would
> defeat the counter-acknowledge state in the state machine.

Still can't understand, please help! .completed could be incremented by
another CPU anyway, so why is rcu_check_callbacks() running on _this_ CPU
so special?

> +
> +		/*
> +		 * Outermost nesting of rcu_read_lock(), so increment
> +		 * the current counter for the current CPU.  Use volatile
> +		 * casts to prevent the compiler from reordering.
> +		 */
> +
> +		idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
> +		smp_read_barrier_depends();	/* @@@@ might be unneeded */
> +		ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++;

Isn't it "obvious" that this barrier is not needed? rcu_flipctr can be
changed only by this CPU, so regardless of the actual value of idx we can't
read the "wrong" value of rcu_flipctr[idx], no?

OTOH, I can't understand how it works. With or without local_irq_save(),
another CPU can do rcu_try_flip_idle() and increment rcu_ctrlblk.completed
at any time, so we can see the old value... Can rcu_try_flip_waitzero()
miss us? OK, GP_STAGES > 1, so rcu_try_flip_waitzero() will actually check
both 0 and 1 lastidx's before synchronize_rcu() succeeds... I doubt very
much that my understanding is correct. Apart from this, why GP_STAGES > 1 ???

Well, I think this code is just too tricky for me! Will try to read again
after sleep ;)

> +static int
> +rcu_try_flip_idle(void)
> +{
> +	int cpu;
> +
> +	RCU_TRACE_ME(rcupreempt_trace_try_flip_i1);
> +	if (!rcu_pending(smp_processor_id())) {
> +		RCU_TRACE_ME(rcupreempt_trace_try_flip_ie1);
> +		return 0;
> +	}
> +
> +	/*
> +	 * Do the flip.
> +	 */
> +
> +	RCU_TRACE_ME(rcupreempt_trace_try_flip_g1);
> +	rcu_ctrlblk.completed++;  /* stands in for rcu_try_flip_g2 */
> +
> +	/*
> +	 * Need a memory barrier so that other CPUs see the new
> +	 * counter value before they see the subsequent change of all
> +	 * the rcu_flip_flag instances to rcu_flipped.
> +	 */

Why? Any code sequence which relies on that?

For example, rcu_check_callbacks() does

	if (rcu_ctrlblk.completed == rdp->completed)
		rcu_try_flip();

It is possible that the timer interrupt calls rcu_check_callbacks() exactly
because rcu_pending() sees rcu_flipped, but without an rmb() in between we
can see the old value of rcu_ctrlblk.completed. This is not a problem,
because rcu_try_flip() needs rcu_ctrlblk.fliplock, so rcu_try_flip() will
see the new state != rcu_try_flip_idle_state; but I can't understand any of
the mb()'s in rcu_try_flip_xxx().

> +static void rcu_process_callbacks(struct softirq_action *unused)
> +{
> +	unsigned long flags;
> +	struct rcu_head *next, *list;
> +	struct rcu_data *rdp = RCU_DATA_ME();
> +
> +	spin_lock_irqsave(&rdp->lock, flags);
> +	list = rdp->donelist;
> +	if (list == NULL) {
> +		spin_unlock_irqrestore(&rdp->lock, flags);
> +		return;
> +	}

Do we really need this fastpath? It is not needed for correctness, and this
case is very unlikely (in fact I think it is not possible:
rcu_check_callbacks(), which triggers RCU_SOFTIRQ, is called with irqs
disabled).

> +void fastcall call_rcu(struct rcu_head *head,
> +				void (*func)(struct rcu_head *rcu))
> +{
> +	unsigned long oldirq;
> +	struct rcu_data *rdp;
> +
> +	head->func = func;
> +	head->next = NULL;
> +	local_irq_save(oldirq);
> +	rdp = RCU_DATA_ME();
> +	spin_lock(&rdp->lock);

This looks a bit strange. Is this an optimization? Why not

	rdp = RCU_DATA_ME();
	spin_lock_irqsave(&rdp->lock, flags);
	...

? RCU_DATA_ME() is cheap, but spin_lock() under local_irq_save() spins
without preemption.

Actually, why do we need rcu_data->lock at all? Looks like local_irq_save()
should be enough, no? Perhaps some -rt reasons?
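IOW, the whole function could look like the sketch below (the actual
callback queueing is elided, and I'm ignoring the question of whether
RCU_DATA_ME() is safe to evaluate before preemption is disabled):

	void fastcall call_rcu(struct rcu_head *head,
				void (*func)(struct rcu_head *rcu))
	{
		unsigned long flags;
		struct rcu_data *rdp;

		head->func = func;
		head->next = NULL;

		rdp = RCU_DATA_ME();		/* cheap per-CPU lookup */
		spin_lock_irqsave(&rdp->lock, flags);
		__rcu_advance_callbacks(rdp);
		/* ... queue the callback on rdp's list, as in the patch ... */
		spin_unlock_irqrestore(&rdp->lock, flags);
	}

If I understand the -rt locking right, this way we at least don't wait for
the lock with irqs hard-disabled.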
> +	__rcu_advance_callbacks(rdp);

Any reason this func can't do rcu_check_mb() as well? If this is possible,
can't we move the code doing "s/rcu_flipped/rcu_flip_seen/" from
__rcu_advance_callbacks() to rcu_check_mb() to unify the "acks"?

> +void __synchronize_sched(void)
> +{
> +	cpumask_t oldmask;
> +	int cpu;
> +
> +	if (sched_getaffinity(0, &oldmask) < 0)
> +		oldmask = cpu_possible_map;
> +	for_each_online_cpu(cpu) {
> +		sched_setaffinity(0, cpumask_of_cpu(cpu));
> +		schedule();

This "schedule()" is not needed: any time sched_setaffinity() returns on
another CPU, we have already forced preemption of the previously active
task on that CPU.

> +	}
> +	sched_setaffinity(0, oldmask);
> +}

Well, this is not correct... but that doesn't matter because of the next
patch. But could you explain how it can deadlock (according to the
changelog of the next patch)?

Thanks!

Oleg.