On 2/3/06, Mulyadi Santosa <mulyadi.santosa@xxxxxxxxx> wrote:
> Hello Prabhat...
>
> Please, CC to kernelnewbies too like I do.... our conversation might
> help someone in the future....
>
> > But can you sleep in your bottom half? Basically, the reason why
> > top & bottom halves were created is that you do the job of
> > acknowledging the interrupt in the top half & defer the other
> > major work to the bh.
>
> If the bottom half works in process context, then yes, it is allowed
> to sleep. The example is the workqueue.

I think BHs are not executed in process context; they are executed
while returning from system calls, interrupts and exception handlers.
As BH handlers can be registered by the top halves of different
drivers serving different interrupts, they cannot be executed in
process context, and I think a BH should not make the process sleep
or reschedule. You can see that at the end of the do_IRQ() function
we call do_softirq(), which actually calls all the active, registered
softirqs, and we are still not in process context. The softirq is
also a BH-like mechanism for deferring work. Well, this is my
understanding; I am not sure about it. Can someone shed more light
on it?

> > This is what I assume happens whenever a h/w interrupt comes:
> > top half: acknowledge the interrupt, pass the interrupt to the
> > registered interrupt handler, then schedule the bh for later and
> > quit.
> > bh: here the interrupt handler takes care of the real handling of
> > that interrupt.
> > Is this how it happens?
>
> You got the general concept correctly....
>
> > So in the bh execution, can you sleep? What if the page requested
> > by the interrupt handler is not in memory?
>
> That situation might happen if the bottom half is accessing user
> space memory, which might be swapped out. But usually, bottom halves
> access kernel space memory, and since kernel space memory is never
> swapped out, the bottom half can operate without sleeping. A small
> problem might arise if it needs to access memory located in
> ZONE_HIGHMEM (above the 896 MB mark), but that is solvable via a
> temporary mapping.
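To make the two answers above concrete, here is a minimal, untested
sketch of the two common deferral mechanisms, using 2.6-era APIs
("mydev" and the handler bodies are hypothetical). A tasklet runs in
softirq context and must not sleep; a work item runs in process
context (keventd) and may sleep:

#include <linux/interrupt.h>
#include <linux/workqueue.h>

/* Option 1: tasklet -- runs in softirq context, must NOT sleep */
static void mydev_tasklet_fn(unsigned long data)
{
        /* fast, non-blocking work only: no semaphores, no
         * GFP_KERNEL allocations, no touching swappable memory */
}
static DECLARE_TASKLET(mydev_tasklet, mydev_tasklet_fn, 0);

/* Option 2: workqueue -- runs in process context, MAY sleep */
static void mydev_work_fn(void *data)
{
        /* blocking operations are fine here */
}
static DECLARE_WORK(mydev_work, mydev_work_fn, NULL);

/* Top half: acknowledge the hardware, defer everything else */
static irqreturn_t mydev_isr(int irq, void *dev_id, struct pt_regs *regs)
{
        /* device-specific: clear the interrupt condition here */

        tasklet_schedule(&mydev_tasklet);   /* softirq BH, or... */
        schedule_work(&mydev_work);         /* process-context BH */
        return IRQ_HANDLED;
}

So whether a bottom half may sleep depends on which mechanism it
rides on: softirq-based bottom halves (softirqs, tasklets) may not,
while workqueue-based ones may, which reconciles the two statements
above.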
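And for the ZONE_HIGHMEM point just above, the temporary mapping
Mulyadi mentions might look roughly like this (again just a sketch:
"page" is assumed to be a highmem page the driver already holds, and
KM_SOFTIRQ0 is the 2.6-era atomic-mapping slot for softirq context):

#include <linux/highmem.h>
#include <linux/string.h>

static void read_high_page(struct page *page, void *buf, size_t len)
{
        /* map the page into a per-CPU kernel virtual slot; valid in
         * atomic context, but we must not sleep until we unmap */
        char *vaddr = kmap_atomic(page, KM_SOFTIRQ0);

        memcpy(buf, vaddr, len);
        kunmap_atomic(vaddr, KM_SOFTIRQ0);
}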
> > Spinlocks:
> > My doubt was this:
> > Consider CPU1 and CPU2 are two CPUs, and processes P1 and P2 are
> > two processes running on them respectively. P1 & P2 share the same
> > data. So P1 holds a spinlock on that data and enters its cs. Now if

cs --> critical section?

> > P2 running on CPU2 tries to take the same lock, then it keeps
> > spinning in its time quantum continuously. So what P2 is doing is,
> > instead of returning back, it is spinning and holding CPU2, while
> > CPU2 could be used to do some other work. But CPU2 thinks P2 is
> > doing some useful work.
>
> Correct. This is especially true if the kernel is not compiled with
> kernel level preemption turned on.

I don't think kernel preemption has any role here. When kernel
preemption is not enabled and we are in kernel mode (as in your case,
P2 looping on the spinlock), then after serving any interrupt we are
put straight back into that busy loop, so P2 will keep looping until
the spinlock is released by P1; it does not matter even if P2's
quantum has expired.

Now let's say kernel preemption is enabled. In this case, when P2
tries to take the spinlock, it will call one of the following
functions:

#define _spin_lock(lock) \
do { \
        preempt_disable(); \
        _raw_spin_lock(lock); \
        __acquire(lock); \
} while(0)

#define _spin_lock_irqsave(lock, flags) \
do { \
        local_irq_save(flags); \
        preempt_disable(); \
        _raw_spin_lock(lock); \
        __acquire(lock); \
} while (0)

#define _spin_lock_irq(lock) \
do { \
        local_irq_disable(); \
        preempt_disable(); \
        _raw_spin_lock(lock); \
        __acquire(lock); \
} while (0)

In each of these we first disable preemption (the preempt_disable()
call) and only then enter the busy loop (the _raw_spin_lock() call).
So before spinning, P2 disables preemption, with the result that
whenever an interrupt occurs we return straight back to the busy
loop. Thus enabling or disabling preemption does not change our case:
P2 keeps busy-looping until P1 releases the spinlock. Moreover, two
of the definitions above explicitly disable interrupts on the local
CPU, which removes the chance of preemption entirely (except if an
exception occurs): no interrupts, no preemption.

I think this was done on the assumption that no process holds a
spinlock for long. That is why we say: hold a spinlock only for a
very short time, and do not reschedule (sleep) in the critical
section while holding it. Since spinlocks are held for a very short
time, P2 will also loop only for a very short time.

-Gaurav

> > So why can't spin_trylock() be used in all places? Because in this
> > case P2 will return immediately with an error (meaning the lock
> > cannot be taken), and CPU2 could do some other useful work.
>
> IMHO, testing for the ability to grab the lock before actually
> grabbing the lock itself is good practice to avoid lock contention.

(A small sketch of this trylock pattern appears at the end of this
mail.)

> > > which cache? L1/L2 cache? page cache? internal disk cache?
> >
> > I got a very nice link regarding this.
>
> Great. Reading Linux Device Drivers, 3rd edition (available online)
> will also provide valuable hints.
>
> regards
>
> Mulyadi

--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive: http://mail.nl.linux.org/kernelnewbies/
FAQ: http://kernelnewbies.org/faq/
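PS: the trylock pattern referred to above might look like this (a
minimal, untested sketch with a hypothetical lock and critical
section):

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(stats_lock);

/* Opportunistic update: if somebody else holds the lock, return and
 * do other useful work instead of burning CPU cycles spinning. */
static int try_update_stats(void)
{
        if (!spin_trylock(&stats_lock))
                return 0;       /* contended: caller retries later */

        /* short critical section goes here */

        spin_unlock(&stats_lock);
        return 1;
}

Note that this only helps when the caller genuinely has something
else to do and can come back later. Code that must enter the critical
section before it can make any progress has no choice but to wait,
which is why spin_trylock() cannot simply replace spin_lock()
everywhere.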