Re: some doubts on cache coherence

Prabhat Hegde <pubs.hegde@xxxxxxxxx> · Fri, 3 Feb 2006 13:30:38 +0530

Hello Mulyadi,
     Thanks for the reminder. I have added the list to CC. Now,

> If the bottom half works in process context, then yes, it is allowed
> to sleep. The example is workqueue.

> That situation might happen if the bottom half is accessing user space
> memory, which might be swapped out. But usually, bottom halves are
> accessing kernel space memory and since kernel space memory is never
> swapped out, the bottom halves can operate without sleeping.

        In your first line you mentioned that if the bh executes in process context (e.g. a workqueue), then it can sleep.
Now, if it is executing in process context, then it can access user space, and user space memory can be swapped out. So the bh can actually be put to sleep in this manner whenever there is a page fault.

> A small problem might arise if it needs to access memory located on
> ZONE_HIGHMEM (above 896 MB mark), but it is solvable via temporary
> mapping.
      What is this temporary mapping? Can you please explain? Is it done via the mmap() system call?

> Correct. This is especially true if the kernel is not compiled with
> kernel level preemption turned on.
      Now, if the kernel is compiled with preemption, what happens to the process P2? I assume it can be preempted only if an interrupt comes in or an exception occurs. But if some other process P3 wants to execute on CPU2, then it will be denied service. So isn't this inefficient?

There is another doubt. 
          To avoid locking, each CPU maintains per-CPU data structures, i.e. any global data is copied into a per-CPU data structure to prevent synchronisation problems on the global data structure. But if we reduce the synchronisation overhead, won't we introduce another overhead of updating all those per-CPU data structures whenever the global data structure changes? Or did this concept arise on the assumption that very few changes are made to those global data structures?

Example: routing table and flow cache in IPSec processing.
     Here we can say the flow cache is a subset of the routing cache. The policy search is made on the flow cache, which is a per-CPU data structure maintained by each CPU separately. Now, whenever the routing table changes, the corresponding values must be updated in the flow cache. Won't this update introduce overhead?

     So we are removing one overhead and introducing another.
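
     To make the question concrete, here is a minimal user-space sketch (plain C, not kernel code) of one common way to keep that update cost small: the writer does not touch the per-CPU copies at all, it only bumps a global generation number, and each CPU refills its own copy lazily when it notices the number has changed. All names here (global_route, global_gen, cpu_cache, update_route, lookup_route) are made up for illustration, and the sketch is single-threaded; it is not meant to describe how the real flow cache is implemented.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS_SKETCH 2            /* hypothetical CPU count */

/* Global, rarely changing data plus a generation number. */
static int global_route = 10;       /* stand-in for the routing table */
static atomic_uint global_gen;      /* bumped on every global change */

/* Per-CPU cached copy, tagged with the generation it was filled at. */
struct cpu_cache {
    int valid;
    unsigned int gen;
    int route;
};
static struct cpu_cache cache[NR_CPUS_SKETCH];

/* Writer: change the global data and bump the generation.
 * The per-CPU copies are NOT touched here. */
static void update_route(int new_route)
{
    global_route = new_route;
    atomic_fetch_add(&global_gen, 1);
}

/* Reader on a given CPU: use the local copy if its generation is
 * current, otherwise refill it from the global data (slow path). */
static int lookup_route(int cpu, int *was_hit)
{
    struct cpu_cache *c;
    unsigned int now = atomic_load(&global_gen);

    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    c = &cache[cpu];
    *was_hit = c->valid && c->gen == now;
    if (!*was_hit) {
        c->route = global_route;    /* real code would lock the table here */
        c->gen = now;
        c->valid = 1;
    }
    return c->route;
}
```

     With this scheme a change costs the writer one counter increment, and each CPU pays one slow-path refill the next time it looks up. So the overhead does not disappear; it moves to the first lookup after each change, which is cheap only if changes are rare compared to lookups.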

On 2/3/06, Mulyadi Santosa <mulyadi.santosa@xxxxxxxxx> wrote:
Hello Prabhat...

Please CC kernelnewbies too, like I do. Our conversation might
help someone in the future.

>     But can you sleep in your bottom half? Basically, the reason top and
> bottom halves were created is so that you acknowledge the
> interrupt in the top half and defer the other major work to the bh.

If the bottom half works in process context, then yes, it is allowed
to sleep. The example is workqueue.

>  This is what I assume whenever a h/w interrupt comes in:
> top half: acknowledge the interrupt, pass the interrupt to the
> registered interrupt handler, then schedule the bh for later and quit.
> bh: here the interrupt handler takes care of the real handling of
> that interrupt.
>      Is this how it happens?

You got the general concept right.

>  So in the bh execution, can you sleep? What if the page requested by
> the interrupt handler is not in memory?

That situation might happen if the bottom half is accessing user space
memory, which might be swapped out. But usually, bottom halves are
accessing kernel space memory and since kernel space memory is never
swapped out, the bottom halves can operate without sleeping. A small
problem might arise if it needs to access memory located on
ZONE_HIGHMEM (above 896 MB mark), but it is solvable via temporary
mapping.

> Spinlocks:
>      My doubt was this:
>    Consider CPU1 and CPU2 are two CPUs and processes P1 and P2 are
> two processes running on them respectively. P1 & P2 share the same
> data. So P1 holds a spinlock on that data and enters its cs. Now if

cs--> critical section?

> P2, running on CPU2, tries to take the same lock, then it keeps
> spinning for its whole time quantum. So what P2 is doing is,
> instead of returning, spinning and holding CPU2 while CPU2
> could be used to do some other work. But CPU2 thinks P2 is doing some
> useful work.

Correct. This is especially true if the kernel is not compiled with
kernel level preemption turned on.

>     So why can't spin_trylock() be used in all places? In that
> case P2 would return immediately with an error (meaning the lock could
> not be taken) and CPU2 could do some other useful work.

IMHO, testing whether the lock can be grabbed before actually
grabbing it is good practice to avoid lock contention.
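
A rough user-space sketch of the difference between spinning and trying once, using a C11 atomic_flag as a stand-in for a spinlock. This is not the kernel's spinlock_t, and every name in it (sketch_lock, sketch_spin_trylock, sketch_spin_lock, sketch_spin_unlock, bump_if_uncontended) is made up for illustration:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* User-space stand-in for a spinlock; not the kernel's spinlock_t. */
struct sketch_lock {
    atomic_flag held;
};

/* Try once: returns true if we got the lock, false if it is held. */
static bool sketch_spin_trylock(struct sketch_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

/* Spin until the lock is free: burns CPU for as long as it waits. */
static void sketch_spin_lock(struct sketch_lock *l)
{
    while (!sketch_spin_trylock(l))
        ;   /* busy-wait */
}

static void sketch_spin_unlock(struct sketch_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Caller pattern: do the protected work if the lock is free,
 * otherwise report failure so the caller can do something else. */
static bool bump_if_uncontended(struct sketch_lock *l, int *shared)
{
    assert(l && shared);
    if (!sketch_spin_trylock(l))
        return false;       /* lock busy: caller must retry later */
    (*shared)++;
    sketch_spin_unlock(l);
    return true;
}
```

Note that the try-once version only helps when the caller really has other work to do and a way to come back later. If it has nothing else to do, it ends up calling the trylock in a loop, which is just spinning again.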

> > > which cache? L1/L2 cache? page cache? internal disk cache?
>
>      I got a very nice link regarding this.

Great. Reading Linux Device Drivers, 3rd edition (available online) will
provide valuable hints too.

regards

Mulyadi