Re: Why no trylock for read/write_bh?

Apologies if I get my terminology wrong or the explanation is
unclear; I tried my best.

On Mon, Apr 13, 2009 at 10:36 AM, Jeffrey Cao <jcao.linux@xxxxxxxxx> wrote:
> On 2009-04-12, Peter Teoh <htmldeveloper@xxxxxxxxx> wrote:
>> On Sun, Apr 12, 2009 at 11:25 AM, Jeffrey Cao <jcao.linux@xxxxxxxxx> wrote:
>
>> basic concept is that there 3 types of context:   process, hardware
>> interrupt, and software interrupt.
> If you think so, what's the difference between hardware interrupt context and
> software interrupt context?
> What I know is that there are only two types of contextes: process, and
> interrupt. Hardware interrupt handler and deferrable functions(including
> softirq and tasklete) run in interrupt context.
>

Well, hardware interrupt context is the state the CPU is in while
servicing a hardware IRQ, from the moment the interrupt is taken until
the handler returns with iret. The interrupted state (including the
flags register) is restored by the hardware on iret, not by software.

Software interrupt context is the kernel's softirq processing loop,
which works through the activity left outstanding by hardware
interrupt handlers.

And process context is when the kernel executes on behalf of a process
and owns that process's full pagetable address space (see below).

The Intel manuals also describe something called "software
interrupts" (the INT n instruction); that has nothing to do with
anything mentioned here.

>>
>> read this for further details (from LDDv3):
>>
>> http://lwn.net/images/pdf/LDD3/ch07.pdf
>>
>> within it explained what is a kernel thread.   So basically while
>> running process A in kernel mode, it can be intercepted by any kernel
>> threads, and these kernel threads therefore "DOES NOT HAVE PROCESS
>> CONTEXT", as they are not related to the currently running process A,
> If these threads does not have process context, then what context are they in?
> hardware interrupt context? software interrupt context? Or you'll create another
> new context named "thread context"?
>

Firstly, it is related to the Intel hardware MMU/pagetable/CR3
mechanism. In the Linux kernel design, the pagetables are not changed
on a transition from kernel to user space (or vice versa), because of
the performance cost. CR3 is changed only when there is a switch to a
process with a different address space; a kernel thread simply keeps
running on the previous task's pagetables (this is what is called
lazy TLB).

"No process context" actually means that the task_struct's mm field
is NULL. This means that CR3 is not changed from its previous value.
Therefore, whatever you read or write through user addresses goes to
the previous owner of the address space, which is why, when you do
things like copy_to_user() from a kernel thread, you are copying into
whichever arbitrary process happened to be running BEFORE the kernel
thread was switched in.
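
To make the borrowing concrete, here is a small user-space sketch of
the idea. It is only a model, not the kernel's code: the names
mm_model, task_model, loaded_mm and switch_to_model() are made up for
illustration, and loaded_mm merely stands in for the value of CR3.

```c
#include <stddef.h>

/* Toy stand-ins for the kernel structures (hypothetical names). */
struct mm_model {
    int id;
};

struct task_model {
    const char *name;
    struct mm_model *mm;        /* NULL for a kernel thread */
    struct mm_model *active_mm; /* address space in use while running */
};

/* Models CR3: the address space the CPU currently has loaded. */
static struct mm_model *loaded_mm;

/* Simplified model of the address-space part of a context switch. */
static void switch_to_model(struct task_model *prev, struct task_model *next)
{
    if (next->mm == NULL) {
        /* Kernel thread: borrow prev's address space, no CR3 reload. */
        next->active_mm = prev->active_mm;
    } else {
        /* Ordinary process: load its own address space. */
        next->active_mm = next->mm;
        loaded_mm = next->mm; /* models the CR3 write */
    }
}
```

In this model, switching from process A to a kernel thread leaves
loaded_mm pointing at A's address space, so any user-space access made
by the kernel thread would land in A.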

For example,

/*
 * Access another process' address space.
 * Source/target buffer must be kernel space,
 * Do not walk the page table directly, use get_user_pages
 */
int access_process_vm(struct task_struct *tsk, unsigned long addr,
void *buf, int len, int write)
{
        struct mm_struct *mm;
        struct vm_area_struct *vma;
        void *old_buf = buf;

        mm = get_task_mm(tsk);
        if (!mm)
                return 0;

The above (!mm) check means that access_process_vm() does nothing
when the target task tsk has no address space of its own, which is
the case for a kernel thread: there is no process context to access.

>> and therefore, must observe rules (described in ch07.pdf) like I/O,
>> sleeping etc. etc.
>>
>

For example, if you do "insmod xxx", your kernel module's init code
runs in the process context of the "insmod" process.

read this:

http://lwn.net/Articles/147782/

In PREEMPT_RT there is even a 4th type of context introduced!

You are right that there are two contexts. For example (executing
work in process context):

int execute_in_process_context(work_func_t fn, struct execute_work *ew)
{
        if (!in_interrupt()) {
                fn(&ew->work);
                return 0;
        }

        INIT_WORK(&ew->work, fn);
        schedule_work(&ew->work);

        return 1;
}
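
The "run it now or defer it" pattern in that function can be tried
out in user space. The following is only a sketch under my own
assumptions: fake_in_interrupt, run_or_defer(), drain_queue() and the
fixed-size queue are invented stand-ins for in_interrupt(),
execute_in_process_context() and the workqueue; none of them are
kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*work_fn)(void *arg);

struct work_item {
    work_fn fn;
    void *arg;
};

#define QUEUE_MAX 8

static struct work_item queue[QUEUE_MAX]; /* stand-in for a workqueue */
static size_t queue_len;
static bool fake_in_interrupt;            /* stand-in for in_interrupt() */

/* Returns 0 if fn ran immediately, 1 if it was deferred, -1 if full. */
static int run_or_defer(work_fn fn, void *arg)
{
    if (!fake_in_interrupt) {
        fn(arg); /* safe context: just call it */
        return 0;
    }
    if (queue_len == QUEUE_MAX)
        return -1;
    queue[queue_len].fn = fn;
    queue[queue_len].arg = arg;
    queue_len++;
    return 1;
}

/* Stand-in for the worker thread: runs deferred items later. */
static void drain_queue(void)
{
    for (size_t i = 0; i < queue_len; i++)
        queue[i].fn(queue[i].arg);
    queue_len = 0;
}

/* Example work function: increments the int that arg points to. */
static void bump(void *arg)
{
    ++*(int *)arg;
}
```

Calling run_or_defer(bump, &n) with fake_in_interrupt cleared runs
bump() at once; with it set, the call is only queued and runs when
drain_queue() is called later.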

Here it distinguishes between process context and no process context
just via in_interrupt(). Why is in_softirq() not checked separately?

/*
 * Are we doing bottom half or hardware interrupt processing?
 * Are we in a softirq context? Interrupt context?
 */
#define in_irq()                (hardirq_count())
#define in_softirq()            (softirq_count())
#define in_interrupt()          (irq_count())
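
A possible answer, as far as I understand it: all three macros look
at different bit fields of the same per-task counter, and irq_count()
masks the hardirq and softirq fields together, so in_interrupt() is
already true in softirq context. The user-space sketch below models
that. The bit widths, model_count and the model_*() helpers are my
assumptions for illustration only; the real layout differs between
kernel versions.

```c
/* Assumed bit layout, for illustration only. */
#define PREEMPT_BITS 8
#define SOFTIRQ_BITS 8
#define HARDIRQ_BITS 12

#define PREEMPT_SHIFT 0
#define SOFTIRQ_SHIFT (PREEMPT_SHIFT + PREEMPT_BITS)
#define HARDIRQ_SHIFT (SOFTIRQ_SHIFT + SOFTIRQ_BITS)

#define SOFTIRQ_MASK (((1UL << SOFTIRQ_BITS) - 1) << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK (((1UL << HARDIRQ_BITS) - 1) << HARDIRQ_SHIFT)

/* Amount added to the counter on entering each kind of context. */
#define SOFTIRQ_OFFSET (1UL << SOFTIRQ_SHIFT)
#define HARDIRQ_OFFSET (1UL << HARDIRQ_SHIFT)

/* Stand-in for the per-task preempt_count() value. */
static unsigned long model_count;

/* Non-zero only while a hardware interrupt is being serviced. */
static unsigned long model_in_irq(void)
{
    return model_count & HARDIRQ_MASK;
}

/* Non-zero only while the softirq field is non-zero. */
static unsigned long model_in_softirq(void)
{
    return model_count & SOFTIRQ_MASK;
}

/* Non-zero in either case: both fields are masked together. */
static unsigned long model_in_interrupt(void)
{
    return model_count & (HARDIRQ_MASK | SOFTIRQ_MASK);
}
```

Adding SOFTIRQ_OFFSET to model_count makes both model_in_softirq()
and model_in_interrupt() non-zero while model_in_irq() stays zero, so
a single in_interrupt() test is enough to rule out both cases.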

Still trying to puzzle it out...

-- 
Regards,
Peter Teoh

--
To unsubscribe from this list: send an email with
"unsubscribe kernelnewbies" to ecartis@xxxxxxxxxxxx
Please read the FAQ at http://kernelnewbies.org/FAQ


