On Thu, Jun 17, 2010 at 07:30:53PM +0300, Sudipta GHOSH wrote:
> Hi,
>
> Memory management , scheduling is part of core kernel.
>
> Is it a process or special code resides in RAM?
>
> As I see init process has PID 0, so the kernel code is a process or
> special code.
>
> when there is an interrupt, device driver executes some code, in which context?
>
> How data from userland to kernel space is transferred (user process to driver)
>
> Thanks,
> -S

So, the kernel as a whole is not a process. Better consider it as a raw
bunch of code that runs in a specific ring (ring 0) and that can be
executed in many contexts.

The kernel code may be executed in the following contexts:

- task
- irq
- exception

= task =

There are two ways for the kernel to execute a path in a task context.

The first is the syscall path. When you are in userspace and you want to
execute a syscall, you call a specific arch instruction that makes you
enter the kernel (in x86, this path starts at arch/x86/kernel/entry_32.S,
search ENTRY(system_call)). There you will execute some kernel code that
services the userspace request (open a file, read, etc.).

Say firefox enters the kernel to open a file: things are executed in the
context of the firefox task. There are some differences compared to
userspace, for example you use a kernel stack there.

So, for example, most of a driver's code is reached through syscalls and
therefore runs in task context.

The other way for the kernel to run in task context is through kernel
threads. Those are internal tasks that do specific jobs. For example,
there is the idle task, which is the fallback task when there is nothing
else to do (no other task wants the cpu). The workqueues are another
example: they execute works that need to be done asynchronously. For
example, when an irq needs to do something that might sleep, it queues a
work there.

Now concerning the scheduler, most parts of it are executed in task contexts.
If you do a context switch, for example task x -> task y, the first half
of the context switch is executed by task x, the second half by task y.
When you fork, you execute in the parent context (the child will get its
turn on a later context switch). When you wake up a task, you execute in
the context of the waker (the wakee will again get its turn on a later
context switch).

There are some more particular cases. For example, parts of the SMP load
balancing are done from the idle task: when there is nothing left to do
on a cpu, idle will execute some scheduler code to pull tasks from other
cpus.

= irq =

Irqs can interrupt any task context. In fact they can interrupt any
context that doesn't have irqs masked. And irqs are a specific context:
they use an irq specific stack, etc.

So the small part of a driver that executes in irq context is its irq
handler.

softirqs are a specific case. They are somehow artificially created
interrupts. They can be executed in two different contexts:

- at the end of an irq (a hardirq). Then they use the irq stack (but
  maybe some archs use a softirq specific stack, I don't know).
- in the context of a task (the ksoftirqd task). This only happens if
  servicing softirqs takes too much time at the end of a hardirq, and we
  then want to defer the rest of the softirq work a bit. For that we
  kick the ksoftirqd task, which takes over the rest.

There are parts of the scheduler load balancing that are done from
softirq.

= exception =

Exceptions happen when userspace or kernelspace executes something that
traps (page fault, breakpoint, ...). That too uses a specific stack (at
least in x86-64), but it executes in the context that triggered the
exception.

> Memory management , scheduling is part of core kernel.
>
> Is it a process or special code resides in RAM?

So it depends. As outlined above, the scheduler code can be executed in
different contexts.
Memory management is about the same: the memory structures of a child
are allocated on fork (from the parent), or on exec (from the task that
execs). Later on, page faults are serviced from an exception (usually in
the context of a task).

But in fact memory management also has its own threads. kswapd can be
kicked to swap memory out when needed. The writeback also has its own
threads, etc.

So don't consider the memory management or the scheduler as tasks or
irqs (although parts of them may use irqs or specific tasks). Rather
consider them as "libraries": this is what they are, except for some
standalone parts of them. In fact this is the same for the whole kernel:
it is mostly a big library, but also with some standalone parts.

> How data from userland to kernel space is transferred (user process to driver)

When the kernel accesses userspace data, this is in the context of a
task (mostly, it can also be from irqs), but the userspace memory of
this task is pageable. And the address in the userspace pointer can be a
bad one.

We use copy_from_user() and copy_to_user() to handle that.

- if the pointer points to a page that is in memory, it's fine
- if the memory pointed to is swapped out, there is a page fault, and
  the page is retrieved, it's fine
- if it's a bad pointer, there will be a page fault, but it won't crash
  because copy_from/to_user will tell that it can handle this page fault
  (by setting a specific fixup for it), and it will do so by returning
  an error.

If this is done from irq, we can't sleep, so we can't play with
swapping. In this case we are only able to fetch user memory if it is
not on the swap.
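
The bad pointer case can be observed from userspace. Below is a minimal
sketch, not kernel code: it hands the kernel a pointer into a page with
no access rights and looks at what the syscall returns. The helper names
kernel_read_of_user_buffer() and make_bad_pointer() are made up for this
example, and it assumes a Linux system where write() on a pipe copies
the user buffer with a copy_from_user() style helper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a kernel API): make the kernel read `len`
 * bytes from `buf` by writing them into a throwaway pipe.
 * Returns 0 if the kernel could copy the data, or the errno value
 * reported by the failing call otherwise.
 */
static int kernel_read_of_user_buffer(const void *buf, size_t len)
{
    int fds[2];
    int err = 0;

    if (pipe(fds) != 0)
        return errno;

    /* The kernel has to copy `len` bytes from `buf` into the pipe. */
    if (write(fds[1], buf, len) < 0)
        err = errno;

    close(fds[0]);
    close(fds[1]);
    return err;
}

/*
 * Hypothetical helper: reserve one page with no access rights, so that
 * any address inside it is a bad pointer for reads and writes.
 * Returns NULL if the mapping cannot be created.
 */
static void *make_bad_pointer(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, (size_t)page, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}
```

With a normal buffer, kernel_read_of_user_buffer() should return 0. With
the pointer from make_bad_pointer() it should return EFAULT: the page
fault taken inside the kernel is turned into an error for the caller
instead of a crash.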