On 10/22/07, Shriramana Sharma <samjnaa@xxxxxxxxx> wrote:
> Steve Graegert wrote:
> > your questions can be answered in very great detail and fill whole
> > books (which has actually happened), so please forgive my brief
> > answers.
>
> Brief answers are sufficient. Thank you. :)
>
> > and the init process (PID 1) separately. The scheduler then takes
> > control over the system as soon as the kernel goes idle and init
> > starts the /etc/rc.d/rc.sysinit script.
>
> So would the kernel be "reawakened" by some external event occurring, as
> Glynn wrote?

Yes, exactly.

> I am thinking like -- the kernel is a body of instructions,
> just like any other program. So like I do something (type "cp" at a
> terminal, for ex) to have the program called "cp" to be processed --
> that is, the instructions in it are executed. Similarly when a system
> call is made by any program, the instructions contained in the kernel
> would be executed. Is this understanding right?

That is basically correct.

> > When a system call is invoked, the process which has invoked it is
> > interrupted and all the information needed to continue this process is
> > stored for later use. Now, the process executes higher privileged
> > code that determines what to do next from examining the stack of the
> > less privileged code. It contains information about the request to be
> > served by the kernel. When it is finished, control is returned to
> > the program, restoring the saved state and continuing program
> > execution.
>
> So technically the kernel instructions would be processed "in" the same
> process and thread that makes the system call, albeit with different
> privilege, right? Or is the process *making the system call* totally
> "paused", as in a "wait" situation when one process waits for another to
> finish?

Well, no. Whenever a user program is invoked a new process is created.
Nevertheless, the system call is handled in the kernel's address space,
which is entered by a switch from user mode to kernel mode (int 0x80).
Strictly speaking this is a mode switch rather than a "context switch":
the kernel code runs on behalf of the calling process. When a process
requests the kernel to do something which is currently impossible but
that may become possible later, the process is put to sleep and is woken
up when the request is more likely to be satisfied. One of the kernel
mechanisms used for this is called a "wait queue".

When a userspace application makes a system call, the arguments are
passed via registers and the application executes the "int 0x80" (x86)
instruction. This causes a trap into kernel mode and the processor jumps
to the system_call entry point in entry.S. What this does is:

 1. Save registers.
 2. Set %ds (data segment) and %es (extra segment) to KERNEL_DS,
    allowing references to be made in the kernel address space.
 3. If the value of %eax (which holds the number of the requested
    system call) is greater than or equal to NR_syscalls (max no. of
    system calls, currently 256), fail with an ENOSYS error.
 4. Call the handler found in sys_call_table at the index given by the
    system call number in %eax. This table (arch/i386/kernel/entry.S)
    points to the individual system call handlers, which will find
    their arguments on the stack.
 5. Enter the so-called "system call return path", checking if a call
    to schedule() is needed, and checking for pending signals and, if
    there are any, handling them.
 6. Check for errors

> Do I also understand right that the kernel resides in a special section
> of memory the boot loader puts it in? The core of the kernel (apart from
> dynamically loadable kernel modules) is the vmlinuz file sitting in my
> /boot dir, right?

Short answer: yes. Long answer: Linux actually implements a two-stage
boot process. In the first stage, the BIOS loads the boot program
(Initial Program Loader) from the hard disk into memory.
In the second stage, the boot program loads the operating system kernel
vmlinuz into memory and uncompresses it, since vmlinuz is a compressed
kernel image (protected mode). When the kernel loads into memory, it
performs a memory test. Most of the kernel data structures are
initialized statically, so it sets aside a part of memory for kernel
use. This part of the memory cannot be used by any other process. It
also reports the total amount of physical memory available and leaves
the remainder for the user processes.

Hope that was helpful.

	\Steve

--
Steve Grägert
DigitalEther.de