On Tue, 22 Mar 2005 9:49:45 -0600, stackframe@xxxxxxxxxxx <stackframe@xxxxxxxxxxx> wrote:
> So is it in the case of the sys_write that the memory is
> accessed and allowed to page fault as needed while running
> in the context of the kernel? Most of the kernel books
> talk about using copy_from_user on system calls before
> using memory from the user. Strace shows one write
> call with the entire length.
>
> So what is going on to allow this to happen?

What filesystem did you use? Ext2 uses the generic_* functions in its
file_operations structure, and those are the ones I am a bit familiar
with. Ext3 uses some others, but I think the difference lies in
journalling, not in how userland memory is accessed.

As you can see in mm/filemap.c at line 1963 (kernel 2.6.11.1), the call
can return -ENOMEM if no page cache page can be allocated for the write.
A few lines further down, filemap_copy_from_user is used to copy the
data from user memory into the page cache. So in your case the kernel
was able to get at least one page to copy the data to, and the write
could proceed.

I presume the kernel simply won't give all the memory to user processes,
because it knows things would go bad rather soon if it did. However,
somebody who understands memory management has to confirm this.
(Moreover, I wouldn't be surprised if the kernel kept some memory in
reserve for root, so that he can kill memory-exploiting user processes
without getting -ENOMEM.)

HTH

Martin Jambor

--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive: http://mail.nl.linux.org/kernelnewbies/
FAQ: http://kernelnewbies.org/faq/
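
The behaviour discussed above can be poked at from user space. What follows is only a sketch, assuming Linux and a /tmp that holds an ordinary filesystem; the function name demo_write_from_mapping, the DEMO_LEN size and the temporary file name are made up for illustration and do not come from the kernel source. It passes write() a freshly mapped buffer whose pages have never been touched, so any page faults have to be resolved while the kernel copies the data. Passing a PROT_NONE mapping instead is expected to make the copy fail with EFAULT rather than crash anything.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical size for the experiment: 16 pages on a 4 KiB-page system. */
#define DEMO_LEN (16 * 4096)

/*
 * Map `len` bytes of anonymous memory with protection `prot`, never touch
 * it from user space, and hand it straight to write() on a temporary file.
 * Returns the number of bytes written, or -errno on failure.
 */
long demo_write_from_mapping(int prot, size_t len)
{
    char path[] = "/tmp/write_demo_XXXXXX";   /* illustrative name only */
    int fd = mkstemp(path);
    if (fd < 0)
        return -errno;
    unlink(path);                             /* file vanishes on close */

    void *buf = mmap(NULL, len, prot, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) {
        long err = -errno;
        close(fd);
        return err;
    }

    /* The pages behind buf are not populated yet; the kernel has to
     * fault them in (or fail) while copying from user memory. */
    ssize_t n = write(fd, buf, len);
    long result = (n < 0) ? -(long)errno : (long)n;

    munmap(buf, len);
    close(fd);
    return result;
}
```

Calling demo_write_from_mapping(PROT_READ | PROT_WRITE, DEMO_LEN) should normally report the full length in a single write, matching what strace showed, while demo_write_from_mapping(PROT_NONE, DEMO_LEN) should come back as -EFAULT.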