Re: File/IO discussion - request for comments

On Sun, Feb 01, 2004 at 15:52:01 -0800, Carl Spalletta wrote:
>   In a recent O'Reilly Press book "Java NIO" by R. Hitchens, pages 9-13 deal with generic Unix
> file I/O; although the argument is garbled it is implied that _all_ file I/O type reads are
> accomplished through demand paging generated by the pagefault handler.
> 
>   I am pretty sure that this is not the case in linux and have written the following outline
> to explore this.
> 
>   Caveat: we assume that memory pagesize and the fs block size are both 4K, and also that no 
> system error occurs and that errno remains zero throughout.
> 
>   All regular file I/O in linux 2.6.1 takes one of two alternatives: through the read/write family
> of syscalls or through direct memory operations in userspace on mmapped files. Both methods 
> utilize the page cache.
> 
> FIRST ALTERNATIVE: read() syscall
> 
>   The system receives a file descriptor, an offset, a count and a userspace buffer address. When
> the syscall returns it has copied n bytes, where 0 <= n <= count, to the buffer. How many bytes
> are copied depends on the state of the pagecache, together with the blocking/nonblocking mode
> of the file.
> 
>   To start with, the syscall examines the page cache to see if any of the requested pages are there.
> So, if the read offset was 10,000 and the count 20,000 then the system tries to find the pages
> containing fs blocks 2 thru 7 - bytes 8192 through 32767. The amount of data found in the page
> cache interacts with the blocking/nonblocking mode of the file as follows:
> 
>     Blocking read:
>       No pages found: queue I/O and sleep on queue.
>       Less than all pages found:
>          Lock found pages in memory.
>          Queue I/O for remaining pages and sleep on queue.
>       All pages found:
>          Copy the request from the found pages to the buffer.
>          Return the count.
> 
>     Nonblocking read:
>       No pages found: return 0.
>       Less than all pages found:
>          The found pages contain some initial portion of the request:
>             Copy that initial portion from the found pages to the userspace buffer.
>             Return the number of bytes copied.
>          No initial portion found: return 0
>       All pages found:
>          Copy the request from the found pages to the buffer.
>          Return the count.
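
The block arithmetic in the quoted example can be checked with a small sketch. It assumes the 4K block size from the caveat above; the names BLOCK_SIZE, struct block_range and blocks_for_read() are made up for illustration and are not kernel identifiers.

```c
#include <assert.h>

/* Assumption taken from the caveat: page size == fs block size == 4K. */
#define BLOCK_SIZE 4096UL

/* Hypothetical helper type: the blocks (and byte span) a read touches. */
struct block_range {
    unsigned long first_block;
    unsigned long last_block;
    unsigned long first_byte;   /* first byte of first_block */
    unsigned long last_byte;    /* last byte of last_block   */
};

/* Compute which blocks cover the byte range [offset, offset + count). */
static struct block_range blocks_for_read(unsigned long offset,
                                          unsigned long count)
{
    struct block_range r;

    assert(count > 0);          /* an empty read touches no block */
    r.first_block = offset / BLOCK_SIZE;
    r.last_block  = (offset + count - 1) / BLOCK_SIZE;
    r.first_byte  = r.first_block * BLOCK_SIZE;
    r.last_byte   = (r.last_block + 1) * BLOCK_SIZE - 1;
    return r;
}
```

Calling blocks_for_read(10000, 20000) gives blocks 2 thru 7, which cover bytes 8192 through 32767.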

There is no non-blocking read from disk!
There is only an aio_read, which is a different syscall.
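
A rough way to see this from userspace is to open a regular file with O_NONBLOCK and read from it: the flag is accepted, but read() still returns the data rather than failing with EAGAIN. The sketch below is only an illustration; the function name and the temporary file path are made up, and it assumes a writable /tmp.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical demo: write payload to a temporary regular file, reopen it
 * with O_NONBLOCK and read it back. Returns the number of bytes read,
 * or -1 on any error.
 */
static ssize_t nonblocking_read_regular_file(const char *payload,
                                             char *buf, size_t buflen)
{
    char path[] = "/tmp/nbread-XXXXXX";   /* assumed writable location */
    size_t len;
    ssize_t n;
    int fd;

    assert(payload != NULL && buf != NULL);
    len = strlen(payload);

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (write(fd, payload, len) != (ssize_t)len) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    /* O_NONBLOCK is accepted here, but has no effect on a regular file. */
    fd = open(path, O_RDONLY | O_NONBLOCK);
    unlink(path);                         /* file lives until fd is closed */
    if (fd < 0)
        return -1;

    n = read(fd, buf, buflen);            /* returns data, not EAGAIN */
    close(fd);
    return n;
}
```

With a buffer at least as large as the payload, the call returns the full payload length, the same as it would without O_NONBLOCK.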

>   Notwithstanding Hitchens's claim, there is nothing in the above that has to do with pagefaults
> except in the case where the pagetable(s) for the userspace buffer are marked 'not present'.  Moreover no
> pagefaults can occur in kernel space except on kernel page allocations (..???...)

There are no page faults per se in the kernel. A page fault happens when a
page is accessed for which there is no entry in the page table -- and that
simply never happens in the kernel. However, the mechanism for loading pages
is the same for read as for a page fault.

> SECOND ALTERNATIVE: operations on mmapped non-anonymous memory
> 
>   This is in some ways the opposite of the above.  The syscall takes place entirely in kernel 
> space while the memory operations in this alternative are nominally entirely in user space; 
> moreover, the syscall method may have to deal with multiple pages but the memory mapped method 
> deals with only a single page at a time.
> 
> Page is present: do memory ops in user space.
> Page is not present:  pagefault handler utilizes fs 'readpage' method. Resume in user space.
> 
>   So demand paging does indeed take place in the case of a mapped memory address with a page 
> marked 'not present' - but not otherwise and most emphatically not in _every_ case of filesystem 
> I/O.
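
The mmap alternative can be sketched from userspace as follows. This is only an illustration under stated assumptions: the names faults_so_far() and touch_mapped_file() and the temporary path are made up, and the fault count reported by getrusage() is informational only, since it depends on what the kernel has already cached or mapped ahead.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor + major page faults taken by this process so far, or -1 on error. */
static long faults_so_far(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt + ru.ru_majflt;
}

/*
 * Hypothetical demo: write npages pages to a temporary file with write(),
 * map the file, then touch one byte per page through the mapping.
 * Returns 1 if the mapped contents match what was written, 0 if not,
 * -1 on error. *faults (if non-NULL) receives the number of page faults
 * observed while touching the mapping, or -1 if it could not be measured.
 */
static int touch_mapped_file(size_t npages, long *faults)
{
    char path[] = "/tmp/mmap-demo-XXXXXX";  /* assumed writable location */
    long psz = sysconf(_SC_PAGESIZE);
    unsigned char *map;
    long before, after;
    char *page;
    size_t i, len;
    int fd, ok = 1;

    assert(npages > 0);                     /* mmap() rejects length 0 */
    if (psz <= 0)
        return -1;
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* file lives until fd closes */

    page = malloc((size_t)psz);
    if (page == NULL) {
        close(fd);
        return -1;
    }
    for (i = 0; i < npages; i++) {          /* page i holds 'A' + i % 26 */
        memset(page, 'A' + (int)(i % 26), (size_t)psz);
        if (write(fd, page, (size_t)psz) != (ssize_t)psz) {
            free(page);
            close(fd);
            return -1;
        }
    }
    free(page);

    len = npages * (size_t)psz;
    map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                              /* the mapping stays valid */
    if (map == MAP_FAILED)
        return -1;

    before = faults_so_far();
    for (i = 0; i < npages; i++)            /* plain loads, no syscall */
        if (map[i * (size_t)psz] != 'A' + (int)(i % 26))
            ok = 0;
    after = faults_so_far();

    if (faults != NULL)
        *faults = (before < 0 || after < 0) ? -1 : after - before;
    munmap(map, len);
    return ok;
}
```

The loads in the touch loop are ordinary memory accesses; any page not yet present in the process page table is brought in by the pagefault handler, which is where demand paging enters in this alternative.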

-------------------------------------------------------------------------------
						 Jan 'Bulb' Hudec <bulb@ucw.cz>

--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive:       http://mail.nl.linux.org/kernelnewbies/
FAQ:           http://kernelnewbies.org/faq/