File/IO discussion - request for comments

  In a recent O'Reilly Press book "Java NIO" by R. Hitchens, pages 9-13 deal with generic Unix
file I/O. Although the argument is garbled, it implies that _all_ file reads are accomplished
through demand paging generated by the pagefault handler.

  I am pretty sure that this is not the case in linux and have written the following outline
to explore this.

  Caveat: we assume that memory pagesize and the fs block size are both 4K, and also that no 
system error occurs and that errno remains zero throughout.

  All regular file I/O in linux 2.6.1 goes one of two ways: through the read/write family
of syscalls, or through direct memory operations in userspace on mmapped files. Both methods
utilize the page cache.

FIRST ALTERNATIVE: read() syscall

  The system receives a file descriptor, a count and a userspace buffer address; the offset is
the descriptor's current file position (pread() takes it as an explicit argument). When the
syscall returns it has copied n bytes to the buffer, where 0 <= n <= count. How many bytes are
copied depends on the state of the pagecache, together with the blocking/nonblocking mode of
the file.
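
  Since the return value can be anything from 0 to count, a caller that needs the whole range
has to loop. The following is a minimal userspace sketch of that, not anything taken from the
kernel or from the book; read_fully is a hypothetical helper name, and it uses pread() so that
the offset is explicit.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to `count` bytes starting at `offset`,
 * retrying after short reads. Returns the total number of bytes copied
 * (less than `count` only if end of file was reached), or -1 with errno
 * set on a real error. */
ssize_t read_fully(int fd, void *buf, size_t count, off_t offset)
{
    size_t done = 0;

    while (done < count) {
        ssize_t n = pread(fd, (char *)buf + done, count - done,
                          offset + (off_t)done);
        if (n > 0) {
            done += (size_t)n;   /* partial progress: 0 < n <= remaining */
        } else if (n == 0) {
            break;               /* end of file */
        } else if (errno != EINTR) {
            return -1;           /* real error; EINTR is simply retried */
        }
    }
    return (ssize_t)done;
}
```

  For example, read_fully(fd, buf, 20000, 10000) on a 30,000 byte file should return 20000,
however many pread() calls it takes to get there.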

  To start with, the syscall examines the page cache to see if any of the requested pages are
there. So, if the read offset was 10,000 and the count 20,000 (bytes 10,000 through 29,999), then
the system tries to find the pages containing fs blocks 2 through 7 - bytes 8192 through 32767.
The amount of data found in the page cache interacts with the blocking/nonblocking mode of the
file as follows:

    Blocking read:
      No pages found: queue I/O and sleep on queue.
      Less than all pages found:
         Lock found pages in memory.
         Queue I/O for remaining pages and sleep on queue.
      All pages found:
         Copy the request from the found pages to the buffer.
         Return the count.

    Nonblocking read:
      No pages found: return 0.
      Less than all pages found:
         The found pages contain some initial portion of the request:
            Copy that initial portion from the found pages to the userspace buffer.
            Return the number of bytes copied.
         No initial portion found: return 0
      All pages found:
         Copy the request from the found pages to the buffer.
         Return the count.

  Notwithstanding Hitchens's claim, nothing in the above has to do with pagefaults, except in
the case where the pagetable entries for the userspace buffer are marked 'not present'. Moreover
no pagefaults can occur in kernel space except on kernel page allocations (..???...)

SECOND ALTERNATIVE: operations on mmapped non-anonymous memory

  This is in some ways the opposite of the above.  The syscall takes place entirely in kernel 
space while the memory operations in this alternative are nominally entirely in user space; 
moreover, the syscall method may have to deal with multiple pages but the memory mapped method 
deals with only a single page at a time.

Page is present: do memory ops in user space.
Page is not present:  pagefault handler utilizes fs 'readpage' method. Resume in user space.

  So demand paging does indeed take place in the case of a mapped memory address with a page 
marked 'not present' - but not otherwise and most emphatically not in _every_ case of filesystem 
I/O.
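
  The mmapped alternative can be observed from userspace. The sketch below maps a file, touches
one byte per page with ordinary loads, and uses getrusage() to count the pagefaults taken while
doing so. touch_mapped_pages is a hypothetical name. The fault count is only a rough indicator:
the kernel may map several neighbouring pages on one fault, and pages already in the page cache
show up as minor rather than major faults.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: sum the first byte of every page of `path` through
 * a read-only private mapping. *faults receives the number of pagefaults
 * (minor + major) the process took during the loop. Returns the sum, or
 * -1 on failure (including an empty file, which cannot be mapped). */
long touch_mapped_pages(const char *path, long *faults)
{
    struct rusage before, after;
    struct stat st;
    unsigned char *p;
    size_t len, off;
    long page, sum = 0;
    int fd;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    len = (size_t)st.st_size;

    p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                    /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    page = sysconf(_SC_PAGESIZE);
    getrusage(RUSAGE_SELF, &before);
    for (off = 0; off < len; off += (size_t)page)
        sum += p[off];            /* plain load; faults if the page is not present */
    getrusage(RUSAGE_SELF, &after);

    *faults = (after.ru_minflt - before.ru_minflt)
            + (after.ru_majflt - before.ru_majflt);

    munmap(p, len);
    return sum;
}
```

  No read() call appears anywhere in the loop; whatever I/O happens is started from the
pagefault handler, which is the one case where the demand paging description applies.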






=====
Carl Spalletta
--
Proposition:
  The will of God is always and everywhere continuously being done.
Proof by contradiction:
  If this were not so, then sooner or later it would be MY turn.
Q.E.D.


--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive:       http://mail.nl.linux.org/kernelnewbies/
FAQ:           http://kernelnewbies.org/faq/

