Re: a question of mmap() of files into memory

"Wang Yu" <wangyuict@xxxxxxxxx> · Mon, 24 Nov 2008 11:35:53 +0800

Oh.... As for mmap, it maps a file into a process's address space, so
that several processes can communicate with each other through the
shared, file-backed memory. A process can then access the file just like
ordinary memory, without using the read/write system calls.
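
A minimal sketch of that sharing, in Python for brevity (its mmap module
wraps the mmap() system call). The scratch file and the function name are
made up for illustration, and two mappings inside one process stand in
for two separate processes. It assumes Linux, where shared mappings and
read() both go through the same page cache.

```python
import mmap
import os
import tempfile


def shared_mapping_demo():
    """Write through one shared mapping, observe it through another."""
    # Hypothetical scratch file, created only for this demonstration.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello from disk")

        # Two independent mappings of the same file. On Unix the default
        # is MAP_SHARED with read/write protection; length 0 maps the
        # whole file.
        a = mmap.mmap(fd, 0)
        b = mmap.mmap(fd, 0)

        # A plain memory store through the first mapping...
        a[0:5] = b"HELLO"

        # ...is visible through the second mapping,
        seen_by_b = bytes(b[0:5])

        # and through an ordinary read() of the same file.
        os.lseek(fd, 0, os.SEEK_SET)
        seen_by_read = os.read(fd, 5)

        a.close()
        b.close()
        return seen_by_b, seen_by_read
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(shared_mapping_demo())
```

Between real processes the pattern is the same: each one opens the file
and maps it with MAP_SHARED.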

On Mon, Nov 24, 2008 at 10:58 AM, Peter Teoh <htmldeveloper@xxxxxxxxx> wrote:

On Mon, Nov 24, 2008 at 7:29 AM, MinChan Kim <minchan.kim@xxxxxxxxx> wrote:

> I'm so late :)

>

> On Sun, Nov 23, 2008 at 4:17 PM, Wang Yu <wangyuict@xxxxxxxxx> wrote:

>>

>>

>> On Sun, Nov 23, 2008 at 2:23 AM, Rik van Riel <riel@xxxxxxxxxxx> wrote:

>>>

>>> Peter Teoh wrote:

>>>>

>>>> When a process mmap()s a section of a file into its own process memory,
>>>> the process memory will maintain a copy of the data of that section of
>>>> the file.

>>>

>>> No, it does not maintain a copy.

>>>

>>> It mmaps the page cache pages into its own address space.

>>

>>

>>    According to your explanation, the flow is: physical file (on disk) -->
>> page cache (in memory, but in kernel space) --> process memory (in memory,
>> but in user space). Is that right? I am not sure....

>>>

>

> Yes. When the kernel finds no page mapping, it allocates a new page in the
> page cache, copies the on-disk data into that new page, and then maps the
> new page into the user-space address range.
> So there is never any duplication.

>

>>>> So... does duplicated buffering exist? (One copy in the kernel, the
>>>> page cache, and one in userspace, for the mmap()ed content of the file
>>>> in process memory.)

>>>

>>> No, there is no such double buffering.

>>

>>    But what is the difference? Why does Linux do it this way?

>

> With the read system call (except with O_DIRECT), the data is duplicated
> between the user buffer and the page cache.
> The read/write system calls present the pages as a file, so you always
> need a user buffer.
> Think about it: if you want to read some data, you first need somewhere
> to put it, and that is the user buffer.
> The mmap system call presents the pages as memory, so you can work on the
> file with ordinary memory operations and without a user buffer, which
> means it has no duplication overhead.
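
A rough sketch of that difference, again in Python and again assuming
Linux (os.pread and os.pwrite are Unix-only). The scratch file and the
function name are made up for illustration. The buffer filled by a read
is a private copy that goes stale when the file changes, while the
mapping is a view of the page cache itself and shows the new contents.

```python
import mmap
import os
import tempfile


def read_vs_mmap():
    """Contrast a read() copy with an mmap() view of the same file."""
    # Hypothetical scratch file, created only for this demonstration.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"AAAA")

        # Shared, read/write mapping of the whole file.
        m = mmap.mmap(fd, 0)

        # read path: the kernel copies 4 bytes from the page cache into
        # a separate user buffer.
        user_buf = os.pread(fd, 4, 0)

        # Now change the file contents with an ordinary write.
        os.pwrite(fd, b"BBBB", 0)

        # The mapping has no second copy, so it sees the new data;
        # the earlier user buffer still holds the old bytes.
        via_mapping = bytes(m[0:4])

        m.close()
        return user_buf, via_mapping
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(read_vs_mmap())
```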

I see. So you are saying that read() duplicates the buffer between
userspace (user buffer) and kernel (page cache), but for an mmap()
operation, since there is no duplication, every access from the user
process will immediately trigger a context switch to read in the data
from ring0, right?

Since you normally read in blocks of data using read(), and access data
byte-wise through mmap()'s pointer, read() is much more efficient, as it
triggers far fewer context switches than mmap()'s way of accessing
kernel memory through a pointer?

Performance-wise, will mmap() perform worse than read()?

Thanks.

--

Regards,

Peter Teoh

Ernest Hemingway - "Never mistake motion for action."

-- 
National Research Center for Intelligent Computing Systems
Institute of Computing Technology, Chinese Academy of Sciences