Re: Excessive mmap [was Git server eats all memory]

Hi Avery,

Avery Pennarun <apenwarr@xxxxxxxxx> wrote:

> ... When you access some of the pages of the mmap'd file, the kernel
> will swap those pages into memory, which increases RSS.  This uses
> *real* memory on the system...

Thanks for the very clear explanations.

> Now, the kernel is supposed to be smart enough to release old pages
> out of RSS if you stop using them; it's no different from what the
> kernel does with any cached file data.  So it shouldn't be expensive
> to mmap instead of just reading the file.

How can the kernel release old pages? There does not seem to be any way
to tell the kernel that a given memory block is no longer needed.
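
The nearest candidate seems to be madvise(2). Below is a minimal sketch
of how such a hint might look, assuming Linux semantics and a mapping
that is only ever read. drop_mapped_range is a made-up name for
illustration, nothing in git is called that.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for madvise() when compiling in strict mode */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of git): tell the kernel that the pages
 * covering [addr, addr + len) are not needed for now.
 *
 * madvise() wants a page-aligned start address, so the start is rounded
 * down to a page boundary and the length is extended to match.
 * Returns 0 on success, -1 on error with errno set.
 */
int drop_mapped_range(void *addr, size_t len)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)addr & ~(page - 1);
    size_t span = len + (size_t)((uintptr_t)addr - start);

    return madvise((void *)start, span, MADV_DONTNEED);
}
```

For an unmodified file mapping the pages should simply be read back from
the file on the next access, so this is only safe on the assumption that
nothing has been written through the mapping.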

>> Looking some more into it today the bulk of the memory allocation
>> happens in write_pack_file in the following loop.
>>
>> for (; i < nr_objects; i++) {
>>    if (!write_one(f, objects + i, &offset))
>>        break;
>>    display_progress(progress_state, written);
>> }
>>
>> This eventually calls write_object. Here I am wondering whether the
>> unuse_pack function is doing its job. As far as I can tell it only
>> writes a null in memory, which I think is not enough to reclaim memory.
>
> What do you mean by the "memory allocation" happens here?  How are you
> measuring it?

I run top and look at the RES column. I put a printf before and after
the code block and watch the memory go up and up.
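
For a number that does not depend on watching top, the same figure can
be read from inside the process. This is a minimal sketch assuming Linux
and /proc; rss_kib is a hypothetical helper, not an existing git
function.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the resident set size of the calling
 * process in KiB, or -1 on failure. Linux only.
 *
 * The second field of /proc/self/statm is the number of resident pages.
 * It includes pages of mmap'd files that have been touched, not just
 * malloc'd memory.
 */
long rss_kib(void)
{
    long total_pages, resident_pages;
    FILE *fp = fopen("/proc/self/statm", "r");

    if (!fp)
        return -1;
    if (fscanf(fp, "%ld %ld", &total_pages, &resident_pages) != 2) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return resident_pages * (sysconf(_SC_PAGESIZE) / 1024);
}
```

Calling fprintf(stderr, "rss: %ld KiB\n", rss_kib()) before and after
the loop gives the same before/after comparison as the printf approach.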

>> I also looked at the use_pack function where the mmap is
>> happening. Would it be worth refactoring this function so that it uses
>> an index within a file instead of mmap?
>>
>> Unless I hear of a better idea I'll be trying that tomorrow...
>
> I wouldn't expect this to help, but I would be interested to hear if
> it does.

I got caught up with other things at work, but I think I'll be able to
try it on Friday.

> If the problem is simply that you're flooding the kernel disk cache
> with data you'll use only once, to the detriment of everything else on
> the system, then one thing that might help could be posix_fadvise:
>
>     posix_fadvise(fd, ofs, len, POSIX_FADV_DONTNEED);

Sounds interesting. I'll try sticking that in the unuse_pack function
on Friday.
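
Roughly what I have in mind, as a sketch only. release_window and its
parameters are hypothetical and do not match the real unuse_pack
signature. It also assumes, as far as I understand the Linux behaviour,
that the hint has little effect on pages that are still mapped, which is
why the window is unmapped first.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for posix_fadvise() when compiling in strict mode */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not git's unuse_pack): unmap a window of a pack
 * file, then hint that the cached file data behind it will not be
 * needed again.
 *
 * base/len describe the mapping, fd/ofs the file range it covered.
 * Returns 0 on success or an error number. Note that posix_fadvise()
 * returns the error number directly instead of setting errno.
 */
int release_window(void *base, size_t len, int fd, off_t ofs)
{
    if (munmap(base, len) < 0)
        return errno;
    return posix_fadvise(fd, ofs, (off_t)len, POSIX_FADV_DONTNEED);
}
```

Since the call is only advisory, the kernel is free to ignore it, so
whether RES actually goes down would still need to be measured.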

> On the other hand, perhaps a more important question is: why does git
> feel like it needs to generate entirely new packs for each person
> doing a clone on your system?  Shouldn't it be reusing existing ones
> and just streaming them straight out to the recipient?

Ah, interesting point. Two things make me suspect the mmap is not shared
between processes. One is that the mmap is done with the MAP_PRIVATE
flag, which according to the man page does not share the mapping between
processes. The second is that the mmap is done on a temporary file
created by odb_mkstemp, and I don't believe the name is identical
between the two processes.

Take care,
-- 
Ivan Kanis
http://kanis.fr

Nobody ever went broke underestimating the intelligence of the
American public.
    -- H L Mencken 
