Re: Avoid race condition between fetch and repack/gc?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Mar 16, 2020 at 04:40:13PM -0700, Bryan Turner wrote:

> On Mon, Mar 16, 2020 at 10:27 AM Jeff King <peff@xxxxxxxx> wrote:
> >
> >   - you don't use config like core.packedGitLimit that encourages Git to
> >     drop mmap'd windows. On 64-bit systems, this doesn't help at all (we
> >     have plenty of address space, and since the maps are read-only, the
> >     OS is free to drop the clean pages if there's memory pressure). On
> >     32-bit systems, it's a necessity (and I'd expect a 32-bit server to
> >     run into this issue a lot for large repositories).
> 
> I could be wrong, but I'm going to _guess_ from the :7999 in the
> example output that the repository is hosted with Bitbucket Server.
> Assuming my guess is correct, Bitbucket Server _does_ set
> "core.packedgitlimit=256m" (and "core.packedgitwindowsize=32m", for
> what it's worth).

Thanks for letting me know. That would definitely explain the behavior
Andreas is seeing.

> Those settings are applied specifically because
> we've found they _do_ impact Git's overall memory usage when serving
> clones in particular, which is important for cases where the system is
> serving dozens (or hundreds) of concurrent clones.

I do think there's an open question here, but it's also really easy to
be misled by common metrics. If we imagine a repo with a 512MB packfile
and two scenarios:

  - Git mmap's the whole packfile at once, and accesses pages within
    that mmap

  - Git mmap's no more than 256MB at once, shifting its window around as
    necessary

and then we get a bunch of clones. Then I'd expect to see:

  - virtual memory size for those Git processes will be higher. But much
    of that will be shared pages with each other. Measuring something
    like Proportional Set Size (PSS) yields a more useful number.

  - the Resident Set Size (RSS) of those processes will also be higher,
    because the OS may be leaving pages in the mmap resident. However,
    in my experience this is often a sign that there _isn't_ memory
    pressure on the system. Because if there was, then infrequently used
    pages would be dropped by the OS (and they should be among the first
    to go, as they're by definition clean pages. Though they do
    presumably compete with other read-only disk cache).

Or another way to think about it: the mmap patterns don't change the
working set patterns of Git. They just change what the OS knows, and
possibly how it reacts. And that's the "open question" for me: does the
operating system react significantly differently under memory pressure
for a big mmap with infrequently accessed pages than it would for a
series of smaller maps. I don't know the answer. Our servers are all
Linux, and we've tended to just trust that the operating system's
page-level decisions are sensible.

But I don't have any real numbers to support that. We stopped using
core.packedGitLimit in 2012 (because of this issue), and everything has
been good enough since to not bother looking into it more. Memory
pressure for us is usually from actual heap usage (e.g., pack-objects
has a high per-object cost plus delta window size times max object
size).

> Of course, we don't always do a great job of re-testing
> once-beneficial settings later, which means sometimes they end up
> being based on outdated observations. Perhaps we should prioritize
> some additional testing here, especially on 64-bit systems. (We've
> been setting "core.packedgitlimit" since back when Bitbucket Server
> was called Stash and supported Git 1.7.6 on the server.)

It sounds like you don't have any recent numbers, either. :)

> That said, though, you note "core.packedgitlimit" is necessary on
> 32-bit servers and, unfortunately, we do still support Bitbucket
> Server on 32-bit OSes. Maybe we should investigate applying (or not)
> the flags depending on the platform. Sadly, that's not necessarily
> simple to do since just because the _OS_ is 64-bit doesn't mean _Git_
> is; it's pretty trivial to run 32-bit Git on 64-bit Windows, for
> example.

The default for core.packedGitLimit is already 256MB on 32-bit builds of
Git (based on sizeof(void *), so truly on the build and not the OS). So
if you just left it unset, it would do what you want.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux