Re: [PATCH 1/2] Use git_open_noatime when accessing pack data

"Shawn O. Pearce" <spearce@xxxxxxxxxxx> writes:

> This utility function avoids an unnecessary update of the access time
> for a loose object file.  Just as the atime isn't useful on a loose
> object, it's not useful on the pack or the corresponding idx file.
>
> Signed-off-by: Shawn O. Pearce <spearce@xxxxxxxxxxx>

Hearing the name "git-open-noatime", one would naturally assume that it is
a way to open files without burdening the filesystem with inode metadata
update traffic, as that was the original reason why we open loose objects
without atime update.  Historically, we expected a great many loose
objects to be lying around, and this optimization made sense.
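
To make the atime point concrete, here is a rough sketch of an
atime-avoiding open.  It is an illustration only, not the code in the
tree or in the patch: open_noatime_sketch() is a made-up name, and it
assumes a Linux-style O_NOATIME flag (which the kernel refuses with EPERM
when the caller does not own the file, hence the fallback).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* glibc only exposes O_NOATIME with this defined */
#endif
#include <errno.h>
#include <fcntl.h>

#ifndef O_NOATIME
#define O_NOATIME 0	/* flag unavailable: degrade to a plain read-only open */
#endif

/* Hypothetical helper: open read-only without touching the inode's atime. */
static int open_noatime_sketch(const char *path)
{
	static int extra_flag = O_NOATIME;
	int fd = open(path, O_RDONLY | extra_flag);

	/* The failure may be due to O_NOATIME itself; retry without it. */
	if (fd < 0 && errno != ENOENT && extra_flag) {
		fd = open(path, O_RDONLY);
		if (fd >= 0)
			extra_flag = 0;	/* stop asking for it from now on */
	}
	return fd;
}
```

The saving is one inode update per opened file, which only adds up when
there are very many files to open.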

As any sane repository would have far fewer packfiles than loose objects,
one would think that, while it may not hurt, using git-open-noatime to
open packfiles is just a misguided performance measure.  Not.

This patch (and the next patch) adds "we unuse pack windows to retry
opening if we have too many files already open" logic.  That side effect
is far more important than what the name of this function says it does,
especially when the function is used for packfiles: they tend to stay
open for a long time, unlike loose object files, which are opened,
read/mapped, and then immediately closed.
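
The retry behaviour described above could look roughly like the sketch
below.  This is my illustration, not the code from the patch: the
function name and the release callback (a stand-in for "unuse one pack
window and close its file") are hypothetical, and I am assuming that "too
many files already open" shows up as EMFILE.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/*
 * Hypothetical callback standing in for "unuse a pack window": it should
 * return nonzero if it closed something and thereby freed a descriptor.
 */
typedef int (*release_fd_fn)(void *ctx);

/* Open read-only; on EMFILE, free a descriptor and try again. */
static int open_ro_retry_sketch(const char *path, release_fd_fn release,
				void *ctx)
{
	for (;;) {
		int saved_errno;
		int fd = open(path, O_RDONLY);

		if (fd >= 0)
			return fd;
		/* Only "too many open files" is worth retrying. */
		if (errno != EMFILE)
			return -1;
		saved_errno = errno;
		if (!release || !release(ctx)) {
			errno = saved_errno;	/* nothing left to give up */
			return -1;
		}
	}
}
```

The loop ends either with a usable fd, with an error other than EMFILE,
or when the callback reports that nothing more can be released.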

Even though I think the issue you are solving is worth addressing, I do
not think I like the structure of the API resulting from these two
patches.  Most of the callers, except for the ones in check-packed-git-idx
and open-packed-git-1, do not care about the "keeping one packfile"
interface, so I would prefer to see a two-patch series along the lines of ...

 (1) introduce "int git_open_ro(const char *)" to replace the current
     git_open_noatime().  The point is that the function is no longer
     about avoiding smudging the inode metadata.  Instead, it becomes
     the preferred way for us to get a read-only fd.

 (2) call your git_open_noatime() implementation git_open_rowpf() or
     something.  Make git_open_ro() a thin wrapper around this function
     that passes NULL for its packed_git parameter.  The two callers that care
     about protecting a pack they are operating on will call this function
     directly.
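
In code, the shape I have in mind is roughly the following.  Only the
interface matters here: the parameter name "keep" is my invention, struct
packed_git is left opaque, and the body of git_open_rowpf() is a
placeholder that does a plain open() where your retry logic would go.

```c
#include <fcntl.h>
#include <stddef.h>

struct packed_git;	/* opaque in this sketch; defined in git's own headers */

/*
 * (2) The full variant.  "keep" (hypothetical name) is the pack that must
 * stay open while other descriptors are released.  Placeholder body: the
 * real function would retry after unusing pack windows.
 */
int git_open_rowpf(const char *name, struct packed_git *keep)
{
	(void)keep;
	return open(name, O_RDONLY);
}

/* (1) What most callers use: no pack to protect, so pass NULL. */
int git_open_ro(const char *name)
{
	return git_open_rowpf(name, NULL);
}
```

Most call sites would then read git_open_ro(path), and only the two
pack-aware callers would spell out the extra argument.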

We can of course do without s/git_open_noatime/git_open_ro/; and it will
make the patch much smaller.  The rename is purely a clarification of the
API and is optional.  It may make it easier to explain the name of the new
function, though.

By the way, I think I still owe you a patch to selectively pack-ref only
old ones.