On Wed, Jul 14 2021, Taylor Blau wrote:

> On Thu, Jul 15, 2021 at 12:46:47AM +0800, Sun Chao wrote:
>> > Stepping back, I'm not sure I understand why freshening a pack is so
>> > slow for you. freshen_file() just calls utime(2), and any sync back
>> > to the disk shouldn't need to update the pack itself, just a couple
>> > of fields in its inode. Maybe you could help explain further.
>> >
>> > [ ... ]
>>
>> The reason why we want to avoid freshening the mtime of ".pack" files
>> is to improve the reading speed of our Git servers.
>>
>> We have some large repositories on our Git servers (some are bigger
>> than 10GB), and we created '.keep' files for the large ".pack" files.
>> We want the big files to stay unchanged to speed up git upload-pack,
>> because in our mind the file system cache will reduce the disk IO if a
>> file has not changed.
>>
>> However, we find the mtime of ".pack" files changes over time, which
>> makes the file system always reload the big files. That takes a lot of
>> IO time, results in a lower speed of git upload-pack, and can even
>> exhaust the disk IOPS.
>
> That's surprising behavior to me. Are you saying that calling utime(2)
> causes the *page* cache to be invalidated and that most reads are
> cache-misses, lowering overall IOPS?
>
> If so, then I am quite surprised ;). The only state that should be
> dirtied by calling utime(2) is the inode itself, so the blocks referred
> to by the inode corresponding to a pack should be left intact.
>
> If you're on Linux, you can try observing the behavior of evicting
> inodes, blocks, or both from the disk cache by changing "2" in the
> following:
>
>     hyperfine \
>       'git pack-objects --all --stdout --delta-base-offset >/dev/null' \
>       --prepare='sync; echo 2 | sudo tee /proc/sys/vm/drop_caches'
>
> where "1" drops the page cache, "2" drops the inodes, and "3" evicts
> both.
>
> I wonder if you could share the results of running the above varying
> the value of "1", "2", and "3", as well as swapping the `--prepare` for
> `--warmup=3` to warm your caches (and give us an idea of what your
> expected performance is probably like).

I think you may be right narrowly, but wrong in this context :)

I.e. my understanding of this problem is that they have some incremental
backup job, e.g. rsync without --checksum (not that using --checksum
would help; chicken & egg issue). So by changing the mtime you cause the
file to be re-synced.

Yes, Linux (or hopefully any modern OS) isn't so dumb as to evict your
FS cache because of such a metadata change, but that's beside the point.
If you have a backup job like that, your FS cache will get evicted or be
subject to churn anyway, because you'll shortly be dealing with the
"rsync" job that's noticed the changed mtime competing for caching
resources with "real" traffic.

Sun: does that summarize the problem you're having?

<large digression ahead>

Sun, also: note that in general doing backups of live git repositories
with rsync is a bad idea, and will lead to corruption.

The most common cause of such corruption is that a tool like "rsync"
will iterate recursively through, say, "objects" followed by "refs". So
by the time it gets to the latter (or is doing a deep iteration within
those dirs), git's state has changed in such a way as to yield an rsync
backup in a state that the repository was never in.
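If you do have such rsync copies around, a quick sanity check on one of
them is git's own connectivity check (the path below is hypothetical):

    # Reports objects that are missing or broken, e.g. refs that were
    # copied late and point at objects which never made it into the copy:
    git -C /backup/repo.git fsck --full

A clean fsck only tells you the copy is self-consistent, not that it
matches any state the source repository was ever actually in.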
As an aside, I've often wondered what it is about git exactly that makes
people who'd never think of doing the same thing to the FS part of an
RDBMS's data store think that implementing such an ad-hoc backup
solution for git would be a good idea, but I digress. Perhaps we need
scarier, BerkeleyDB-looking names in the .git directory :)

Even if you do FS snapshots of live git repositories you're likely to
get corruption; search this mailing list for references to fsync(), e.g.
[1].

In short, git has historically been (and still is) sloppy about
fsync(), and has relied on non-standard behavior such as "if I do N
updates for N=1..100 and fsync just number 100, then I can assume 1..99
are synced" (spoiler: you can't assume that).

Our use of fsync is still broken in that sense today; git is not a safe
place to store your data in the pedantic POSIX sense. And no, I don't
just mean that core.fsyncObjectFiles is `false` by default; that setting
only covers a small part of this, e.g. we don't fsync dir entries even
with it enabled.

On a real live filesystem this is usually not an issue, because if
you're not dealing with yanked power cords (and even then, journals
might save you), then even if you fsync a file but don't fsync the dir
entry it's in, the FS is usually forgiving about such cases. I.e. if
someone makes a concurrent request for the could-be-outdated dir entry,
they'll be served the up-to-date one, even without that having been
fsync'd, because the VFS layer isn't going to the synced disk; it's
checking its current state and servicing your request from that.

But at least some FS snapshot implementations have a habit of exposing
the most pedantic interpretation possible of FS semantics, one that you
wouldn't ever get on a live FS. I.e. you might be hooking into the
equivalent of the order in which things are written to disk, and end up
with a state that would never have been exposed to a running program
(there would be a 1:1 correspondence if we fsync'd properly, which we
don't).

The best way to get backups of git repositories you know are correct is
to use git's own transport mechanisms, i.e. to fetch/pull the data, or
to create bundles from it. That would be the case even if we fixed all
our fsync issues, because doing so wouldn't help you in the case of a
bit-flip, but an "index-pack" on the other end will spot such issues.

1. https://lore.kernel.org/git/20200917112830.26606-2-avarab@xxxxxxxxx/
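To make the "use git's own transport" suggestion concrete, here's a
minimal sketch; the URL and paths are made up, adjust to taste:

    # One-time full backup clone; going through a URL (or --no-local)
    # means the objects arrive via the normal transport, where
    # index-pack checks what it receives:
    git clone --mirror https://git.example.com/repo.git /backup/repo.git

    # Subsequent incremental updates of that mirror:
    git -C /backup/repo.git remote update --prune

    # Or: a single-file backup as a bundle, which you can verify now and
    # clone or fetch from later:
    git -C /srv/git/repo.git bundle create /backup/repo.bundle --all
    git bundle verify /backup/repo.bundle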