Re: git gc does not clean tmp_pack* files

Yes. If git's behavior on running out of disk space is to leave the
malformed file in place, it stands to reason that cleaning up those
malformed files should be the first thing gc does.
At the very least, git should tell the user that there are 20+ GB of
tmp_pack files under .git\objects\pack before it declares that it
can't write a single byte to a lock file because previous "git gc"
calls exhausted the disk space.
I know that on Windows a process can hold an exclusive write lock on a
file for as long as it is running, so at least on Windows those
tmp_pack files could be cleaned up on a "soft-try" basis without
affecting other running git processes. I'm not sure whether the same is
possible on the other supported OSes.
https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-lockfileex
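
To make the "notify the user" part concrete, here is a rough Python
sketch of the kind of report I have in mind. It is not how git does it:
the function name find_tmp_packs and the two-week default are my own
assumptions for illustration, with the mtime check standing in for the
grace period Peff describes below. It only lists files; it takes no
locks and deletes nothing.

```python
import time
from pathlib import Path


def find_tmp_packs(git_dir, grace_seconds=14 * 24 * 3600, now=None):
    """List tmp_pack* files under <git_dir>/objects/pack.

    Returns a list of (path, size_in_bytes, is_stale) tuples, where
    is_stale means the file's mtime is older than the grace period.
    Hypothetical helper: the name and the default are assumptions.
    """
    now = time.time() if now is None else now
    pack_dir = Path(git_dir) / "objects" / "pack"
    results = []
    if not pack_dir.is_dir():
        return results
    for entry in sorted(pack_dir.glob("tmp_pack*")):
        if not entry.is_file():
            continue
        st = entry.stat()
        # A file untouched for longer than the grace period is unlikely
        # to belong to a process that is still writing it.
        is_stale = (now - st.st_mtime) > grace_seconds
        results.append((entry, st.st_size, is_stale))
    return results


def summarize(git_dir):
    """Print one line per temporary pack, then the count and total size."""
    found = find_tmp_packs(git_dir)
    for path, size, stale in found:
        state = "stale" if stale else "recent"
        print(f"{size:>16,}  {state:<6}  {path.name}")
    total = sum(size for _, size, _ in found)
    print(f"{len(found)} file(s), {total:,} bytes")
```

Calling summarize(".git") from the top of a work tree prints one line
per temporary pack plus the total, which is roughly what I would have
liked to see before the lock file error.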

-Vitaly


>
> On Wed, Dec 18, 2024 at 06:19:06PM -0800, Boomman wrote:
>
> > D:\Platform>dir .git\objects\pack\tmp*
> >  Directory of D:\Platform\.git\objects\pack
> >
> > 12/18/2024  05:33 PM     7,367,032,832 tmp_pack_FG1inp
> > 12/18/2024  05:35 PM     3,787,194,368 tmp_pack_IFvamY
> > 12/18/2024  05:39 PM     7,713,062,912 tmp_pack_khHCC9
> > 09/11/2024  11:33 AM     3,068,002,304 tmp_pack_XTVFUi
> >                4 File(s) 21,935,292,416 bytes
> >                0 Dir(s)         339,968 bytes free
> >
> > I believe that before trying to write *anything* to disk "git gc"
> > should try to take exclusive handles on these and wipe them, ideally
> > by default. The total size of these tmp* files is multiple times
> > larger than the repo I'm trying to compact, so if the command just did
> > this pre-cleaning I'd not have hit this problem once I cleaned enough
> > disk space.
>
> git-gc does know how to clean up these files, but they are subject to
> the same mtime grace period that loose objects are. This is to avoid
> deleting a file that is being actively used by a simultaneous process.
>
> Try "git gc --prune=now" if you know there are no other active processes
> in the repository.
>
> We usually prune things after finishing the repack. So if you're running
> out of disk space to repack, there might be a chicken-and-egg problem.
> You can run "git prune" manually in that case.
>
> Possibly git-gc should prune first for this reason, but I'd be hesitant
> to do so for actual loose objects. It's a little weird that tempfile
> cleanup is lumped in with loose object cleanup, and is mostly
> historical. Possibly those should be split.
>
> -Peff



