Re: Git as data archive

Hi Thomas,
I committed it only to a local repository (git add . / git commit
-m"..."). But I never tested restoring the big files from the archive
:( and only then stumbled upon the bug description.

Now I tried it out and something bad happened:
E:\bilder_git>git checkout -- Hochzeitsmesse.mp4
error: bad object header
fatal: packed object 5c1403a85829c1c9e03bf04ac814d65bb72b617f (stored in
.git/objects/pack/pack-00246783dc8e6b7365220e75563b5cecfa358e11.pack) is
corrupt

During add / commit there was no problem, but now this is not a good
thing...
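
A problem like this can sometimes be caught before a restore is needed by running git's own integrity check on the repository. The following is a minimal sketch under stated assumptions: `check_repo` is a hypothetical helper name, and whether `git fsck` reports this particular large-file corruption on Windows is not confirmed anywhere in this thread.

```shell
# Hypothetical helper: report object storage size, then verify all objects.
# Usage: check_repo /path/to/repo
check_repo() {
    repo=$1
    # Print the number and size of loose and packed objects, human-readable.
    git -C "$repo" count-objects -vH &&
    # Verify connectivity and validity of the objects, including packed ones.
    git -C "$repo" fsck --full
}
```

A non-zero exit status from `check_repo` means one of the two commands failed, which is a hint to test a real checkout of the affected files before relying on the archive.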

My C skills are not bad - I worked for about 10 years in embedded SW
development. But currently my time is limited, as I have a 3-week-old
baby :)

All the best,
Andreas



On 09.12.2019 at 02:18, Thomas Braun wrote:
On 08.12.2019 19:44, Andreas Kalz wrote:

Hi Andreas,

@Thomas: are you Thomas Braun who studied at FH Regensburg?
nope, sorry.

Well, currently the .git repository is 715GB and the maximum file size
is 9.5GB, but I did not get error messages due to that even if the
performance is quite low. The biggest pack* file is 24GB. There are some
files which are modified, but most are not modified.
Okay, that is kind of large. How did you add the 9.5GB file? AFAIK this
could not have been done on Windows.

Do you push that to a remote repository as well?

My question came up because I did not find any documentation about the
limits of git, only a lot of entries about GitHub and forum users
discussing old bugs in git. I read about git-lfs and also that it is not
very stable, so I have not used it yet.
Although I'm not using git-lfs myself, from what I know it works well.
But it does have the same limitation as stock Git for Windows, as Philip
pointed out already.

How can the delta compression settings and/or the big file threshold
limits be modified?
These are plain git config settings. Have a look at [1]. The attributes
are explained in [2-3]. Basically you can set in .gitattributes

*.bin -delta -diff

which would tell git that files with suffix bin should not be delta
compressed and are always binary.

You could also play around with turning compression completely off via
core.compression or pack.compression.
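
As a rough sketch of how the settings mentioned above could be applied from the command line: `tune_repo` is a hypothetical helper name, and the values `100m` and `0` are illustrative assumptions rather than recommendations from this thread. The config keys themselves are documented in git-config, and attributes in .gitattributes are separated by whitespace.

```shell
# Hypothetical helper: apply large-file-oriented settings to one repository.
# Usage: tune_repo /path/to/repo
tune_repo() {
    repo=$1
    # Files larger than this are stored without attempting delta compression.
    git -C "$repo" config core.bigFileThreshold 100m &&
    # 0 turns zlib compression off; core.compression is the general default,
    # pack.compression applies to objects written into packfiles.
    git -C "$repo" config core.compression 0 &&
    git -C "$repo" config pack.compression 0 &&
    # Never delta-compress *.bin files and treat them as binary when diffing.
    printf '%s\n' '*.bin -delta -diff' >> "$repo/.gitattributes"
}
```

These settings only affect objects written after the change; existing packfiles stay as they are until the repository is repacked.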

Hope that helps,
Thomas

PS: If you have resources to help fix that long-standing bug in Git
for Windows, there is a PR open [4] which has a WIP version. But beware:
you need good C skills and better-than-average git skills, or a
Santa-Claus-style bag of monetary resources.

[1]:
https://git-scm.com/docs/git-config#Documentation/git-config.txt-corebigFileThreshold
[2]: https://git-scm.com/docs/gitattributes#_code_delta_code
[3]: https://git-scm.com/docs/gitattributes#_marking_files_as_binary
[4]: https://github.com/git-for-windows/git/pull/2179

On 07.12.2019 at 19:04, Thomas Braun wrote:
On 07.12.2019 17:54, Philip Oakley wrote:
Hi Andreas,

On 06/12/2019 18:54, Andreas Kalz wrote:
Hello,
I am using git as archive and versioning also for photos. Apart from
performance issues, I wanted to ask if there are hard limits and
configurable limits (how to configure?) for maximum single file size
and
maximum .git archive size (Windows 64 Bit system)?
Thanks in advance for your answer.
All the best,
Andreas
In Git the file size is currently limited to the size of `long`, rather than
`size_t`. Hence on Git for Windows the size limit is 32 bits, ~4GiB.
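
To make the limit concrete, here is a small arithmetic sketch. It assumes the relevant type is a 32-bit unsigned `long`, as on 64-bit Windows, and `fits_in_ulong32` is a hypothetical helper name used only for illustration.

```shell
# Largest value a 32-bit unsigned long can hold, in bytes (2^32 - 1).
ULONG32_MAX=$(( (1 << 32) - 1 ))

# Hypothetical helper: succeeds if a size in bytes fits in 32 bits.
fits_in_ulong32() {
    [ "$1" -le "$ULONG32_MAX" ]
}

size=9500000000   # roughly the 9.5GB file mentioned in this thread
if fits_in_ulong32 "$size"; then
    echo "$size bytes fits in a 32-bit unsigned long"
else
    echo "$size bytes exceeds the 32-bit limit of $ULONG32_MAX bytes"
fi
```

On platforms where `long` is 64 bits wide, such as 64-bit Linux, the same size is far below the limit.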

Any change will be a big change as it ripples through many places in the
code base and, for some, will feel 'wrong'. I did some work [1-4] on top
of those of many others that was almost there, but...
Adding to what Philip said: on Windows the size of exported archives
(git archive) is currently also limited to 4GB. The reason is again
the long vs size_t issue (which is not present on Linux, though).

So if you can switch to Linux or even MacOSX these issues are gone.

The number of files in .git (only the number of packfiles would be of
interest here, I guess) does not have the long vs size_t issue. So
packfiles can be larger than 4GB on 64bit Windows (with 64bit git of
course).

But depending on how large the biggest files are, it might be worth
tweaking some of the settings, so that the created packfiles are
readable on all platforms. I once created a repo on Linux which could
not be checked out on Windows, and that is a bit annoying.

So the questions are how large is each file? And what repository size do
you expect? Are we talking about 20MB files and 10GB repository? Or a
factor 100 more? And are you just adding files or are you modifying the
added files? Depending on the file sizes it might then also be
beneficial to tweak the delta compression settings and/or the big file
threshold limits.
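
To answer the file-size question for an existing working tree, something like the following could be used. It is a sketch only: `list_files_over` is a hypothetical helper name, and it assumes a `find` that accepts the `c` (bytes) suffix for `-size`, as GNU and BSD find do.

```shell
# Hypothetical helper: list regular files larger than a byte limit,
# skipping the .git directory at the top of the given tree.
# Usage: list_files_over /path/to/worktree 4294967295
list_files_over() {
    dir=$1
    limit_bytes=$2
    find "$dir" -path "$dir/.git" -prune -o \
        -type f -size +"${limit_bytes}c" -print
}
```

Calling it with 4294967295 as the limit lists the files that would not fit in a 32-bit size, which are the ones affected by the Windows limitation discussed in this thread.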

Thomas

The alternative is git-lfs, which I don't personally use (see [4]).

Philip

[1] https://github.com/git-for-windows/git/pull/2179
[2] https://github.com/gitgitgadget/git/pull/115
[3] https://github.com/git-for-windows/git/issues/1063
[4] https://github.com/git-lfs/git-lfs/issues/2434





