Re: Git as data archive

On 08.12.2019 19:44, Andreas Kalz wrote:

Hi Andreas,

> @Thomas: are you Thomas Braun who studied at FH Regensburg?

nope, sorry.

> Well, currently the .git repository is 715GB and the maximum file size
> is 9.5GB, but I did not get error messages due to that even if the
> performance is quite low. The biggest pack* file is 24GB. There are some
> files which are modified, but most are not modified.

Okay, that is kind of large. How did you add the 9.5GB file? AFAIK this
could not have been done on Windows.

Do you push that to a remote repository as well?

> My question came up as I did not find a documentation about limits of
> git, only a lot of entries about github and forum users who are
> discussing about old bugs of git. I read about git-lfs and also that it
> is not working very stable, due to that I did not use it yet.

Although I'm not using git-lfs myself, from what I know it works well.
But on Windows it has the same limitation as stock Git for Windows, as
Philip already pointed out.

> How can the delta compression settings and/or the big filethreshold
> limits be modified?

These are plain git config settings. Have a look at [1]. The attributes
are explained in [2-3]. Basically you can set in .gitattributes

*.bin -delta -diff

which would tell git that files with the suffix .bin should not be delta
compressed and should always be treated as binary when diffing.
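
One way to check that such a line has the intended effect is to ask git
which attributes it would apply to a path. The following is only a rough
sketch: the scratch repository and the file name photo.bin are made-up
placeholders, and git check-attr does not need the file to exist.

```shell
# Scratch repository; directory and file names are placeholders.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# gitattributes separates attributes with whitespace.
printf '*.bin -delta -diff\n' > .gitattributes

# Ask git which of the two attributes apply to a matching path.
attrs=$(git check-attr delta diff -- photo.bin)
echo "$attrs"
```

Both attributes should be reported as "unset" for the path, which is how
git displays a "-attr" entry.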

You could also play around with turning compression completely off via
core.compression or pack.compression.
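
As a rough sketch of what setting these could look like (the scratch
repository and the 100m threshold are assumptions picked for illustration,
not recommendations for your repository):

```shell
# Scratch repository so the example does not touch real data.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Files above this size are stored without attempting delta compression.
# git's default is 512m; 100m is an arbitrary example value.
git config core.bigFileThreshold 100m

# zlib level: 0 disables compression, 9 is the slowest, -1 is zlib's default.
git config core.compression 0
git config pack.compression 0

# Read the values back to confirm what is stored in .git/config.
threshold=$(git config --get core.bigFileThreshold)
level=$(git config --get pack.compression)
echo "bigFileThreshold=$threshold pack.compression=$level"
```

These settings affect objects and packs written afterwards; packs that
already exist are not rewritten just because the config changed.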

Hope that helps,
Thomas

PS: If you have resources to help fix that long-standing bug in Git
for Windows, there is a PR open [4] which has a WIP version. But beware:
you need good C skills and better-than-average git skills, or a
Santa-Claus-style bag of monetary resources.

[1]: https://git-scm.com/docs/git-config#Documentation/git-config.txt-corebigFileThreshold
[2]: https://git-scm.com/docs/gitattributes#_code_delta_code
[3]: https://git-scm.com/docs/gitattributes#_marking_files_as_binary
[4]: https://github.com/git-for-windows/git/pull/2179

> On 07.12.2019 at 19:04, Thomas Braun wrote:
>> On 07.12.2019 17:54, Philip Oakley wrote:
>>> Hi Andreas,
>>>
>>> On 06/12/2019 18:54, Andreas Kalz wrote:
>>>> Hello,
>>>> I am using git as archive and versioning also for photos. Apart from
>>>> performance issues, I wanted to ask if there are hard limits and
>>>> configurable limits (how to configure?) for maximum single file size
>>>> and
>>>> maximum .git archive size (Windows 64 Bit system)?
>>>> Thanks in advance for your answer.
>>>> All the best,
>>>> Andreas
>>> In Git the file size is currently limited to the size of `long`, rather
>>> than `size_t`. Hence on Git for Windows, where `long` is 32 bits, the
>>> size limit is ~4GiB.
>>>
>>> Any change will be a big change as it ripples through many places in the
>>> code base and, for some, will feel 'wrong'. I did some work [1-4] on top
>>> of those of many others that was almost there, but...
>> Adding to what Philip said. On Windows the size of exported archives
>> (git archive) is currently also limited to 4GB. The reason being also
>> the long vs size_t issue (which is not present on linux though).
>>
>> So if you can switch to Linux or even MacOSX these issues are gone.
>>
>> The number of files in .git (only the number of packfiles would be of
>> interest here, I guess) does not have the long vs size_t issue. So
>> packfiles can be larger than 4GB on 64bit Windows (with 64bit git of
>> course).
>>
>> But depending on how large the biggest files are, it might be worth
>> tweaking some of the settings, so that the created packfiles are
>> readable on all platforms. I once created a repo on Linux which could
>> not be checked out on Windows, and that is a bit annoying.
>>
>> So the questions are how large is each file? And what repository size do
>> you expect? Are we talking about 20MB files and 10GB repository? Or a
>> factor 100 more? And are you just adding files or are you modifying the
>> added files? Depending on the file sizes it might then also be
>> beneficial to tweak the delta compression settings and/or the big file
>> threshold limits.
>>
>> Thomas
>>
>>> The alternative is git-lfs, which I don't personally use (see [4]).
>>>
>>> Philip
>>>
>>> [1] https://github.com/git-for-windows/git/pull/2179
>>> [2] https://github.com/gitgitgadget/git/pull/115
>>> [3] https://github.com/git-for-windows/git/issues/1063
>>> [4] https://github.com/git-lfs/git-lfs/issues/2434
>>>
>>>
> 
> 



