On Mon, 11 Feb 2008, Jakub Narebski wrote:

> On Sun, 10 Feb 2008, Sean wrote:
> > On Sun, 10 Feb 2008 00:22:09 -0500 (EST)
> > Nicolas Pitre <nico@xxxxxxx> wrote:
> >
> >> Finding out what those huge objects are, and if they actually need to be
> >> there, would be a good thing to do to reduce any repository size.
> >
> > Okay, I've sent the sha1's of the top 500 to Jan for inspection.  It appears
> > that many of the largest objects are automatically generated i18n files that
> > could be regenerated from source files when needed rather than being checked
> > in themselves; but that's for the OO folks to decide.
>
> Good practice is to not add generated files to version control.
> But sometimes such files are stored if regenerating them is costly
> (./configure file in some cases, 'man' and 'html' branches in git.git).
>
> IIRC Dana How tried also to deal with repository with large binary
> files in repo, although in that case those had shallow history.  IIRC
> the proposed solution was to pack all such large objects undeltified
> into separate "large-objects" kept pack.

That was to solve a completely different problem which wasn't about
space saving, but rather to save on 'git push' latency.


Nicolas