Re: Configuring git to forget removed files

Hi Andy,

On Sat, Feb 20, 2010 at 5:37 AM, Andrew Benton <b3nton@xxxxxxxxx> wrote:
> I have a project that I store in a git repository. It's a bunch of source
> tarballs and some bash scripts to compile it all. Git makes it easy to
> distribute any changes I make across the computers I run. The problem I have
> is that over time the repository gets ever larger. When I update to a newer
> version of something I git rm the old tarball but git still keeps a copy and
> the folder grows ever larger. At the moment the only solution I have is to
> periodically rm -rf .git and start again. This works but is less than ideal
> because I lose all the history for my build scripts.
>
> What I would like is to be able to tell git to not keep a copy of anything
> that has been git rm. The build scripts never get removed, only altered so
> their history would be preserved. Is it possible to make git delete its backup
> copies of removed files?

I don't know if I can really speak to your hoped-for conclusion,
although `git filter-branch` is where you want to look for rewriting
history.  Be warned, though: it's an entirely impractical solution if
your repo is at all public, because rewriting history breaks sharing;
everyone who has cloned would have to re-clone from scratch.
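For the record, a rough sketch of what that rewrite might look like
(the tarball name foolib-1.0.tar.gz is purely hypothetical; adjust to
taste, and try it on a spare clone first):

```shell
# Sketch only: rewrite every ref so the (hypothetical) tarball
# foolib-1.0.tar.gz never existed, then drop git's backup refs and
# repack so the disk space is actually reclaimed.
# WARNING: this rewrites history; anyone with a clone must re-clone.
git filter-branch --index-filter \
    'git rm --cached --ignore-unmatch foolib-1.0.tar.gz' \
    --prune-empty -- --all
rm -rf .git/refs/original
git reflog expire --expire=now --all
git gc --aggressive --prune=now
```

The `--ignore-unmatch` flag keeps `git rm` from failing on commits
where the file doesn't exist, and `--prune-empty` drops commits that
become empty once the tarball is gone.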

That being said, have you thought about changing your repo strategy?
IMHO, storing binary blobs that change with any regularity in _any_
SCMS is a problem waiting to happen.  It's different if you have
assets that are fairly stable, like images for a system's UI or
dependencies that have been frozen, but that doesn't sound like your
situation.

As a thought, why not keep the tarballs in a separate directory
outside the repository and have the repo track only a symlink like
'foolib-latest' pointing at whatever tarball your project currently
depends on?  You could back that directory up with a tool like rsync
(since you obviously aren't concerned with keeping history there)
rather than git, decide arbitrarily how many backups to keep, and note
which version of each file went with which commit in your repo.  The
main problem I see with that is that you lose a lot of the advantages
of having an SCMS: you can't reliably check out a previous commit and
build it, at least not without some very serious effort.

Another possible solution, if you maintain the sources that generate
the tarballs, is to treat the tarballs as artifacts of the build
rather than as assets managed by the SCMS.  That way you might spend
more time during each build, but your repo would be much cleaner and
would have the added advantage of being completely buildable at every
commit point.
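If you go that route, the .gitignore is trivial; something along these
lines (the patterns are just illustrative):

```
# .gitignore sketch: track only sources and build scripts; generated
# tarballs and build output stay out of the repo entirely.
*.tar.gz
*.tar.bz2
build/
```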

Anyway, just food for thought.


-- 

In Christ,

Timmy V.

http://burningones.com/
http://five.sentenc.es/ - Spend less time on e-mail
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
