Gelonida <gelonida@xxxxxxxxx> writes:
> Jakub Narebski wrote:
>> Gelonida <gelonida@xxxxxxxxx> writes:
>>
>>> We have a git repository, whose size we want to reduce drastically due
>>> to frequent clone operations and a slow network connection.
>>
>> Why frequent *clone* operations, instead of using "git fetch" or
>> equivalent ("git pull" which is fetch+merge, or "git remote update")?
>
> The clone is part of the deployment process and would IIRC be equivalent
> to a 'svn export'
> Almost certainly one can also improve this, but this should probably
> be discussed in another thread.
>
> The sequence on some remote hosts is.
> - git clone tag dirname
> - rm -rf dirname/.git
> - tar cvfz dirname.tgz dirname

Why not simply (after enabling the 'upload-archive' service in git-daemon
if you serve via git:// URL, and probably similar in the case of SSH
access management by gitosis or gitolite)

  $ git archive --remote=<repo> <tag>

(where <repo> is <dirname> in your example)?

>>> The idea is the following:
>>>
>>> * archive the git repository just in case we really have to go back in
>>>   history.
>>>
>>> * create a new git repository, which shall only contain last month's
>>>   activity.
>>>
>>> * all changes before should be squashed together.
>>>   It would be no problem if the very first commit remains unmodified.
>>
>> If you want to simply _remove_ history before specified commit,
>> instead of squashing it, the best solution would be to use grafts to
>> cauterize (cut) history, check using [graphical] history viewer that
>> you cut it correctly, and then use git-filter-branch to make this
>> cut permanent.
>
> This sounds exactly as what I'd like to do.
> I used "git gui" => "Visualize All Branch History" to choose a nice
> single cutoff point.
> I just didn't know how to apply the cut.

You can read about grafts in the git-filter-branch(1) manpage, in the
gitrepository-layout(5) description of the git repository layout, and in
gitglossary(7), the git glossary.

In short, each line in .git/info/grafts consists of the SHA-1 id of a
commit, followed by a space-separated list of its effective (grafted)
parents. A line that lists no parents turns that commit into a root
commit. So to cut history at e.g. commit a3eb250f996bf5e, you need to put
a line containing only this SHA-1 in the .git/info/grafts file, e.g.:

  $ git rev-parse --verify a3eb250f996bf5e >> .git/info/grafts

> So the command to look for is git-filter-branch, right ?
> I'll read the doc.

As you would see in the git-filter-branch(1) documentation, a simple

  $ git filter-branch -- --all

(no filter) would make the history described by grafts permanent.

Note that this rewrites history, and you would make it (much) harder
on any contributor who based his/her work on commits from before the
"rebase".

>> You can later use grafts or refs/replace/ mechanism to join "current"
>> history with historical repository.
>
> Probably we wont need this, but this sounds rather interesting and is
> good to know.

Grafts were for example used to fuse (join) the current and historical
Linux kernel repositories, after the Linux kernel moved from BitKeeper to
Git. The 'git-replace' mechanism is meant as a modern, transferable and
safe replacement for the grafts file.

--
Jakub Narebski
Poland
ShadeHawk on #git
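
For reference, the graft-and-rewrite steps described above can be sketched
end to end on a throwaway repository. This is only an illustration under
stated assumptions: the temporary directory, the demo identity, the five
dummy commits and the choice of HEAD~2 as the cutoff are all made up here,
and FILTER_BRANCH_SQUELCH_WARNING is only relevant on newer Git versions
that print a warning before filter-branch runs. Try it on a copy before
touching a real repository.

```shell
#!/bin/sh
# Sketch: cut history at a chosen commit in a throwaway repository.
set -e

# Hypothetical scratch area; nothing here touches an existing repository.
demo=$(mktemp -d)
cd "$demo"
git init -q repo
cd repo
git config user.name "Demo User"
git config user.email "demo@example.invalid"

# Build five commits so there is some history to cut.
for i in 1 2 3 4 5; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

# Pick the cutoff: the third commit should become the new root.
cutoff=$(git rev-parse --verify HEAD~2)

# A graft line with a commit id and no parents makes that commit parentless.
mkdir -p .git/info
echo "$cutoff" >> .git/info/grafts

# Rewrite all refs so that the grafted history becomes the real history.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -- --all >/dev/null 2>&1

# The cut is now part of the rewritten commits; the graft file can go.
rm .git/info/grafts

# Count the commits that remain reachable from HEAD.
count=$(git rev-list --count HEAD)
echo "commits after cut: $count"
```

The old commits stay reachable through the refs/original/ backup refs that
filter-branch creates, so the repository only shrinks once those refs and
the reflog are expired and the repository is repacked, or once it is cloned
afresh.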