Re: Is cp -al safe with git?

On Thu, 16 Nov 2006, Johannes Sixt wrote:
>
> For one reason or another I would like to "clone" a local repo including the
> checked-out working tree with cp -al instead of cg-clone/git-clone, i.e.
> have all files hard-linked instead of copied.

It works, but I don't think you should depend on it.

> Can the copies be worked on independently without interference (with the git
> tool set)?

We _tried_ to make sure it is ok, but since it's not a normal mode of 
operation, I would not guarantee it.

> One thing I noticed is that git-reset or probably git-checkout-index breaks
> links of files that need not be changed by the reset.

Yes and no. They do _not_ actually break links of files that they know 
stay the same, but your example breaks the internal knowledge by using 
that "cp -al". Creating the extra hard links changes the inode change time 
(ctime) of the files, so git thinks that the files _may_ have changed, and 
when you do a "git reset", it will overwrite them all.
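
To see the stat() change for yourself, here is a minimal sketch. It assumes 
GNU coreutils "stat" (the -c format option), and the temporary directory and 
file names are made up for illustration. Adding one hard link, which is what 
"cp -al" does for every file, should leave the mtime alone but bump the ctime 
and the link count:

```shell
# Sketch only: assumes GNU stat; paths and variable names are illustrative.
tmp=$(mktemp -d)
echo foo > "$tmp/b"
mtime_before=$(stat -c %Y "$tmp/b")   # modification time, seconds since epoch
ctime_before=$(stat -c %Z "$tmp/b")   # inode change time, seconds since epoch
sleep 1                               # make the change visible at 1s resolution
ln "$tmp/b" "$tmp/b.link"             # add a second name for the same inode
mtime_after=$(stat -c %Y "$tmp/b")
ctime_after=$(stat -c %Z "$tmp/b")
links_after=$(stat -c %h "$tmp/b")    # hard link count
echo "mtime: $mtime_before -> $mtime_after"
echo "ctime: $ctime_before -> $ctime_after"
echo "links: $links_after"
```

The index remembers the old ctime, so it no longer matches what stat() 
reports for the file.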

> Example:
> 
> # make 2 files, commit
> $ mkdir orig && cd orig
> $ git-init-db 
> defaulting to local storage area
> $ echo foo > a && cp a b && git-add a b && git-commit -a -m 1
> Committing initial tree 99b876dbe094cb7d3850f1abe12b4c5426bb63ea
> 
> # 2nd commit modifies only one file:
> $ echo bar > a && git-commit -a -m 2
> 
> # create the copy:
> $ cd ..
> $ cp -al orig copy
> $ cd copy
> 
> # working files are hard-linked:
> $ ls -l
> total 8
> -rw-r--r-- 2 jsixt users 4 Nov 16 19:24 a
> -rw-r--r-- 2 jsixt users 4 Nov 16 19:23 b
> 
> # nuke a commit:
> $ git-reset --hard HEAD^
> $ ls -l
> total 8
> -rw-r--r-- 1 jsixt users 4 Nov 16 19:24 a
> -rw-r--r-- 1 jsixt users 4 Nov 16 19:24 b
> 
> I'd have expected that the hard-link of b remained and only a's link were
> broken. Does it mean that git-reset writes every single file also for large
> trees like the kernel? I cannot believe this. Can someone scratch the
> tomatoes off my eyes please?

If you do a

	git update-index --refresh

(or, more easily, a "git status", which will do the index refresh for you) 
before you do the "git reset", you will get:

	$ ls -l
	total 8
	-rw-r--r-- 1 jsixt users 4 Nov 16 19:24 a
	-rw-r--r-- 2 jsixt users 4 Nov 16 19:24 b

like you want. The reason "git reset" overwrites _both_ files in your 
example is that the stat() information for those files changed, so "git 
reset" thinks they are both dirty and both need to be rewritten.
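
Put together, the whole sequence might look like the sketch below. It is a 
rough illustration, not a tested recipe: it assumes git, GNU "cp -al" and GNU 
"stat -c" are available, uses current command spellings ("git init", "git 
add") rather than the dashed ones in the quoted example, and the user name 
and email are placeholders.

```shell
# Sketch only: assumes git, GNU cp (-al) and GNU stat (-c) are installed.
tmp=$(mktemp -d) && cd "$tmp"
git init -q orig && cd orig
git config user.name "Example"                   # placeholder identity
git config user.email "example@example.invalid"
echo foo > a && cp a b && git add a b && git commit -q -m 1
echo bar > a && git commit -q -a -m 2            # 2nd commit touches only a
cd .. && sleep 1 && cp -al orig copy && cd copy  # hard-linked copy
git update-index -q --refresh   # re-sync the index with the new stat() data
git reset -q --hard HEAD^       # nuke a commit; only a has to be rewritten
links_a=$(stat -c %h a)
links_b=$(stat -c %h b)
echo "a: $links_a link(s), b: $links_b link(s)"
```

The point of the refresh is that git compares the file contents once, sees 
they are unchanged, and records the new stat() data before the reset runs.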

That said, I would seriously suggest that you try these things out, and 
realize that most people do _not_ use the hardlinked approach. For all I 
know, some piece of git might change some files in-place. I don't _think_ 
we do, and it would strictly speaking be a bug, but because people don't 
use it that way, you'd be the guinea pig.

I think we'll happily fix any bugs you find, but that may not make you any 
happier if the bug corrupted your life's work ;)

In general, you might want to use

	git clone -l -s

instead, but that will _not_ hardlink the actual checked-out contents, so 
it's not going to get the kind of sharing you are looking for. On the other hand, 
especially with good maintenance (doing "git repack -l -d -a" etc), you 
may end up sharing _more_ that way at least in the repository object 
database (but never in the actual checked-out directories).
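
As a rough sketch of that alternative (again assuming git and GNU "stat -c" 
are installed; the directory names and the placeholder identity are made up 
for illustration), the shared clone borrows objects through an alternates 
file while the checked-out files are ordinary, unlinked copies:

```shell
# Sketch only: assumes git and GNU stat (-c); names are illustrative.
tmp=$(mktemp -d) && cd "$tmp"
git init -q orig
(cd orig &&
 git config user.name "Example" &&
 git config user.email "example@example.invalid" &&
 echo foo > a && git add a && git commit -q -m 1)
git clone -q -l -s orig copy        # share the object database, copy nothing
alternates=$(cat copy/.git/objects/info/alternates)
worktree_links=$(stat -c %h copy/a) # checked-out file is a plain copy
(cd copy && git repack -q -l -d -a) # pack only objects local to the clone
echo "objects borrowed from: $alternates"
echo "copy/a has $worktree_links link(s)"
```

Because the clone depends on the original's object database, removing or 
pruning the original can break the clone.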

		Linus