Jeff King <peff@xxxxxxxx> writes:

> On Thu, Jul 24, 2008 at 06:41:03PM +0100, Johannes Schindelin wrote:
>
>> > As a user, I would expect "sparse clone" to also be sparse on the
>> > fetching. That is, to not even bother fetching tree objects that we are
>> > not going to check out. But that is a whole other can of worms from
>> > local sparseness, so I think it is worth saving for a different series.
>>
>> I think this is not even worth of a series. Sure, it would have benefits
>> for those who want sparse checkouts. But it comes for a high price on
>> everyone else:
>
> I agree there are a lot of issues. I am just thinking of the person who
> said they had a >100G repository. But I am also not volunteering to do
> it, so I will let somebody who really cares about it try to defend the
> idea.

I think sparse fetch is a lot worse than grafts and shallow clones, which
are already bad. These are all ways to introduce local inconsistency at
the object level and pretend everything is OK, but the latter two do so
only at commit boundaries, which makes them somewhat more manageable
(though we still do not handle them very well). With sparse fetch, you
cannot even guarantee the integrity of individual commits, because
subtrees are missing here and there.

I do think a sparse checkout that says "I'll have the whole tree in the
index but the work tree will have only these paths checked out" makes
sense. You do not need a fully populated work tree to create commits or
merges -- the only absolute minimum you need is a fully populated index.

In that sense, I think "protect index entries outside of these paths" (I
remember that the first round of this series was done around that notion)
is the wrong mentality for handling this. We should think of this more
like "you still populate the index with the whole tree, and you are free
to update its entries in any way you want, but we do not touch the work
tree outside these areas".

This has a few ramifications:

 - If the user can somehow check out a path outside the "sparse" area, it
   is perfectly fine for the user to edit and "git add" it. Such a method
   of checking out a path outside the "sparse" area is a way to widen the
   "sparse" area the user originally set up;

 - When the user runs "merge", and it needs to present the user a working
   tree file because of conflicts at the file level, the user has to agree
   to widen the "sparse" area before the merge can do so. One way to do
   this is to refuse and fail the merge (and then the user needs to use
   that "unspecified way" of widening the "sparse" area first). Another
   way would be to automatically widen the "sparse" area to include such
   conflicting paths.

 - And you would want to narrow it down after you do such a widening.

For many projects that have src/ and doc/ (git.git being one of them), it
is perfectly valid for a code person and a doc person to work in tandem.
In such a project, after the code person makes changes only to the src/
area in her sparsely checked out repository and pushes the results out,
the doc person would run "git pull && git log -p ORIG_HEAD.." and update
the documentation in his sparsely checked out repository that has only
the doc/ area. The two parts are tied together and they advance more or
less in sync. I think sparse checkout would be a useful feature to help
such a configuration.

Having said that, I think this can easily be misused as a CVS style "one
CVSROOT houses millions of totally unrelated projects" layout. In CVS,
the layout is perfectly fine because the system does not track changes at
anything higher than the level of individual files, but when you naïvely
map the layout to a system with tree-wide atomic commits, such as git, it
will defeat the whole point of using such a system.
The paces at which these millions of unrelated projects advance have no
relationship with each other, but by tying them together in the same
top-level tree, the layout introduces an unnecessary ordering between
their commits.
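
As a rough illustration of the sparse checkout model described above (the
whole tree in the index, the work tree limited to a "sparse" area, and the
two ways a merge could react to a conflict outside that area), here is a
minimal Python sketch. It is a toy model and not git code: SparseRepo,
MergeRefused, widen, narrow and merge_conflicts are hypothetical names
invented for this illustration, and the model ignores uncommitted work
tree edits.

```python
from dataclasses import dataclass, field


class MergeRefused(Exception):
    """A conflicted path lies outside the sparse area and widening was not allowed."""


@dataclass
class SparseRepo:
    # Hypothetical toy model; these are not git's real data structures.
    index: dict        # path -> content; always holds the whole tree
    sparse_area: set   # directory prefixes or exact paths that are checked out
    worktree: dict = field(default_factory=dict)

    def __post_init__(self):
        self._refresh()

    def in_area(self, path):
        # A path is inside the area if it equals an entry or lives under it.
        return any(
            path == entry or path.startswith(entry.rstrip("/") + "/")
            for entry in self.sparse_area
        )

    def _refresh(self):
        # The work tree only ever contains paths inside the sparse area;
        # the index is never trimmed.
        self.worktree = {p: c for p, c in self.index.items() if self.in_area(p)}

    def widen(self, entry):
        self.sparse_area.add(entry)
        self._refresh()

    def narrow(self, entry):
        self.sparse_area.discard(entry)
        self._refresh()

    def merge_conflicts(self, conflicted, auto_widen):
        # Conflicts need a work tree file, so paths outside the area force a choice.
        outside = [p for p in conflicted if not self.in_area(p)]
        if outside and not auto_widen:
            raise MergeRefused(outside)       # policy 1: refuse and fail the merge
        for path in outside:
            self.sparse_area.add(path)        # policy 2: widen to the conflicting path
        self._refresh()
        return outside


repo = SparseRepo(
    index={"src/main.c": "code", "doc/git.txt": "docs"},
    sparse_area={"doc/"},
)
print(sorted(repo.worktree))   # only doc/ is checked out
print(sorted(repo.index))      # the index still lists every path
```

The two branches in merge_conflicts correspond to the two options in the
list above: failing the merge until the user widens the area by hand, or
widening it automatically to cover the conflicting paths. Calling narrow
afterwards drops the extra path from the work tree again while leaving
the index untouched.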