On Wed, 27 Sep 2017 15:09:43 -0400
Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> wrote:

> By adding it to the set of provisionally omitted objects, we
> have the option to capture a little extra information with it
> and refer to that the next time we see the object in the traversal.
> For example, in the sparse-checkout case, the first time we see the
> object we know the pathname and know that it does not need to be
> included. The second time we see that object, we can see if the
> new pathname is the same as the previous one with a simple strcmp
> and avoid the expensive is_excluded_from_list() computation. Keep
> in mind that rev-list or pack-objects could be called on something
> like HEAD~100000..HEAD or that there may be 50,000 tips. So a file
> that doesn't change across that range will be visited many times
> with the same {pathname, sha}.

Ah, capturing the extra information makes sense. I missed that detail.

> Right now I want to force the tree to be shown the first time it is
> visited (because I don't want to do tree filtering yet). I don't mark
> it SEEN yet because we may want to revisit blobs within (say, after a
> folder rename like I described previously).
>
> I do, however, mark the tree object as SEEN (in the _END event) when I
> can verify that I've included ALL of the children.

This optimization makes sense too.

> So it might be possible that I could change the flags and not use
> FILTER_REVISIT on tree objects, I hesitate to do that right now.

You're probably right that we need some sort of flag on tree objects,
and FILTER_REVISIT can do the job. (My suggestion SHOWN plays a similar
role anyway.)

> Having the FILTER_REVISIT flag on blob objects means I can avoid
> doing a hash/oidset lookup on subsequent visits.

By the hash/oidset lookup, I presume you mean the lookup on the set of
provisionally omitted objects? If yes, this makes sense.

Thanks for your clarifications - I'll take another look at the code
here.
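
To check my understanding of the pathname caching described in the first quoted paragraph, here is a rough standalone sketch. It is not taken from the patch: `struct omit_cache`, `should_omit()`, `expensive_is_excluded()` (a stand-in for is_excluded_from_list()), the integer `id` (a stand-in for the object's SHA) and the "keep/" rule are all hypothetical names and assumptions made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ENTRIES 64 /* arbitrary bound for this sketch */

struct omit_entry {
	unsigned id;  /* hypothetical stand-in for the object's SHA */
	char *path;   /* pathname recorded when the object was provisionally omitted */
};

struct omit_cache {
	struct omit_entry entries[MAX_ENTRIES];
	size_t nr;
	unsigned long expensive_calls; /* counts calls to the slow check */
};

/* Hypothetical stand-in for the expensive is_excluded_from_list():
 * in this sketch, everything outside "keep/" is excluded. */
static int expensive_is_excluded(struct omit_cache *c, const char *path)
{
	c->expensive_calls++;
	return strncmp(path, "keep/", 5) != 0;
}

static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);
	if (copy)
		memcpy(copy, s, len);
	return copy;
}

static struct omit_entry *find_entry(struct omit_cache *c, unsigned id)
{
	for (size_t i = 0; i < c->nr; i++)
		if (c->entries[i].id == id)
			return &c->entries[i];
	return NULL;
}

/* Returns 1 if the object is (still) provisionally omitted, 0 if it
 * must be included under this pathname. */
static int should_omit(struct omit_cache *c, unsigned id, const char *path)
{
	assert(path != NULL);
	struct omit_entry *e = find_entry(c, id);

	/* Same {pathname, id} as last time: a strcmp replaces the slow check. */
	if (e && e->path && !strcmp(e->path, path))
		return 1;

	if (!expensive_is_excluded(c, path)) {
		/* Needed after all: drop it from the provisional set. */
		if (e) {
			free(e->path);
			*e = c->entries[--c->nr];
		}
		return 0;
	}

	/* Omitted under a new pathname: remember it for the next visit. */
	if (e) {
		free(e->path);
		e->path = dup_string(path);
	} else if (c->nr < MAX_ENTRIES) {
		c->entries[c->nr].id = id;
		c->entries[c->nr].path = dup_string(path);
		c->nr++;
	}
	return 1;
}
```

With a zero-initialized cache, visiting the same id twice under "docs/a.txt" runs the slow check only once; visiting it again under "keep/a.txt" runs it a second time and removes the entry. In the real traversal I assume the object would then be marked SEEN and not revisited, and the linear scan here would be an oidset or hash lookup instead.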