Re: [PATCH] checkout: most of the time we have good leading directories

Thomas Rast <tr@xxxxxxxxxxxxx> writes:

> Junio C Hamano <gitster@xxxxxxxxx> writes:
>
>> When "git checkout" wants to create a path, e.g. a/b/c/d/e, after
>> seeing if the entire thing already exists (in which case we check if
>> that is up-to-date and do not bother to check it out, or we unlink
>> and recreate it), we validate that the leading directory path is
>> without funny symlinks by seeing a/, a/b/, a/b/c/ and then a/b/c/d/
>> are all without funny symlinks, by calling has_dirs_only_path() in
>> this order.
>>
>> When we are checking out many files (imagine: initial checkout),
>> however, it is likely that an earlier checkout would have already
>> made sure that the leading directory a/b/c/d/ is in good order; by
>> checking the whole path a/b/c/d/ first, we can often bypass
>> calls to has_dirs_only_path() for the leading part.
>
> Naively one would think that this is just as much work -- to correctly
> verify that the path consists only of actual directories (not symlinks)
> we have to lstat() every component regardless.  It seems the reason this
> is an optimization is that has_dirs_only_path() caches its results, so
> that we can get 'a/b/c/d/ is okay in every component' from the cache.
>
> Is this analysis correct?  If so, can you spell that out in the commit
> message?

It was done without analysis ;-) but I think you are correct.

If you are checking out a/b/c/d/{m,a,n,y}, after you checked out
a/b/c/d/m, the has_dirs_only_path cache knows a/b/c/d/ is in good
order so when you check out a/b/c/d/{a,n,y}, we can just ask for
a/b/c/d/ and get an OK immediately.  There is no point asking about
a/, a/b/, a/b/c/ and then a/b/c/d/, in the original pessimistic
order.  A change done _right_ to properly optimize this might even
want to change the main loop that the patch bypassed.
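
To make the a/b/c/d/{m,a,n,y} case concrete, here is a rough Python
sketch that counts the work done by each ordering.  It is only a toy
under stated assumptions: DirsOnlyCache, check_pessimistic,
check_whole_first and count_work are hypothetical names, the cache is
a plain set of verified prefixes rather than whatever the real
has_dirs_only_path() cache keeps, and the fallback merely re-checks
the prefixes instead of creating missing directories.

```python
import os
import stat
import tempfile


class DirsOnlyCache:
    """Toy stand-in for a has_dirs_only_path()-style cache (assumed design)."""

    def __init__(self, root):
        self.root = root
        self.verified = set()  # prefixes already known to be real directories
        self.queries = 0       # number of has_dirs_only_path() calls
        self.lstat_calls = 0   # number of lstat() calls that reached the OS

    def has_dirs_only_path(self, rel):
        """True if every component of rel is a real directory, not a symlink."""
        self.queries += 1
        prefix = ""
        for part in rel.strip("/").split("/"):
            prefix = part if not prefix else prefix + "/" + part
            if prefix in self.verified:
                continue  # answered from the cache, no lstat() needed
            self.lstat_calls += 1
            try:
                mode = os.lstat(os.path.join(self.root, prefix)).st_mode
            except FileNotFoundError:
                return False
            if not stat.S_ISDIR(mode):  # lstat() reports a symlink as a link
                return False
            self.verified.add(prefix)
        return True


def check_pessimistic(cache, leading):
    """Original order: ask about a/, a/b/, a/b/c/ and then a/b/c/d/."""
    parts = leading.strip("/").split("/")
    return all(cache.has_dirs_only_path("/".join(parts[:i]))
               for i in range(1, len(parts) + 1))


def check_whole_first(cache, leading):
    """Patched order: ask about the whole leading path before anything else."""
    if cache.has_dirs_only_path(leading):
        return True
    return check_pessimistic(cache, leading)  # simplified fallback


def count_work(check, leading="a/b/c/d", names=("m", "a", "n", "y")):
    """Check the leading path once per file; return (queries, lstat_calls)."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, leading))
        cache = DirsOnlyCache(root)
        for _ in names:
            assert check(cache, leading)
        return cache.queries, cache.lstat_calls
```

In this toy both orderings end up issuing the same number of lstat()
calls, because the cache absorbs the repeats either way; what the
whole-path-first order saves is the per-prefix cache queries for
every file after the first one in a directory.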

I do not think the patch (or the "change done right" for that
matter) will make much difference on a platform with good filesystem
metadata caching. It may be very interesting to see if that simple
patch makes any difference on Windows, though. If it does, then we
may want to look into cleaning up the code further.

Thanks for the comment.



