Re: [PATCH/RFC 1/4] Optimised, faster, more effective symlink/directory detection

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Junio C Hamano <gitster@xxxxxxxxx> writes:

> Let me see if I understand the logic behind this caching....
> ...
>> +		last_slash = match_len;
>> +		cache_path[last_slash] = '\0';
>> +
>> +		if (lstat(cache_path, &st)) {
>> +			ret_flags = LSTAT_LSTATERR;
>> +			if (errno == ENOENT || errno == ENOTDIR)
>> +				ret_flags |= LSTAT_NOTDIR;
>
> If you tested "A/B" here and got ENOENT back, you know "A/B" does not
> exist; you cache this knowledge as "A/B is not a directory" (I also think
> you could use it as a cached knowledge that "A exists and is a directory".
> I am not sure if you are taking advantage of that).
>
> What I do not understand about this code is the ENOTDIR case.  If you
> tested "A/B" and got ENOTDIR back, what you know is that "A" is not a
> directory (if the path tested this round were deeper like "X/Y/A/B", you
> know "X/Y/A" is not a directory, and you know "X" and "X/Y" are true
> directories; otherwise the loop would have exited before this round when
> you tested "X" or "X/Y" in the earlier rounds).
>
> So as far as I can think of, ENOENT case and ENOTDIR case would give you
> different information (ENOENT would say "A is a dir, A/B is not"; ENOTDIR
> would say "A is not a dir").  I am confused how you can cache the same
> path and same flag between these two cases here.

I take 1/4 of the above back.

If everything is working correctly, I do not think you will ever have a
situation where you check "A/B" here and get ENOTDIR back.  lstat("A/B")
would yield ENOTDIR if "A" is not a directory, but:

 (1) If you did test "A" in the earlier round in the loop, you would have
     already found it is not a directory and would have exited the loop
     with LSTAT_ERR.  You cannot test "A/B" in such a case;

 (2) If you did not test "A" in the loop, that has to be because you had a
     cached information for it, and the caching logic would have returned
     early if "A" was a non-directory without entering the loop; in other
     words, you would test "A/B" here without testing "A" in the loop only
     when the cache says "A" was a directory.  You cannot get ENOTDIR in
     such a case.

In any case, my main puzzlement still stands.  I think ENOTDIR case should
behave differently from ENOENT case.

And I think it is an indication of a grave error, either this code is
racing with an external process that is actively mucking with the work
tree to make your cached information unusable, or somebody forgot to call
clear_lstat_cache().

Hmm?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux