Re: [RFC/PATCH 0/1] cygwin: Remove the Win32 l/stat() functions

On 06/27/2013 12:35 AM, Jeff King wrote:
> On Wed, Jun 26, 2013 at 10:45:48PM +0100, Ramsay Jones wrote:
> 
>>> This patch adds some *extra* cache invalidation that was heretofore
>>> missing.  If stat() is broken it could
>>>
>>> (a) cause a false positive, resulting in some unnecessary cache
>>> invalidation and re-reading of packed-refs, which will hurt performance
>>> but not correctness; or
>>>
>>> (b) cause a false negative, in which case the stale cache might be used
>>> for reading (but not writing), just as was *always* the case before this
>>> patch.
>>>
>>> As far as I understand, the concern for cygwin is (a).  I will leave it
>>> to others to measure and/or decide whether the performance loss is too
>>> grave to endure until the cygwin stat() situation is fixed.
>>
>> Hmm, I'm not sure I understand ... However, I can confirm that the
>> 'mh/ref-races' branch in next is broken on cygwin. (i.e. it is not
>> just a speed issue; it provokes fatal errors).
> 
> I think Michael's assessment above is missing one thing.

Peff is absolutely right; for some reason I had assumed that the
consistency check had already been fixed.

> However, when we have taken a lock on the file, there is an additional
> safety measure: if we find the file is changed, we abort, as that should
> never happen (it means somebody else modified the file while we had it
> locked). But of course Cygwin's false positive here triggers the safety
> valve, and we die without even realizing that nothing changed.
> 
> In theory we can drop the safety valve; it should never actually happen.
> But I'd like to keep it there for working systems. Perhaps it is worth
> doing something like this:
> 
> [...#ifdef out consistency check on cygwin when lock is held...]

Yes, this would work.

But, taking a step back, I think it is a bad idea to have an unreliable
stat() masquerading as a real stat().  If we want to allow the use of an
unreliable stat for certain purposes, let's have two stat() interfaces:

* the true stat() (in this case I guess cygwin's slow-but-correct
implementation)

* some fast_but_maybe_unreliable_stat(), which would map to stat() on
most platforms but might map to the Windows stat() on cygwin when so
configured.

By default the true stat() would always be used.  It should have to be a
conscious decision, taken only in specific, vetted scenarios, to use the
unreliable stat.
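
To make the split concrete, here is a minimal sketch of what the two
interfaces might look like. It is only an illustration under assumptions:
use_fast_stat and platform_fast_stat() are hypothetical names, not existing
git code, and the "fast" path simply defers to stat() so that the sketch
builds anywhere.

```c
#include <sys/stat.h>

/* Hypothetical switch; a real build might set this from configuration. */
int use_fast_stat = 0;

/*
 * Hypothetical stand-in for a platform shortcut (for example a
 * Win32-based stat on cygwin). Here it just calls stat() so that the
 * sketch stays portable.
 */
int platform_fast_stat(const char *path, struct stat *st)
{
	return stat(path, st);
}

/* The true stat(): always the reliable implementation. */
int true_stat(const char *path, struct stat *st)
{
	return stat(path, st);
}

/*
 * Opt-in variant: identical to stat() unless the fast path has been
 * explicitly enabled. Callers must tolerate occasional wrong answers.
 */
int fast_but_maybe_unreliable_stat(const char *path, struct stat *st)
{
	if (use_fast_stat)
		return platform_fast_stat(path, st);
	return stat(path, st);
}
```

The point of the separate name is that every use of the unreliable variant
is visible at the call site and can be reviewed on its own merits.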

For example, I can't imagine that checking the freshness of the index or
of the packed-refs file is ever going to be a bottleneck, so there is no
reason at all to use an unreliable stat() here.

On the other hand, stat() definitely seems to be a bottleneck when
testing for changes in a working tree of 100,000 files, and here
occasional mistakes might be considered acceptable.  So for this purpose
the unreliable stat() might be used.
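
As a rough sketch of how the two kinds of call site could differ, assuming
a hypothetical file_changed() helper that takes the stat function as a
parameter (the field comparison is deliberately simplified and the names
are illustrative, not git's actual code):

```c
#include <sys/stat.h>

typedef int (*stat_fn)(const char *path, struct stat *st);

/* Reliable variant: a thin wrapper around stat(). */
int true_stat(const char *path, struct stat *st)
{
	return stat(path, st);
}

/* Hypothetical fast variant; it defers to stat() here for portability. */
int fast_but_maybe_unreliable_stat(const char *path, struct stat *st)
{
	return stat(path, st);
}

/*
 * Return 1 if the file differs from the cached snapshot (or can no
 * longer be stat'ed), 0 otherwise.
 */
int file_changed(stat_fn fn, const char *path, const struct stat *cached)
{
	struct stat now;

	if (fn(path, &now) < 0)
		return 1;
	return now.st_mtime != cached->st_mtime ||
	       now.st_size != cached->st_size ||
	       now.st_ino != cached->st_ino;
}

/* A single critical file: always use the true stat(). */
int packed_refs_is_stale(const char *path, const struct stat *cached)
{
	return file_changed(true_stat, path, cached);
}

/* A bulk scan where speed matters and rare mistakes are tolerable. */
int count_changed_paths(const char **paths, const struct stat *cached, int n)
{
	int i, changed = 0;

	for (i = 0; i < n; i++)
		changed += file_changed(fast_but_maybe_unreliable_stat,
					paths[i], &cached[i]);
	return changed;
}
```

Only count_changed_paths() opts in to the unreliable variant; the
packed-refs check cannot pick it up by accident.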

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/