Hi Karsten, On 2015-09-28 12:39, Karsten Blees wrote: > Different git variants record file times in the index with different > precisions, according to their capabilities. E.g. git compiled with NO_NSEC > records seconds only, JGit records the mtime in milliseconds, but leaves > ctime blank (because ctime is unavailable in Java). > > This causes performance issues in git compiled with USE_NSEC, because index > entries with such 'incomplete' timestamps are considered dirty, triggering > unnecessary content checks. > > Add a file time comparison function that auto-detects the precision based > on the number of trailing 0 digits, and compares with the lower precision > of both values. This initial version supports the known precisions seconds > (git + NO_NSEC), milliseconds (JGit) and nanoseconds (git + USE_NSEC), but > can be easily extended to e.g. microseconds. > > Use the new comparison function in both dirty and racy checks. As a side > effect, this fixes racy detection in USE_NSEC-enabled git with > core.checkStat=minimal, as the coreStat setting now affects racy checks as > well. > > Finally, do not check ctime if ctime.sec is 0 (as recorded by JGit). Great analysis, and nice patch. I would like to offer one suggestion in addition: > diff --git a/read-cache.c b/read-cache.c > index 87204a5..3a4e6cd 100644 > --- a/read-cache.c > +++ b/read-cache.c > @@ -99,23 +99,50 @@ void fill_stat_data(struct stat_data *sd, struct stat *st) > sd->sd_size = st->st_size; > } > > +/* > + * Compares two file times. Returns 0 if equal, <0 if t1 < t2, >0 if t1 > t2. > + * Auto-detects precision based on trailing 0 digits. Compares seconds only if > + * core.checkStat=minimal. > + */ > +static inline int cmp_filetime(uint32_t t1_sec, uint32_t t1_nsec, > + uint32_t t2_sec, uint32_t t2_nsec) { > +#ifdef USE_NSEC > + /* > + * Compare seconds and return result if different, or checkStat=mimimal, > + * or one of the time stamps has second precision only (nsec == 0). > + */ > + int diff = t1_sec - t2_sec; > + if (diff || !check_stat || !t1_nsec || !t2_nsec) > + return diff; > + > + /* > + * Check if one of the time stamps has millisecond precision only (i.e. > + * the trailing 6 digits are 0). First check the trailing 6 bits so that > + * we only do (slower) modulo division if necessary. > + */ > + if ((!(t1_nsec & 0x3f) && !(t1_nsec % 1000000)) || > + (!(t2_nsec & 0x3f) && !(t2_nsec % 1000000))) > + /* Compare milliseconds. */ > + return (t1_nsec - t2_nsec) / 1000000; > + > + /* Compare nanoseconds */ > + return t1_nsec - t2_nsec; > +#else > + return t1_sec - t2_sec; > +#endif > +} As this affects only setups where the same repository is accessed via clients with different precision, would it make sense to hide this behind a config option? I.e. something like static int cmp_filetime_precise(uint32_t t1_sec, uint32_t t1_nsec, uint32_t t2_sec, uint32_t t2_nsec) { #ifdef USE_NSEC return t1_sec != t2_sec ? t1_sec - t2_sec : t1_nsec - t2_nsec; #else return t1_sec - t2_sec; #endif } static int cmp_filetime_mixed(uint32_t t1_sec, uint32_t t1_nsec, uint32_t t2_sec, uint32_t t2_nsec) { #ifdef USE_NSEC ... detect lower precision and compare with lower precision only... #else return t1_sec - t2_sec; #endif } static (int *)cmp_filetime(uint32_t t1_sec, uint32_t t1_nsec, uint32_t t2_sec, uint32_t t2_nsec) = cmp_filetime_precise; ... modify cmp_filetime_precise if core.mixedTimeSpec = true... Otherwise there would be that little loop-hole where (nsec % 1000) == 0 *by chance* and we assume the timestamps to be identical even if they are not. Ciao, Dscho -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html