Stefan Beller <sbeller@xxxxxxxxxx> writes: > The pathspec mechanism is extended via the new > ":(attr:eol=input)pattern/to/match" syntax to filter paths so that it > requires paths to not just match the given pattern but also have the > specified attrs attached for them to be chosen. I was looking at preload-index.c and its history again, because I wanted to ensure that Lars's process filter stuff introduces no unexpected interactions with getting multi-threaded, similar to the problem we had in the earlier incarnations of this step, which we worked around with <xmqqshwvvaxq.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxx> "pathspec: disable preload-index when attribute pathspec magic is in use". I think that Lars's series is safe, because the only thing among what preload-index.c::preload_thread() does that can go outside the simple stat data and pattern matching with the pathname is the racy-git check in ie_match_stat(), and the object store access in that call was explicitly disabled by 7c4ea599b0 ("Fix index preloading for racy dirty case", 2008-11-17) long time ago. By passing CE_MATCH_RACY_IS_DIRTY option to the call, this caller effectively says "A cache entry whose stat data matches may be actually dirty when the timestamp is racy, in which case we usually compare data to determine if it really is clean, but it is OK to err on the safe and lazy side by declaring it dirty and not marking it up-to-date while we are preloading. Do not bother to go to the object store". The reason why I am bringing this up in this discussion thread on this patch is because I wonder if we would benefit by a similar "let's not do too involved things and be cheap by erring on the safe and lazy side" strategy in the call to ce_path_match() call made in this function to avoid making calls to the attr subsystem. In other words, would it help the system by either simplifying the processing done or reducing the cycle spent in preload_thread() if we could tell ce_path_match() "A pathspec we are checking may require not just the pattern to match but also attributes given to the path to satisfy the criteria, but for the purpose of preloading, pretend that the attribute satisfies the match criteria" (or "pretend that it does not match"), thereby not having to make any call into the attribute subsystem at all from this codepath? The strategy this round takes to make it unnecessary to punt preloading (i.e. dropping "pathspec: disable preload-index when attribute pathspec magic is in use" patch the old series had) is to make the attribute subsystem thread-safe. But that thread-safety in the initial round is based on a single Big Attribute Lock, so it may turn out that the end result performs better for this codepath if we did not to make any call into the attribute subsystem. I am probably being two step ahead of ourselves by saying the above, which is just something to keep in mind as a possible solution if performance in this preload codepath becomes an issue when the pathspec has attributes match specified.