On 01/23, Junio C Hamano wrote: > Brandon Williams <bmwill@xxxxxxxxxx> writes: > > > The last big hurdle towards a thread-safe API for the attribute system > > is the reliance on a global attribute stack that is modified during each > > call into the attribute system. > > The same comment as 22/27 applies here. > > It is not an immediate problem we need to solve in the scope of this > series, in the sense that a Big Subsystem Lock for the attribute > subsystem around git_check_attr() function can make it thread-safe. > > But if we want to make it truly threadable without a Big Subsystem > Lock, this and the other one would need to become per-thread at > least. I think the check_all_attrs scoreboard, which is the topic > of 22/27, should become per git_check_attr() invocation (immediately > before making a call to collect_some_attrs(), prepare an array with > map.size elements and use that as a scoreboard, for example). I do > not think we can be sure that the "slimmed down attr stack" 15/27 > envisions would help performance without benchmarking, but if it > does, then the "attr stack that holds entries that are relevant to > the current query" would have to become per <thread, check> pair, as > two threads may be executing the same codepath looking for the same > set of attributes (i.e. sharing a single attr_check instance), but > working on two different parts of a tree structure. > > > This patch removes this global stack and instead a stack is stored > > locally in each attr_check instance. This opens up the opportunity for > > future optimizations to customize the attribute stack for the attributes > > that a particular attr_check struct is interested in. > > This is still true. But two threads hitting the same attr_check > would make the stack thrash between the paths they are working on to > hurt performance once we go multi-threaded. > > Perhaps, provided if the "slimmed down attr stack" is indeed a good > idea, we should keep the global hashmap that holds everything we > read from .gitattributes tree-wide (i.e. as in your v1), _and_ > introduce a mechanism to keep the slimmed down version that is > relevant to check[] for each thread somehow. Sounds good, I'll reintroduce the hashmap of stacks that I had in v1 and instead make the all_attrs array that is used the in collection process allocated at invocation time. That will cause a bit of allocation churn but in reality shouldn't make that much of an impact. As we discussed off-line I'll also do the rework to break up the question and result. That way two threads can be executing using the same attr_check structure. -- Brandon Williams