Re: [PATCH v3 00/27] Revamp the attribute system; another round

Brandon Williams <bmwill@xxxxxxxxxx> · Thu, 9 Feb 2017 09:18:04 -0800

On 02/02, Junio C Hamano wrote:
> 	prepare the <question, answer> array
> 	for path taken from some set:
> 		do_something(the array, path)
> 
> That way, do_something() do not have to keep allocating,
> initializing and destroying the array.
> 
> But after looking at the current set of codepaths, before coming to
> the conclusion that we need to split the static part that is
> specific to the callsite for git_check_attr() and the dynamic part
> that is specific to the <callsite, thread> pair, I noticed that
> typically the callers that can prepare the array before going into
> the loop (which will eventually be spread across multiple threads)
> are many levels away in the callchain, and they are not even aware
> of what attributes are going to be requested in the leaf level
> helper functions.  In other words, the approach to hoist "the
> <question, answer> array" up in the callchain would not scale.  A
> caller that loops over paths in the index and check them out does
> not want to know (and we do not want to tell it) what exact
> attributes are involved in the decision convert_to_working_tree()
> makes for each path, for example.

This was something that I was envisioning as well, though I didn't dig
very deep into the call stack.  Another means of doing this could be to
have the attr_check structure allocated and then have it configured at a
later point for the particular question being asked:

  alloc struct attr_check c;
  ... many call sites down
  configure(c, questions)
  for path
    do_something(c, path)

That also allows the same structure to be reused (just reconfigured) if
different attributes are needed at a later point in time.  Of course
this is just an idea and I'm not sure if this is the best way to do it
either.

> 
> So how would we split questions and answers in a way that is not
> naive and inefficient?  
> 
> I envision that we would allow the attribute subsystem to keep track
> of the dynamic part, which will receive the answers, holds working
> area like check_all_attr[], and has the equivalent to the "attr
> stack", indexed by <thread-id, callsite> pair (and the
> identification of "callsite" can be done by using the address of the
> static part, i.e. the array of questions that we initialize just
> once when interning the list of attribute names for the first time).
> 
> The API to prepare and ask for attributes may look like:
> 
> 	static struct attr_static_part Q;
> 	struct attr_dynamic_part *D;
> 
> 	attr_check_init(&Q, "text", ...);
> 	D = git_attr_check(&Q, path);
> 
> where Q contains an array of interned attributes (i.e. questions)
> and other immutable things that is unique to this callsite, but can
> be shared across multiple threads asking the same question from
> here.  As an internal implementation detail, it probably will have a
> mutex to make sure that init will run only once.
> 
> Then the implementation of git_attr_check(&Q, path) would be:
> 
>     - see if there is already the "dynaic part" allocated for the
>       current thread asking the question Q.  If there is not,
>       allocate one and remember it, so that it can be reused in
>       later calls by the same thread; if there is, use that existing
>       one.
> 
>     - reinitialize the "dynamic part" as needed, e.g. clear the
>       equivalent to check_all_attr[], adjust the equivalent to
>       attr_stack for the current path, etc.  Just like the current
>       code optimizes for the case where the entire program (a single
>       thread) will ask the same question for paths in traversal
>       order (i.e. falling in the same directory), this will optimize
>       for the access pattern where each thread asks the same
>       question for paths in its traversal order.
> 
>     - do what the current collect_some_attrs() thing does.
> 
> And this hopefully won't be as costly as the naive and inefficient
> one.

I agree, this sort of implementation wouldn't suffer from the same
allocation penalty that the naive implementation suffers from.  This
would be slightly challenging to ensure that there aren't any memory
leaks, well not leaks but rather memory that isn't freed.  i.e. When a
thread terminates we would want to reclaim the memory used for the
dynamic part which is stored inside the attribute system.

> 
> The reason why I was pushing hard to split the static part and the
> dynamic part in our redesign of the API is primarily because I didn't
> want to update the API callers twice.  But I'd imagine that your v3
> (and your earlier "do not discard attr stack, but keep them around,
> holding their tips in a hashmap for quick reuse") would at least lay
> the foundation for the eventual shape of the API, let's bite the
> bullet and accept that we will need to update the callers again
> anyway.
> 
> Thanks.
> 

At least v3 gets the attribute system to a state where further
improvements should be relatively easy to make.  And now as long as each
thread has a unique attr_check structure, multiple callers can exist
inside the attribute system at the same time.  There is still more work
to be done on it though.  Still my biggest complaint is the "direction"
aspect of the system.  I would love to also eliminate that as global
state at some point though I'm not sure how at this point.

-- 
Brandon Williams