Re: [PATCH 25/27] attr: store attribute stacks in hashmap

Brandon Williams <bmwill@xxxxxxxxxx> · Wed, 18 Jan 2017 12:34:40 -0800

On 01/13, Junio C Hamano wrote:
> Brandon Williams <bmwill@xxxxxxxxxx> writes:
> 
> > The last big hurdle towards a thread-safe API for the attribute system
> > is the reliance on a global attribute stack that is modified during each
> > call into the attribute system.
> >
> > This patch removes this global stack and instead a stack is retrieved or
> > constructed locally.  Since each of these stacks is only used as a
> > read-only structure once constructed, they can be stored in a hashmap
> > and shared between threads.
> 
> Very good.
> 
> The reason why the original code used a stack was because it wanted
> to keep only the info read from releavant files in-core, discarding
> ones from files no-longer relevant (because the traversal switched
> to another subdirectory of the same parent directory), to avoid the
> memory consumption grow unbounded.  It probably was a premature
> "optimization" that we can do without, so keeping everything we have
> read so far in a hashmap (which is my understanding of what is going
> on in this patch) is probably OK.
> 
> I suspect that this hashmap may eventually need to become per
> attr_check if we want to follow through the optimization envisioned
> by patch 15/27.
> 
> Inside fill(), path_matches() is called for the number of match_attr
> in the entire attribute stack but it is wasteful to check if the
> path matches with the a.u.pat if none of the a.state[] entries talk
> about attributes and macros that are eventually get used by the
> caller of check_attr().  By introducing a wrapping structure, 15/27
> wanted to make sure that we have a place to store a "reduced"
> attribute stack that is kept per attr_check that has only entries
> from the files that talk about the attributes the particular
> attr_check wants to learn about.
> 
> I need to think about this a bit more, but I do not offhand think
> that it makes future such enhancement to make it per-check harder to
> move from a global stack to a global hashmap, i.e. the above is not
> an objection to this step.

If we want to continue through and do the optimization you originally
envisioned then I may need to rethink this patch.  One thing we did talk
about offline was doing another check prior to the path_match() function
call which looks through the list of state structs to see if one of
those states would actually have an affect on the array being used to
collect attributes.  Though that may be an optimization which can be
done in addition to creating a reduced stack.

The one difficulty (which you pointed out in comment form) is if we have
a reduced attribute stack that is stored per attr_check then handling
the cleanup when the direction is changed may be messy.

-- 
Brandon Williams