On Fri, Oct 14, 2016 at 8:37 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Junio C Hamano <gitster@xxxxxxxxx> writes:
>
>> *1* Would we need a wrapping struct around the array of results?
>
> By the way, I do see a merit on the "check" side (tl;dr: but I do
> not think "result" needs it, hence I do not see the need for the
> "ugly" variants).

So we'd rather go with const char **result instead of our own new
struct there. Ok, got it.

>
> Take "archive" for example.  For each path, it wants to see the
> attribute "export-ignore" to decide if it is to be omitted.  In
> addition, the usual set of attributes used to smudge blobs into the
> working tree representation are inspected by the convert.c API as
> part of its implementation of convert_to_working_tree().  This
> program has at least two sets of <"check", "result"> that are used
> by two git_check_attr() callsites that are unaware of each other.
>
> One of the optimizations we discussed is to trim down the attr-stack
> (which caches the attributes read from .gitattributes files that are
> in effect for the "last" directory that has the path for which
> attributes are queried for) by reading/keeping only the entries that
> affect the attributes the caller is interested in.  But when there
> are multiple callsites that are interested in different sets of
> attributes, we obviously cannot do such an optimization without
> taking too much cache-invalidation hit.  Because these callsites are
> unaware of each other, I do not think we can say "keep the
> entries that affect the union of all active callsites" very easily,
> even if it were possible.
>
> But we could tie this cache to "check", which keeps a constant
> subset of attributes that the caller is interested in (i.e. each
> callsite would keep its own cache that is useful for its query).
> While we are single-threaded, "struct git_attr_check" being a
> wrapping struct around the array of "what attributes are of
> interest?" is a good place to add that per-check attr-stack cache.
> When we go multi-threaded, the attr-stack cache must become
> per-thread, and needs to be moved to per-thread storage, and such a
> per-thread storage would have multiple attr-stacks, one per "check"
> instance (i.e. looking up the attr-stack may have to say "who/what
> thread am I?" to first go to the thread-local storage for the
> current thread, where a table of pointers to attr-stacks is kept and
> from there, index into that table to find the attr-stack that
> corresponds to the particular "check").  We could use the address of
> "check" as the key into this table, but "struct git_attr_check" that
> wraps the array gives us another option to allocate a small
> consecutive integer every time initl() creates a new "check" and use
> it as the index into that attr-stack table, as that integer index
> can be in the struct that wraps the array of wanted attributes.
>
>     Note: none of the above is a suggestion to do the attr
>     caching exactly the way described.  The above is primarily
>     to illustrate how a wrapping struct may give us future
>     flexibility without affecting a single line of code in the
>     user of API.
>
> It may turn out that we do not need to have anything other than the
> array of wanted attributes in the "check" struct, but unlike
> "result", "check" is shared across threads, and does not have to live
> directly on the stack, so we can prepare for flexibility.
>
> I do not foresee a similar need for wrapping struct for "result",
> and given that we do want to keep the option of having them directly
> on the stack, I am inclined to say we shouldn't introduce one.
>
> If we were still to do the wrapping for result, I would say that
> basing it around the FLEX_ARRAY idiom, i.e.
>
>> struct git_attr_result {
>>         int num_slots;
>>         const char *value[FLEX_ARRAY];
>> };
>
> is a horrible idea.  It would be less horrible if it were
>
>         struct git_attr_result {
>                 int num_slots;
>                 const char **value;
>         };

So that is a const char ** plus a number of slots. All we would do
with that number is compare it to the check's number of slots and
die("BUG:...") on a mismatch, which is just a burden and no help.

>
> then make the API user write via a convenience macro something like
> this
>
>         const char *result_values[NUM_ATTRS_OF_INTEREST];
>         struct git_attr_result result = {
>                 ARRAY_SIZE(result_values), result_values
>         };
>
> instead.  That way, at least the side that implements git_check_attr()
> would not have to be type-unsafe like the example of ugliness in the
> message I am following-up on.

Ok, I will reroll with the plain const char ** instead of the macro
stuff that I came up with (which would be type safe, though uglier
than the pure variant).
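
For illustration only, here is a rough, single-threaded sketch of the
shape discussed above: a wrapping struct on the "check" side that
carries a small consecutive integer usable as a table index, and a
plain const char ** on the "result" side that the caller can keep on
its stack. Every name in it (check_sketch, check_initl_sketch,
check_attr_sketch, lookups_per_check, MAX_CHECKS_SKETCH) is made up
for this example and is not the actual attr API, and the ".o" rule is
a toy stand-in for reading .gitattributes.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CHECKS_SKETCH 16

/* Hypothetical stand-in for "struct git_attr_check". */
struct check_sketch {
	int id;            /* small consecutive integer handed out at init */
	int nr;            /* number of attributes of interest */
	const char **name; /* the wanted attribute names */
};

/* Stand-in for the per-check attr-stack cache: one slot per check id. */
static int lookups_per_check[MAX_CHECKS_SKETCH];
static int next_check_id; /* single-threaded sketch, so no locking */

/* initl()-style constructor; the name list must end with a NULL pointer. */
struct check_sketch *check_initl_sketch(const char *first, ...)
{
	struct check_sketch *c;
	const char *p;
	va_list ap;
	int nr = 0, i;

	if (next_check_id >= MAX_CHECKS_SKETCH)
		return NULL;

	/* First pass: count the names up to the NULL terminator. */
	va_start(ap, first);
	for (p = first; p; p = va_arg(ap, const char *))
		nr++;
	va_end(ap);

	c = malloc(sizeof(*c));
	if (!c)
		return NULL;
	c->name = malloc((nr + 1) * sizeof(*c->name));
	if (!c->name) {
		free(c);
		return NULL;
	}

	/* Second pass: record the names. */
	va_start(ap, first);
	p = first;
	for (i = 0; i < nr; i++) {
		c->name[i] = p;
		p = va_arg(ap, const char *);
	}
	va_end(ap);

	c->nr = nr;
	c->id = next_check_id++;
	return c;
}

/*
 * Toy lookup: fills result[0..check->nr-1].  The only "rule" is that
 * paths ending in ".o" have export-ignore set; everything else is
 * left NULL (unspecified).  The caller sizes "result" to match "check".
 */
void check_attr_sketch(const char *path,
		       const struct check_sketch *check,
		       const char **result)
{
	size_t len = strlen(path);
	int is_obj = len >= 2 && !strcmp(path + len - 2, ".o");
	int i;

	assert(check->id >= 0 && check->id < MAX_CHECKS_SKETCH);
	lookups_per_check[check->id]++; /* index the per-check table by id */

	for (i = 0; i < check->nr; i++)
		result[i] = (is_obj && !strcmp(check->name[i], "export-ignore"))
			? "set" : NULL;
}
```

A caller would create the check once, e.g. with
check_initl_sketch("export-ignore", (const char *)NULL), and then pass
a stack array such as "const char *result[1];" to check_attr_sketch()
for each path. Anything added to the wrapping struct later (like the
id here) would not change that calling code.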