Re attr API further revamp

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I hate to be doing this, but we need yet another revamp to the attr
API that affects all the callers.

In the original design, a codepath that wants to check attributes
repeatedly for many paths (e.g. "convert" that wants to see what
crlf, ident, filter, eol and text attributes are set to for paths
that it is asked to munge the blob contents for) used to allocate

	static struct git_attr_check {
        	struct git_attr *attr;
                const char *value;
	} ccheck[NUM_ATTRS];

and populated the array when the first time the codepath was
entered, e.g.

	if (!ccheck[0].attr) {
        	for (i = 0; i < NUM_ATTRS; i++)
                	ccheck[i].attr = git_attr(...);
	}

and then make a call to the API to ask for these attributes in
ccheck[] for a path, i.e.

	git_attr_check(path, NUM_ATTRS, ccheck);

The API function will fill in the ccheck[].value fields and the
caller will read from there how each attribute is set for the path.

This was updated with recent jc/attr topic so that attr_check
structure can be enriched to keep _more_ state, such as the
pre-parsed representation of all the contents of the relevant
.gitattributes files, which currently is held in a program-side
singleton instance called attr_stack, partly in preparation for
Stefan's sb/pathspec-label topic that is expected to place a lot
heavier load to the attribute subsystem.

A caller of the API after that update would do more like

  	static struct git_attr_check *ccheck;

	if (!ccheck)
        	ccheck = git_attr_check_initl(...);
	git_attr_check(path, ccheck);
	for (i = 0; i < NUM_ATTRS; i++)
        	if (ccheck->check[i].value == ATTR_UNSET)
			... do the Unset thing ...

This however is not thread-safe for obvious two reasons.

 * Two threads can simultanously enter this section of the code and 
   end up initializing ccheck twice;

 * Even though ccheck[].attr are constant for the purpose of this
   codepath (i.e. all threads passing through are interested in
   checking the same set of attributes), ccheck[].value fields
   cannot be shared across simultaneous threads like the above
   snippet does.

So it appears that the final API visible from the callers needs to
be updated further, perhaps something like:


	static struct git_attr_check *ccheck;
	const char *values[NUM_ATTRS];

	git_attr_check_initl(&ccheck, "crlf", "ident", ..., NULL);
	git_attr_check(path, ccheck, values);

where git_attr_check_initl() would do the "if ccheck is NULL then
initialize it to an instance" atomically in threading environment,
and git_attr_check() returns its finding to values[] array the
calling thread exclusively owns, while sharing the input ccheck
and the list of attributes the call inquires among threads.

Also unlike the earlier plan, attr_stack aka "where in the directory
hierarchy are we asking attributes for?" will not be stored directly
in git_attr_check structure.  It is likely that a thread-local
structure will be introduced hidden behind this API (i.e. the
callers do not have to be aware of it), and attr_stack will be
registered in this thread-local structure keyed by &ccheck, so that
<thread, callsite> pair can have attr_stack instance of its own.

This is because a single attr_stack per ccheck will not work as an
optimization mechanism when multi-threaded.  Two threads may be
running the same convert() function, and they may be interested in
the same set of attributes (e.g. "crlf", "ident", etc.) but they
will be working on different parts of the tree, so having attr_stack
per <thread, callsite> would make more sense.

All of which will be quite a lot of work, though, so do not expect
that it will appear by the end of this week ;-)

Thanks.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]