Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> writes:

>> About naming.  Where else other than "tree" (in the "hierarchical
>> namespace" sense) context do you see pathspec?  Does the struct really
>> need to be called TREE_pathspec_list?
>
> Pathspecs are usually stored in a list, const char ** and I don't want
> to take the generic name "pathspec_list", unless we convert all to use
> this struct. Any suggestions of other names?

I was hoping [*1*] that in the longer term we would have a unified
machinery to handle pathspecs across ls-tree, ls-files, diff-tree,
diff-files and grep.  These commands more or less share the same idea of
what purpose a pathspec serves, but take quite different codepaths, and
the biggest problem is that some know globbing while others don't.

The problem arising from the two semantics [*2*] is visible when you run
"git add" with pathspec where add_files_to_cache() updates the index only
at changed paths (found using diff-files machinery, that implements "diff"
family of pathspec semantics to match only directory prefix) and then
add_files() adds the paths that are untracked (found using the ls-files
machinery, that knows about globbing).

Once we have a unified machinery to handle pathspec, the data structure
that holds the pathspec should naturally be called "struct pathspec",
while an element on that list would be "struct pathspec_elem" or perhaps
"struct pathspec_pattern" [*3*].

I would imagine that "struct exclude" (in dir.[ch]) that is contained in
"struct exclude_list" might be a good place to start from, in the sense
that it shows how a match pattern can be pre-parsed to optimize the
matching operation [*4*].  It however may not know about one particular
kind of optimization that is essential when dealing with tree objects: the
ability to ask "is this subtree worth descending into?".  That logic is
necessary to avoid opening unnecessary tree objects in diff-tree and grep.
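
To make the last two paragraphs concrete, here is a rough sketch of what a
pre-parsed pathspec and the "worth descending into?" question might look
like.  It is an illustration only, not existing git code: the field names
and both helper functions are hypothetical, the set of glob-special
characters is an assumption, and a subtree is kept whenever a wildcard
could still match (a conservative choice, so nothing is pruned wrongly).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout; only the two struct names come from the discussion. */
struct pathspec_elem {
	const char *match;     /* pattern as given on the command line */
	size_t len;            /* strlen(match), computed once */
	size_t nowildcard_len; /* length of the leading literal part */
};

struct pathspec {
	int nr;                      /* number of patterns; unordered set */
	struct pathspec_elem *items;
};

/* Pre-parse one pattern so that matching never has to rescan it. */
static void pathspec_elem_init(struct pathspec_elem *e, const char *match)
{
	e->match = match;
	e->len = strlen(match);
	/* Assumption: these four characters start a glob-special sequence. */
	e->nowildcard_len = strcspn(match, "*?[\\");
}

/*
 * Is the tree at "dir" (no trailing slash, "" for the top level) worth
 * descending into?  Returns 1 if some pattern could match a path below it.
 */
static int pathspec_worth_descending(const struct pathspec *ps, const char *dir)
{
	size_t dirlen = strlen(dir);
	int i;

	if (!ps->nr || !dirlen)
		return 1; /* empty pathspec or top level: always look inside */

	for (i = 0; i < ps->nr; i++) {
		const struct pathspec_elem *e = &ps->items[i];

		if (dirlen < e->nowildcard_len) {
			/* dir must be a leading directory of the literal part */
			if (!strncmp(e->match, dir, dirlen) &&
			    e->match[dirlen] == '/')
				return 1;
		} else if (!strncmp(dir, e->match, e->nowildcard_len)) {
			/* dir agrees with the whole literal part */
			if (e->nowildcard_len < e->len)
				return 1; /* a wildcard follows; be conservative */
			/* plain prefix: match only at a directory boundary */
			if (!e->len || e->match[e->len - 1] == '/' ||
			    dir[e->len] == '\0' || dir[e->len] == '/')
				return 1;
		}
	}
	return 0;
}
```

With the patterns "Documentation/technical/*.txt" and "t", a tree walker
using this check would open "Documentation" and "t/lib" but skip
"Documentation/howto" and "templates" without reading those tree objects.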

[Footnote]

*1* https://git.wiki.kernel.org/index.php/SoC2010Ideas#Unify_Pathspec_Semantics

*2* As the maintainer, I do also consider it a problem that each of these
semantics has multiple codepaths to implement it, but that is a secondary
issue.  Having two semantics is visible to the end user and is a bigger
problem.

*3* I personally tend to consider the whole set as _a_ "path
specification" even when you give more than one pattern to match on the
command line, but it may be just me.  I am Ok with "struct pathspec_set"
that holds a set of "struct pathspec", too.  Unlike exclude-list where the
order of elements in it has a meaning, the matching patterns in a pathspec
are unordered, so even if we may end up implementing it as a list, it
would be incorrect to call that "struct pathspec_list".

*4* "struct exclude_list" is somewhat special in that it is a dynamic data
source where you learn more matching rules as you dig deeper.  We do not
need that aspect of the dir.[ch] codepath for unified pathspec handling.