On 08/12/2015 11:57 PM, David Turner wrote:
> Instead of a linear search over common_list to check whether
> a path is common, use a trie. The trie search operates on
> path prefixes, and handles excludes.
>
> Signed-off-by: David Turner <dturner@xxxxxxxxxxxxxxxx>
> ---
>
> Probably overkill, but maybe we could later use it for making exclude
> or sparse-checkout matching faster (or maybe we have to go all the way
> to McNaughton-Yamada for that to be truly worthwhile).

Let's take a step back.

We have always had a ton of code that uses `git_path()` and friends to convert abstract things into filesystem paths. Let's take the reference-handling code as an example:

`git_path("refs/heads/master")` returns something like ".git/refs/heads/master", which happens to be the place where we would store a loose reference with that name. But in reality, "refs/heads/master" is a reference name, not a fragment of a path. It's just that the reference code knows that the transformation done by `git_path()` *accidentally* happens to convert a reference name into the name of the path of the corresponding loose reference file.

In fact, the reference code is even smarter than that. It knows that within submodules, `git_path()` does *not* do the right mapping. In those cases it calls `git_path_submodule()` instead.

But now we have workspaces, and things have become more complicated. Some references are stored in `$GIT_DIR`, while others are stored in `$GIT_COMMON_DIR`. Who should know all of these details?

The current answer is that the reference-handling code remains (mostly) ignorant of workspaces. It just stupidly calls `git_path()` (or `git_path_submodule()`) regardless of the reference name. It is `git_path()` that has grown the global insight to know which files are now stored in `$GIT_COMMON_DIR` vs `$GIT_DIR`. Now it helpfully transforms "refs/heads/master" into "$GIT_COMMON_DIR/refs/heads/master" but transforms "refs/worktree/foo" into "$GIT_DIR/refs/worktree/foo".
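
To make the two mappings above concrete, here is a rough sketch of the kind of decision `git_path()` now has to make. It is not git's actual code: the `sketch_*` names and the `has_prefix()` helper are hypothetical, and the rule (everything under "refs/" is common except "refs/worktree/") is a deliberate simplification that covers only the two examples given here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: does `name` start with `prefix`? */
int has_prefix(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/*
 * Simplified stand-in for the "is this path common?" decision.
 * Assumption for illustration only: names under "refs/" are shared,
 * except those under "refs/worktree/". Anything else is treated as
 * per-worktree. The real list of common paths is longer.
 */
int sketch_is_common(const char *name)
{
	if (has_prefix(name, "refs/worktree/"))
		return 0;
	return has_prefix(name, "refs/");
}

/*
 * Hypothetical "knows everything" path builder: the caller passes an
 * abstract name and this function picks the base directory for it.
 * Returns the value of snprintf(), so truncation can be detected.
 */
int sketch_git_path(char *buf, size_t size,
		    const char *git_dir, const char *common_dir,
		    const char *name)
{
	assert(buf && git_dir && common_dir && name);
	return snprintf(buf, size, "%s/%s",
			sketch_is_common(name) ? common_dir : git_dir,
			name);
}
```

Called with a made-up per-worktree directory as `git_dir` and a made-up shared directory as `common_dir`, the sketch puts "refs/heads/master" under `common_dir` and "refs/worktree/foo" under `git_dir`. The point is only that the routing knowledge lives inside the path function rather than in its callers.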
`git_path()` has developed similar insight into lots of other file types. IT KNOWS TOO MUCH. And because of that, it has become a lot more complicated and might even be a performance problem.

This seems crazy to me. It is the *reference* code that should know whether a particular reference should be stored under `$GIT_DIR` or `$GIT_COMMON_DIR`, or indeed whether it should be stored in a database. We should have two *stupid* functions, `git_workspace_path()` and `git_common_path()`, and have the *callers* decide which one to call.

The only reason to retain a knows-everything `git_path()` function is as a crutch for 3rd-party applications that think they are clever enough to grub around in `$GIT_DIR` at the filesystem level. But that should be highly discouraged, and we should make it our mission to provide commands that make it unnecessary.

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx