On Mon, Jan 25, 2021 at 9:42 AM Derrick Stolee via GitGitGadget <gitgitgadget@xxxxxxxxx> wrote: > > From: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > > When we have sparse directory entries in the index, we want to compare > that directory against sparse-checkout patterns. Those pattern matching > algorithms are built expecting a file path, not a directory path. This > is especially important in the "cone mode" patterns which will match > files that exist within the "parent directories" as well as the > recursive directory matches. > > If path_matches_pattern_list() is given a directory, we can add a bogus > filename ("-") to the directory and get the same results as before, > assuming we are in cone mode. Since sparse index requires cone mode > patterns, this is an acceptable assumption. Why is "-" a bogus filename? Is that only on certain operating systems, or are you just not expecting a user to name their file with such a bad name? What if there is a file with that name in that directory in the repository; do you need the pathname to be bogus? What do you mean by "get the same results as before"? The first paragraph suggests the code wouldn't handle a directory path, and that not handling it was problematic, so it seems unlikely you want the same results as that. But it's not clear what the "before" refers to here. > Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > --- > dir.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/dir.c b/dir.c > index ad6eb033cb1..c786fa98d0e 100644 > --- a/dir.c > +++ b/dir.c > @@ -1384,6 +1384,11 @@ enum pattern_match_result path_matches_pattern_list( > strbuf_addch(&parent_pathname, '/'); > strbuf_add(&parent_pathname, pathname, pathlen); > > + /* Directory requests should be added as if they are a file */ > + if (parent_pathname.len > 1 && > + parent_pathname.buf[parent_pathname.len - 1] == '/') Ah, this looks like a case where the trailing slash is helpful; without it, you might have to feed extra data in through the call hierarchy to signify that this is a directory entry. > + strbuf_add(&parent_pathname, "-", 1); > + > if (hashmap_contains_path(&pl->recursive_hashmap, > &parent_pathname)) { > result = MATCHED_RECURSIVE; hashmap_contains_path? Don't we already know (modulo special cases of our bogus value not quite being bogus enough) that this is false since we were adding a bogus path? How could the hashmap have a bogus value in it? Won't this particular call fail with or without our adding "-" to the end of the path? After this hashmap_contains_path() call, the subsequent code looks for the parent of the path by stripping off everything after the last '/'...which seems like the relevant code anyway. Is the problem that the hashmap_contains_path() call was returning true when we didn't add "-" to the end? If so, can we use and if or a goto instead to make the code skip this first check and move on to where we want it to go? Or am I misunderstanding something about this code?