On Tue, Feb 9, 2021 at 9:03 AM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Derrick Stolee <stolee@xxxxxxxxx> writes:
>
> >> +Note that when rename detection is on but both copy and break
> >> +detection are off, rename detection adds a preliminary step that first
> >> +checks files with the same basename. If files with the same basename
> >
> > I find myself wanting a definition of 'basename' here, but perhaps I'm
> > just being pedantic. A quick search clarifies this as a standard term [1]
> > of which I was just ignorant.
> >
> > [1] https://man7.org/linux/man-pages/man3/basename.3.html
> >
> >> +are sufficiently similar, it will mark them as renames and exclude
> >> +them from the later quadratic step (the one that pairwise compares all
> >> +unmatched files to find the "best" matches, determined by the highest
> >> +content similarity).
>
> While I do not think `basename` is unacceptably bad, we should aim
> to do better. For "direc/tory/hello.txt", both "hello.txt" or
> "hello" are what would come up to people's mind with the technical
> term "basename" (i.e. basename as opposed to dirname, vs basename as
> opposed to filename with .extension).
>
> Avoiding this ambiguity and using a word understandable by those not
> versed well with UNIX/POSIX lingo may be done at the same time,
> hopefully.
>
> For example, can we frame the description around this key sentence:
>
>     The heuristics is based on an observation that a file is often
>     moved across directories while keeping its filename the same.
>
> The term "filename" alone can be ambiguous (i.e. both "hello.txt"
> and "direc/tory/hello.txt" are valid interpretations in the earlier
> example), but in the context of a sentence that talks about "moved
> across directories", the former would become the only valid one. We
> can even say just "name" and there is no ambiguity in the above "key
> sentence".
>
> Then keeping that in mind, we can rewrite the above you quoted like
> so without going technical and without risking ambiguity, like this:
>
>     ... a preliminary step that checks if files are moved across
>     directories while keeping their filenames the same. If there is
>     a file added to a directory whose contents is sufficiently
>     similar to a file with the same name that got deleted from a
>     different directory, ...

Nice, I like it!
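
For readers who want the idea spelled out, the preliminary step described in that wording can be sketched roughly as below. This is only an illustration in Python, not the actual diffcore code: match_same_name(), similarity(), the 0.5 threshold, and the rule of only considering names that are unique on both sides are all assumptions made for the sketch, and difflib's ratio stands in for the real similarity score.

```python
import posixpath
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in content similarity in [0, 1]; not the metric Git uses."""
    return SequenceMatcher(None, a, b).ratio()


def match_same_name(deleted: dict, added: dict, threshold: float = 0.5) -> dict:
    """Pair a deleted path with an added path that kept the same filename.

    `deleted` and `added` map paths to file contents. Only filenames that
    occur exactly once on each side are considered (an assumption of this
    sketch); everything else is left for the later pairwise comparison.
    """
    def by_name(paths):
        index = {}
        for path in paths:
            # "direc/tory/hello.txt" -> "hello.txt"
            index.setdefault(posixpath.basename(path), []).append(path)
        return index

    deleted_by_name = by_name(deleted)
    added_by_name = by_name(added)

    renames = {}
    for name, old_paths in deleted_by_name.items():
        new_paths = added_by_name.get(name, [])
        if len(old_paths) != 1 or len(new_paths) != 1:
            continue
        old, new = old_paths[0], new_paths[0]
        if similarity(deleted[old], added[new]) >= threshold:
            renames[old] = new
    return renames


deleted = {
    "direc/tory/hello.txt": "hello world\nline two\n",
    "old/notes.md": "alpha\n",
}
added = {
    "other/place/hello.txt": "hello world\nline 2\n",
    "new/readme.md": "alpha\n",
}
renames = match_same_name(deleted, added)
print(renames)
```

Here "hello.txt" moved across directories with nearly the same contents, so it is paired up front. "old/notes.md" has no same-named counterpart, so it stays unmatched and would be handled by the later pairwise step.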