Jeff King <peff@xxxxxxxx> writes:

> I think such a loose patch-id could just be a hash of the filenames that
> were changed by the patch (e.g., the first 32-bits of the sha1 of the
> concatenated filenames). Computing that should be about as expensive as
> a tree-diff. Per observation 2 above, if two commits do not have the
> same loose id, we know that they cannot possibly have the same strict
> id.

Because the "strict" one already hashes the filenames, if the set of
files touched by one patch differs from that of another patch, we
judge the two patches to be different.

> Then we can forget about the smaller-side and bigger-side entirely, and
> just do something like:
>
>   1. Make a sorted list (or hash table) of loose ids for one side.
>
>   2. For each commit on the other side, calculate its loose id and look
>      that up in the sorted list. If no hits, we know that there is no
>      match. For any hits, lazily calculate (and cache) the strict patch
>      id for both sides and compare as usual.
>
> In the best case, we compute no patch-ids at all. And even for the
> average case, I'd expect our lazy calculation to only have to compute a
> handful of ids.

Correct.

This has rather interesting ramifications on cherry-pick and rebase,
though. Both commands can handle changes that come from an old tree
before some paths were renamed, but strict patch-id would not spot
equivalent changes we already have in our history if our change
happened after a rename, i.e.

      Z
     /
    O---R---X---Y

where Z updates path F, R moves F to G and X changes G the same way
as Z changes F, and we are trying to cherry-pick Z on top of Y. The
cherry-pick filter will see different patch-ids for Z and X.

We will likely notice "patch already applied" (if using am-3
machinery) or "already up-to-date" (if using merge machinery) even
when we miss this equivalence, and drop the duplicate from the
result, so it is not a big loss. But we might want to consider
removing the filename from the patch-id computation, at least for the
ids we use internally and discard for revs->cherry_pick filtering.
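
To make the two points above concrete, here is a rough sketch in
Python. It is a toy model and not git's code: a "patch" is just a dict
mapping a path to its diff text, and loose_id(), strict_id() and
find_duplicates() are hypothetical helpers invented for illustration.
The commit names follow the picture above; W is an extra made-up
commit that duplicates Y.

```python
import hashlib

def loose_id(patch):
    """Cheap id: first 32 bits of the sha1 of the touched paths."""
    names = "\0".join(sorted(patch))
    return hashlib.sha1(names.encode()).hexdigest()[:8]

def strict_id(patch, with_names=True):
    """Toy stand-in for a patch-id: hashes diff text, and paths if asked."""
    h = hashlib.sha1()
    for path in sorted(patch):
        if with_names:
            h.update(path.encode() + b"\0")
        h.update(patch[path].encode() + b"\0")
    return h.hexdigest()

def find_duplicates(ours, theirs):
    """Return (commits in theirs already present in ours, strict ids computed)."""
    # Step 1: hash table of loose ids for one side.
    table = {}
    for name, patch in ours.items():
        table.setdefault(loose_id(patch), []).append(name)

    cache = {}  # strict ids are computed lazily and remembered

    def strict(side, name, patch):
        key = (side, name)
        if key not in cache:
            cache[key] = strict_id(patch)
        return cache[key]

    # Step 2: look up each commit on the other side by its loose id, and
    # compute strict ids only for the candidates that hit.
    dups = []
    for name, patch in theirs.items():
        candidates = table.get(loose_id(patch), [])
        if any(strict("ours", c, ours[c]) == strict("theirs", name, patch)
               for c in candidates):
            dups.append(name)
    return dups, len(cache)

# Assumed toy history: R renames F to G, X changes G the way Z changes F.
ours = {
    "R": {"F": "rename to G", "G": "rename from F"},
    "X": {"G": "+new line"},
    "Y": {"H": "+other"},
}
theirs = {"Z": {"F": "+new line"}, "W": {"H": "+other"}}

dups, computed = find_duplicates(ours, theirs)
same_with_names = strict_id(theirs["Z"]) == strict_id(ours["X"])
same_without_names = (strict_id(theirs["Z"], with_names=False)
                      == strict_id(ours["X"], with_names=False))
```

In this model only W is reported as a duplicate, and strict ids are
computed for just the one candidate pair. Z and X compare as different
while paths are hashed, and as equal once with_names=False is used.
Note that a loose id built from filenames, as in this sketch, would
still filter Z out before any strict comparison is made.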