Hi Eric,

On Tue, Feb 23, 2016 at 01:56:07PM -0600, Eric W. Biederman wrote:
>
> Fengguang Wu, Xiaolong Ye, have you attempted to use the truncated
> sha1 of the file the patch applies to?  Git already places a file sha1
> at the top of a patch.  See the index line?
>
> > diff --git a/fs/namespace.c b/fs/namespace.c
> > index eccd925c6e82..3c3f8172c734 100644

Yes, we've evaluated making use of that index line. The conclusion is
that it helps make a better guess, but it is still guesswork and far
from perfect. A simple count shows that only about 1/5 of the files
change between two major kernel releases:

        wfg /c/linux% git ls-files |wc -l
        52915
        wfg /c/linux% git diff --name-only v4.3 v4.4|wc -l
        10606

Since roughly 4/5 of the files are untouched across a release, a huge
number of candidate base tree IDs will match the given blob IDs.

> > --- a/fs/namespace.c
> > +++ b/fs/namespace.c
>
> As I understand it you are aiming for making a good guess what the patch
> or patches apply to, having a set of file hashes looks like it would
> give you that.
>
> All it should take is to iterate over a patchset and for each file in
> the patchset capture the first file hash.  Then in the smallish set of
> maintainer trees see if that set of file hashes matches any of their
> recent commits.  You should be able to prune the set of possible
> maintainer trees even more by looking at the mailing list or lists
> the patch was submitted to.

We actually started with that line of thinking half a year ago. Yes, it
helps narrow down the list of candidate maintainer trees, and the
chances improve when the patchset modifies multiple files, and because
some files are modified more frequently than others.

However, it is still fundamentally guesswork. The best choice is to ask
for an explicit "base tree ID".

> Before we talk about adding anything more I think we need a clear
> picture of what you have tried with what already exists.  A decade ago
> part of the problem was that not everyone used git.  At best it will
> take a little while before everyone upgrades to a version of git diff
> containing your changes, and possibly even longer if they have to
> start specifying an additional option when a diff is generated.

That's a valid concern. It may take as long as a year for the new
feature to reach a reasonable population of users. To speed up the
process, we could advocate the new git option in the 0day robot's error
reports. Since we catch errors in ~10 LKML patches each day, within
months most kernel developers should have received the tips on how to
set it up and enable the feature by default.

Thanks,
Fengguang
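To illustrate why matching on the index-line blob IDs remains a guess, here is a minimal Python sketch of the approach discussed above. It is not the 0day robot's actual code. The function names, the `trees` mapping (commit to {path: full blob ID}) and the commit labels are all hypothetical, and the zero-padded full blob IDs are made up for the example. The sketch also assumes paths without spaces.

```python
import re

# "diff --git a/<path> b/<path>" header; ties each index line to a path.
DIFF_RE = re.compile(r"^diff --git a/(\S+) b/(\S+)$")
# "index <old>..<new>[ <mode>]"; the old side is the pre-image blob ID.
INDEX_RE = re.compile(r"^index ([0-9a-f]+)\.\.([0-9a-f]+)(?: \d+)?$")


def preimage_blobs(patch_text):
    """Return {path: abbreviated pre-image blob ID}, first hash per file."""
    blobs = {}
    path = None
    for line in patch_text.splitlines():
        m = DIFF_RE.match(line)
        if m:
            path = m.group(1)
            continue
        m = INDEX_RE.match(line)
        if m and path is not None and path not in blobs:
            blobs[path] = m.group(1)
    return blobs


def candidate_bases(blobs, trees):
    """Return every commit whose tree contains all pre-image blobs.

    `trees` is a hypothetical, precomputed {commit: {path: full blob ID}}.
    """
    return [
        commit
        for commit, files in trees.items()
        if all(files.get(p, "").startswith(b) for p, b in blobs.items())
    ]


if __name__ == "__main__":
    patch = (
        "diff --git a/fs/namespace.c b/fs/namespace.c\n"
        "index eccd925c6e82..3c3f8172c734 100644\n"
        "--- a/fs/namespace.c\n"
        "+++ b/fs/namespace.c\n"
    )
    old = "eccd925c6e82" + "0" * 28  # made-up full blob ID
    new = "3c3f8172c734" + "0" * 28  # made-up full blob ID
    trees = {
        "commit-a": {"fs/namespace.c": old},
        "commit-b": {"fs/namespace.c": old},  # file untouched since commit-a
        "commit-c": {"fs/namespace.c": new},
    }
    print(candidate_bases(preimage_blobs(patch), trees))
```

In this toy data both commit-a and commit-b match, because the file did not change between them. That is the ambiguity described above: with 10606 of 52915 files (about 20%) changing between v4.3 and v4.4, most blobs are shared by long runs of commits.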