Re: [PULL REQUEST] Please pull rdma.git

Doug Ledford <dledford@xxxxxxxxxx> · Wed, 19 Jul 2017 18:05:08 -0400

On 7/19/2017 4:40 PM, Bart Van Assche wrote:
> On Wed, 2017-07-19 at 13:54 -0400, Doug Ledford wrote:
>> On Tue, 2017-07-18 at 21:42 -0600, Robert LeBlanc wrote:
>>> On Tue, Jul 18, 2017 at 1:26 PM, Linus Torvalds
>>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>> On Tue, Jul 18, 2017 at 12:07 PM, Doug Ledford <dledford@xxxxxxxxxx>
>>>> wrote:
>>>>>
>>>>>
>>>
>>> I'm trying to understand why merges are being done instead of
>>> rebases.
>>> Since we don't want to include other people's work, it seems that it
>>> is cleaner to do a rebase. This is more for my education with using
>>> Git with such a large project rather than me suggesting something
>>> useful. (I dropped Linus from this part of the thread so as not to
>>> bother him with an off-topic conversation).
>>
>> Rebases change the history of a patch.  If I commit a patch on July
>> 7th, and then rebase on July 20th, the patch gets rewritten with the
>> new date.  In addition, they get new commit hashes.  So if someone
>> pulls my tree on the 9th, sees the commit hash for their patch, and
>> then references it in an email or a bug report, then I rebase on the
>> 20th, the old commit hash is gone and will be replaced with a new one. 
>> Finally, if someone pulls my tree on the 8th, and then again on the
>> 22nd, and they don't know I've rebased it, the pull will attempt to put
>> all of the new hashes on top of the old hashes for the same commits. 
>> It creates a ton of merge work that is error prone.  Sometimes chunks
>> get added twice, stuff like that.
>>
>> There are a few things you can do to get around this, and I sometimes
>> use those tricks.  I've declared on-list that my github repo is subject
>> to being rebased at any time, so people know this.  I also have my
>> github repo as the source for my 0day testing.  So, I can push to
>> github, wait for 0day test results, and if there was a problem, I can
>> fix it using a rebase of whatever patch was broken, repush to github,
>> and repeat until 0day testing passes, and only then do I push to my
>> kernel.org repo, which is taken to be inviolate and rebases are
>> forbidden.  But even there, if I *really* have to, I can rebase by
>> deleting the branch I originally created and creating a new branch with
>> the rebase on it under a different name.  That prevents someone from
>> accidentally pulling the rebase on top of a previous pull.  But I
>> *really* try to avoid that.
> 
> Hello Doug,
> 
> Rebases do not only change the commit ID of a patch but can also change the
> patch itself. If e.g. a patch (a) that was posted on the linux-rdma mailing
> list contains two changes and a patch (b) contains only one of two changes,
> if the rdma tree gets rebased on top of a tree that has patch (b) then the
> rebase will modify patch (a) such that only the changes that were not in
> patch (b) remain. git will neither complain about this nor report it.

This is no different than if you run git am on patch (b) and then patch
(a) on the same branch, except that the change will be dropped during the
"conflicts existed when applying the patch, please apply and resolve then
commit the result" step.  In that case patch (a) will lose exactly the
same data that git drops during a rebase.
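
To make the patch (a) / patch (b) case concrete, here is a rough toy model
in Python.  It is not how git works internally; it only assumes that a patch
is a list of single-line replacements and that a hunk whose result is already
present gets skipped.  The file name and the two fixes are made up for
illustration.

```python
# Toy model, not git itself: a patch is a list of hunks, and each hunk
# replaces one exact line in one file.
from typing import NamedTuple


class Hunk(NamedTuple):
    path: str
    old: str
    new: str


def apply_patch(tree, patch):
    """Apply hunks to tree in place; return the hunks that changed something."""
    applied = []
    for hunk in patch:
        lines = tree[hunk.path]
        if hunk.new in lines and hunk.old not in lines:
            # The change is already present in the tree: skipped silently.
            continue
        lines[lines.index(hunk.old)] = hunk.new
        applied.append(hunk)
    return applied


# Hypothetical changes to a hypothetical file.
fix_len = Hunk("cm.c", "len = 8;", "len = 16;")
fix_ret = Hunk("cm.c", "return 0;", "return ret;")

patch_a = [fix_len, fix_ret]   # two changes, as posted to the list
patch_b = [fix_len]            # only one of those two changes

tree = {"cm.c": ["len = 8;", "return 0;"]}
apply_patch(tree, patch_b)                 # (b) lands first
effective_a = apply_patch(tree, patch_a)   # then (a) is replayed on top

print(effective_a)   # only fix_ret is left of patch (a)
```

Either way you get there (rebase, or git am followed by a manual
resolution), what ends up recorded for patch (a) is the single remaining
hunk, and the final tree is the same.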

> Since
> a rebase can modify a patch it also invalidates any testing and reviews for
> that patch.

The same can be said of the situation I listed above.  It's up to my
discretion to determine if the change is significant enough to warrant
going back and getting new reviews.  If the issue is that there was a
typo in a patch that caused a build error, I'm not going to invalidate
reviews for that.

> This is why Linus hates rebases and why nowadays Linus refuses to
> accept any pull requests of trees that have been rebased. I hope I
> misunderstood you but if you are routinely rebasing branches with patches
> that come from the linux-rdma mailing list please stop doing this.

Routinely?  No.  Where I do use it is when I'm taking a stack of patches,
either into their own branch or directly onto my branch:  if I run into
problems in the build test phase, I'll rebase to fix the build issues.
But that's limited to the stack I just took, so it all
happens as part of the "review, integrate, build, test" cycle for that
day's work.  By the time I leave for the day and push it to k.o (or at
worst come in the next morning and check 0day status and then either
fixup or push to k.o), it won't be rebased any more.  And if I split my
day's work up into different stacks, then once one stack is pushed to
k.o, that stack won't be rebased even if later stacks that same day need
rebasing to build/work properly.
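
Put another way, the rule is "anything at or below the tip I pushed to k.o
is frozen; only what sits on top of it may still be rewritten".  A minimal
sketch of that rule in Python, with made-up commit names and a hypothetical
helper called rewritable():

```python
def rewritable(branch, published_tip):
    """Return the commits on branch that come after published_tip.

    branch is ordered oldest to newest.  Everything up to and including
    published_tip is frozen; only the tail may still be rebased.
    """
    if published_tip is None:
        # Nothing has been pushed yet, so the whole stack is fair game.
        return list(branch)
    idx = branch.index(published_tip)
    return branch[idx + 1:]


# Hypothetical day's work: two stacks applied in order.
branch = ["stack1-p1", "stack1-p2", "stack2-p1", "stack2-p2"]
ko_tip = "stack1-p2"   # stack 1 has already been pushed to k.o

print(rewritable(branch, ko_tip))   # ['stack2-p1', 'stack2-p2']
```

In git terms the rewritable part corresponds to the range between the
published tip and HEAD.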

> A model that some other kernel maintainers (e.g. Jens Axboe) follow is as
> follows:
> * Maintain one branch per pull request that will be sent to Linus, e.g.
>   for-4.13/block. Never rebase this branch, never rewrite its history and
>   only merge Linus' branch into this branch if absolutely necessary.
> * Every time a patch has to be applied, use "git am" to apply it to the
>   appropriate branch. Complain on the mailing list if "git am" complains.
>   Add the maintainer Signed-off-by and edit the patch if this is considered
>   necessary.
> * Maintain a for-next branch that is the result of merging all branches that
>   will be sent to Linus. Resolve any merge conflicts if necessary. Ensure
>   that this branch is included in Stephen Rothwell's linux-next tree and that
>   it gets tested by the zero-day testing infrastructure.

This is fairly similar to what I do.  I really only use rebases as part
of my internal fixups for build issues in a given day's/stack's
integration, which, BTW, is in line with what Linus says in the email
you linked to.

-- 
Doug Ledford <dledford@xxxxxxxxxx>
    GPG Key ID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
