Re: [StGit PATCH 03/14] Write to a stack log when stack is modified

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



2008/6/18 Karl Hasselström <kha@xxxxxxxxxxx>:
> On 2008-06-18 17:16:10 +0100, Catalin Marinas wrote:
>> We could also easily use the commit message directly for metadata
>> instead of .git/patches/<branch>/.
>
> Yes. This is true for any log scheme, though. (But I agree -- longer
> term, it would be very nice to have _only_ the log.)

It would make collaboration using stgit stacks easier. I can start
looking at this once we agree on the log structure.

>> The time to create that tree and blobs worries me a bit, plus the
>> (in my view) complicated structure.
>
> You're right to worry. My log format makes things feel slightly slower
> when logging a 20-30 deep stack. If I can't make it faster, it's not
> viable. But I'm pretty sure I can -- it should be simple to reuse all
> the trees and blobs for the patches that weren't touched. (And
> operations that touch lots of patches, like rebase, have to write a
> lot of git objects anyway as part of their operation, so the relative
> slowdown should not be large.)
>
> Anyway, benchmarking is the way to go here. Talk will only get us that
> far.

In the past, I ran some tests because people complained that stgit was
slow compared to quilt:

http://article.gmane.org/gmane.comp.version-control.git/9670

After profiling (the stg-prof file), as expected, most of the time was
spent in external Git calls which I tried to keep to a minimum. With
your approach, you probably add at least 4-5 calls to Git (just a
guess, I haven't counted) and, with a big repository, it will be
visible (I have about 15 branches on my Linux kernel clone going back
to the 2.6.12 kernel and the .git size is over 1/2GB after git-gc).

>> Making the first log entry special gets difficult with log pruning
>> (unless you prune the whole log rather than entries older than a
>> chosen time or number) since you might have to re-create all the
>> chained log entries as the first log's sha1 will probably change.
>
> You have to re-create all the commits anyway, since they all are
> immutable, and all have a back pointer.

Ah, OK. So, at least initially, we should only support the full log pruning.

>> The applied patches are chained automatically via HEAD. For
>> unapplied patches, we could add the correponding commits as parents
>> of the logging commit (starting with the third parent as the first
>> two are used for log chaining and applied patches). Do we hit any OS
>> limit with the number of arguments?
>
> Not until long after we hit git limits to the number of parents of a
> commit. I believe the octopus merge refuses to create merges with more
> than about 25 parents, and we probably shouldn't do more than that
> either. We'll have to do a tree of octopuses.

For the first log only, we could chain the unapplied patches using
commits with 2 parents. We just need to warn people not to stare at
the <branch>.stgit directly :-)

>> Since we only need the unapplied commits tracking for gc and pull
>> reasons, we could only create that commit that references them when
>> we prune the stack log and link it from the top one (but it won't be
>> used by stgit).
>
> Yes, we need to create an "unapplied octopus" if and only if we have
> unapplied patches that we can't prove are reachable from the stack top
> or the branch head (we have to save both, in case the user has done
> something such as git-committing on topp of the stack and caused head
> != top). Which is for the first log entry, and in situations such as
> "stg pick --unapplied", but not for "stg pop" and the like.

Yes, but just chain unapplied commits rather than using octopus (you
could try to the -mm series with 500+ patches and see what happens).

> I do agree that we shouldn't try to use the octopuses to get hold of
> the commits, though -- just to keep them reachable. We save the sha1
> along with the patch name elsewhere in a more convenient form. (My
> proposed format does precisely this.)

Yes, but you still have to refer the tree objects corresponding to a
patch and extra work to generate the simplified log.

> So. If I got it right, your proposal is:
>
>  * Tree: just take the HEAD tree.

Yes.

>  * Commit message: list the applied and unapplied patches with their
>    commit sha1s.

We don't even need to differentiate between applied and unapplied in
the commit message as long as they are listed in order since one of
the parents would represent the boundary between them. Since lib.stack
is pretty well structured, we could later modify PatchOrder to use the
log.

>  * Parents: the previous log entry; branch head; something that
>    (recusively) points to all unapplied commits, if necessary.

As you pointed below, "branch head" should probably be the "stack
top". We don't need to track the "branch head" if different, just need
to fix up the error and add the patches to the stack. And, anyway, if
one modifies the HEAD using git directly, the log will still point to
the top of the stack.

The third head would only be needed for the first log entry or when we
use pick --unapplied (in the latter, it only points to the unapplied
commit). The third head shouldn't be used at all in the log, just
created when needed.

> I'd add to that:
>
>  * The stack top must be a parent too if head != top.

See above, branch head would actually be the stack top.

>  * The commit message must include a version number, and the branch
>    head sha1.

OK with the version number but the branch head (or stack top) is one
of the parents already.

>  * I'm pretty sure we want the kind of "simplified" log I have in my
>    proposal. The full log in your proposal is going to look every bit
>    as ugly as the one in mine.

I agree it will look ugly but the simplified log adds an extra
overhead on any stgit action. If we don't use stg log -g, a text only
log command could show the diff. We can add it afterwards though if it
is fast enough.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux