[PATCH v2 0/2] git-p4: speed up search for branch parent

In this iteration, I have added more context and measurements to the commit
message.

I have also made small improvements to the code suggested by reviewers.

I extended t9801-git-p4-branch.sh to verify that branches are branched off at
the correct point in their parents' history.

Signed-off-by: Joachim Kuebart <joachim.kuebart@xxxxxxxxx>

cc: Joachim Kuebart <joachim.kuebart@xxxxxxxxx>

Joachim Kuebart (2):
  git-p4: ensure complex branches are cloned correctly
  git-p4: speed up search for branch parent

 git-p4.py                | 21 ++++++++++-----------
 t/t9801-git-p4-branch.sh |  2 ++
 2 files changed, 12 insertions(+), 11 deletions(-)


base-commit: 311531c9de557d25ac087c1637818bd2aad6eb3a
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1013%2Fjkuebart%2Fp4-faster-parent-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1013/jkuebart/p4-faster-parent-v2
Pull-Request: https://github.com/git/git/pull/1013

Range-diff vs v1:

 -:  ------------ > 1:  0ee0b7b55691 git-p4: ensure complex branches are cloned correctly
 1:  a171f7e6c023 ! 2:  41b3a23f682c git-p4: speed up search for branch parent
     @@ Metadata
       ## Commit message ##
          git-p4: speed up search for branch parent
      
     -    Previously, the code iterated through the parent branch commits and
     -    compared each one to the target tree using diff-tree.
     +    For every new branch that git-p4 imports, it needs to find the commit
     +    where it branched off its parent branch. While p4 doesn't record this
     +    information explicitly, the first changelist on a branch is usually an
     +    identical copy of the parent branch.
      
     -    This patch outputs the revision's tree hash along with the commit hash,
     -    thereby saving the diff-tree invocation. This results in a considerable
     -    speed-up, at least on Windows.
     +    The method searchParent() tries to find a commit in the history of the
     +    given "parent" branch whose tree exactly matches the initial changelist
     +    of the new branch, "target". The code iterates through the parent
     +    commits and compares each of them to this initial changelist using
     +    diff-tree.
     +
     +    Since we already know the tree object name we are looking for, spawning
     +    diff-tree for each commit is wasteful.
     +
     +    Use the "--format" option of "rev-list" to find out the tree object name
     +    of each commit in the history, and find the tree whose name is exactly
     +    the same as the tree of the target commit to optimize this.
     +
     +    This results in a considerable speed-up, at least on Windows. On one
     +    Windows machine with a fairly large repository of about 16000 commits in
     +    the parent branch, the current code takes over 7 minutes, while the new
     +    code only takes just over 10 seconds for the same changelist:
     +
     +    Before:
     +
     +        $ time git p4 sync
     +        Importing from/into multiple branches
     +        Depot paths: //depot
     +        Importing revision 31274 (100.0%)
     +        Updated branches: b1
     +
     +        real    7m41.458s
     +        user    0m0.000s
     +        sys     0m0.077s
     +
     +    After:
     +
     +        $ time git p4 sync
     +        Importing from/into multiple branches
     +        Depot paths: //depot
     +        Importing revision 31274 (100.0%)
     +        Updated branches: b1
     +
     +        real    0m10.235s
     +        user    0m0.000s
     +        sys     0m0.062s
      
          Signed-off-by: Joachim Kuebart <joachim.kuebart@xxxxxxxxx>
     +    Helped-by: Junio C Hamano <gitster@xxxxxxxxx>
     +    Helped-by: Luke Diamand <luke@xxxxxxxxxxx>
      
       ## git-p4.py ##
      @@ git-p4.py: def importNewBranch(self, branch, maxChange):
     @@ git-p4.py: def importNewBranch(self, branch, maxChange):
           def searchParent(self, parent, branch, target):
      -        parentFound = False
      -        for blob in read_pipe_lines(["git", "rev-list", "--reverse",
     -+        for tree in read_pipe_lines(["git", "rev-parse",
     -+                                     "{}^{{tree}}".format(target)]):
     -+            targetTree = tree.strip()
     -+        for blob in read_pipe_lines(["git", "rev-list", "--format=%H %T",
     ++        targetTree = read_pipe(["git", "rev-parse",
     ++                                "{}^{{tree}}".format(target)]).strip()
     ++        for line in read_pipe_lines(["git", "rev-list", "--format=%H %T",
                                            "--no-merges", parent]):
      -            blob = blob.strip()
      -            if len(read_pipe(["git", "diff-tree", blob, target])) == 0:
      -                parentFound = True
     -+            if blob[:7] == "commit ":
     ++            if line.startswith("commit "):
      +                continue
     -+            blob = blob.strip().split(" ")
     -+            if blob[1] == targetTree:
     ++            commit, tree = line.strip().split(" ")
     ++            if tree == targetTree:
                       if self.verbose:
      -                    print("Found parent of %s in commit %s" % (branch, blob))
      -                break
     @@ git-p4.py: def importNewBranch(self, branch, maxChange):
      -            return blob
      -        else:
      -            return None
     -+                    print("Found parent of %s in commit %s" % (branch, blob[0]))
     -+                return blob[0]
     ++                    print("Found parent of %s in commit %s" % (branch, commit))
     ++                return commit
      +        return None
       
           def importChanges(self, changes, origin_revision=0):
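
For readers skimming the range-diff, the lookup described in the commit
message of patch 2/2 can be sketched outside git-p4 roughly as follows. This
is an illustration under assumptions, not the code from git-p4.py: the names
find_commit_with_tree and search_parent are hypothetical, and subprocess
stands in for git-p4's read_pipe helpers.

```python
import subprocess


def find_commit_with_tree(rev_list_lines, target_tree):
    """Return the first commit whose tree equals target_tree, or None.

    rev_list_lines is expected to hold the output lines of
    `git rev-list --format="%H %T" --no-merges <parent>`.
    """
    for line in rev_list_lines:
        # rev-list prints a "commit <hash>" header before each formatted line.
        if line.startswith("commit "):
            continue
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        commit, tree = fields
        if tree == target_tree:
            return commit
    return None


def search_parent(parent, target):
    """Hypothetical wrapper: find the commit on parent matching target's tree."""
    # Resolve the tree object name of the target commit once.
    target_tree = subprocess.check_output(
        ["git", "rev-parse", "{}^{{tree}}".format(target)], text=True).strip()
    # List commit and tree names for the parent's history in a single git call.
    history = subprocess.check_output(
        ["git", "rev-list", "--format=%H %T", "--no-merges", parent], text=True)
    return find_commit_with_tree(history.splitlines(), target_tree)
```

The comparison is a plain string match on tree object names, so the whole
search costs two git invocations instead of one diff-tree per parent commit.
Because rev-list lists the newest commits first by default, the sketch returns
the most recent matching commit when several commits share the same tree.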

-- 
gitgitgadget


