Re: [PATCH] git-p4: speed up search for branch parent

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 29/04/2021 07:48, Joachim Kuebart wrote:
On Thu, 29 Apr 2021 at 04:22, Junio C Hamano <gitster@xxxxxxxxx> wrote:

"Joachim Kuebart via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

From: Joachim Kuebart <joachim.kuebart@xxxxxxxxx>

Thanks.  As git-p4 is not in my area of expertise, I'll make a style
critique first, while pinging Luke as an area expert (you can learn
who they are with "git shortlog --no-merges --since=18.months.ago
git-p4.py").

Hi Junio, thanks for your timely and thorough review and for putting
up with my greenhorn mistakes ;-)

Previously, the code iterated through the parent branch commits and
compared each one to the target tree using diff-tree.

It is customary in this project to describe the problem in the
present tense.  In other words, whoever is writing the log message
still lives in the world without this patch applied to the system.

I will rephrase the commit message and give better details as you
mentioned. Thanks a lot for your suggestions!

When making a claim on performance, it is helpful to our readers to
give some numbers, even in a limited test, e.g.

     In a sample history where ~100 commits needed to be traversed to
     find the fork point on my Windows box, the current code took
     10.4 seconds to complete, while the new code yields the same
     result in 1.8 seconds, which is a significant speed-up.

or something along these lines.

I will add some measurements.

Signed-off-by: Joachim Kuebart <joachim.kuebart@xxxxxxxxx>
  git-p4.py | 22 +++++++++++-----------
  1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 09c9e93ac401..dbe94e6fb83b 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -3600,19 +3600,19 @@ def importNewBranch(self, branch, maxChange):
          return True

      def searchParent(self, parent, branch, target):
+        for tree in read_pipe_lines(["git", "rev-parse",
+                                     "{}^{{tree}}".format(target)]):
+            targetTree = tree.strip()

It looks very strange to run a commit that you expect a single line
of output, and read the result in a loop.  Doesn't git-p4.py supply
a more suitable helper to read a single line output from a command?

You're absolutely right that this isn't very readable. I had a quick
look around for a function that reads a single-line response, but I'll
look again and come up with a clearer solution.

I don't think there is one - git-p4 has lots of functions for calling `p4', but for calling git, it just uses Python's Popen() API.

A good question is whether we can start taking advantage of the newer features in Python3 which will obviously break backward compatibility.


+        for blob in read_pipe_lines(["git", "rev-list", "--format=%H %T",
                                       "--no-merges", parent]):

This is not a new problem you introduced, but when we hear word
"blob" in the context of this project, it reminds us of the "blob"
object, while the 'blob' variable used in this loop has nothing to
do with it.  Perhaps rename it to say 'line' or something? >
Will do, thanks!

It confused me as well.


+            if blob[:7] == "commit ":
+                continue

Perhaps blob.startswith("commit ") to avoid hardcoded string length?

Yes, that's the name of the function that I can never think of when I need it.

Thanks again for your comments,

Joachim


There are existing tests for importing branches which should cover this. I don't know if they need to be extended or not, you might want to check.

Looks good otherwise.





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux