Re: [BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> If it's not a problem, I'd love to see timings for your case with just
> the first patch, and then with both.

Thanks for the swift response, much appreciated Jeff!

Here are the timings for the two patches:

Patch 1 on top of 33d4221c79

       Elapsed           System              User
 Min.   :6.110   Min.   :0.6700   Min.   :0.3600
 1st Qu.:6.580   1st Qu.:0.6900   1st Qu.:0.3900
 Median :7.260   Median :0.7100   Median :0.4100
 Mean   :7.347   Mean   :0.7248   Mean   :0.4214
 3rd Qu.:8.000   3rd Qu.:0.7400   3rd Qu.:0.4600
 Max.   :8.860   Max.   :0.8700   Max.   :0.5100

I've had to slightly tweak your second patch (`freshened` was never
set) but applying the modified patch yielded even better results
compared to patch 1:

       Elapsed           System              User
 Min.   :0.38   Min.   :0.03000   Min.   :0.2900
 1st Qu.:0.38   1st Qu.:0.04000   1st Qu.:0.3100
 Median :0.39   Median :0.06000   Median :0.3200
 Mean   :0.43   Mean   :0.05667   Mean   :0.3519
 3rd Qu.:0.42   3rd Qu.:0.07000   3rd Qu.:0.3600
 Max.   :0.68   Max.   :0.08000   Max.   :0.5700

This is pretty much back to the "before" state.
The graph really tells the whole story:
https://bytebucket.org/snippets/ssaasen/GeRE/raw/7367353a58c50ccd7c493af40ffb6ba1533e1490/git-merge-timing-patched.png
(After is the change in #33d4221, Before the parent of #33d4221 and so on)
The graph and the NFS stats can be found here:
https://bitbucket.org/snippets/ssaasen/GeRE

My tweaked version of your second patch is:

diff --git a/cache.h b/cache.h
index 51ee856..8982055 100644
--- a/cache.h
+++ b/cache.h
@@ -1168,6 +1168,7 @@ extern struct packed_git {
        int pack_fd;
        unsigned pack_local:1,
                 pack_keep:1,
+               freshened:1,
                 do_not_close:1;
        unsigned char sha1[20];
        /* something like ".git/objects/pack/xxxxx.pack" */
diff --git a/sha1_file.c b/sha1_file.c
index bc6322e..c0ccd4b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2999,7 +2999,11 @@ static int freshen_loose_object(const unsigned
char *sha1)
 static int freshen_packed_object(const unsigned char *sha1)
 {
        struct pack_entry e;
-       return find_pack_entry(sha1, &e) && freshen_file(e.p->pack_name);
+       if (!find_pack_entry(sha1, &e))
+              return 0;
+       if (e.p->freshened)
+              return 1;
+       return e.p->freshened = freshen_file(e.p->pack_name);
 }

 int write_sha1_file(const void *buf, unsigned long len, const char
*type, unsigned char *returnsha1)



The only change is that I assign the result of `freshen_file` to the
`freshened` flag.

I've only ran this with the test case I was using before but it looks
like this is pretty much fixing the merge time changes we observed.

Thanks again for the swift response. I've got my test setup sitting
here, happy to rerun the tests if that'd be useful.

Is there a chance to backport those changes to the 2.2+ branches?

> You may also be interested in:
>
>   http://thread.gmane.org/gmane.comp.version-control.git/266370
>
> which addresses another performance problem related to the
> freshen/recent code in v2.2.

Thanks for the pointer, I'll have a look at that as well.

Cheers,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]