Re: [GSOC] [QUESTION] ref-filter: can %(raw) implement reuse oi.content?

Hi, Christian and Hariom,

I would like to use this patch series as the provisional final version of
my GSoC project:

https://github.com/adlternative/git/commits/cat-file-reuse-ref-filter-logic

The optimizations in the ref-filter-opt-code-logic and ref-filter-opt-perf
branches do not yet carry over to git cat-file --batch. Therefore, the
cat-file-reuse-ref-filter-logic branch is the most effective one for now.

This is the final performance regression test result:
Test                                        upstream/master   this tree
------------------------------------------------------------------------------------
1006.2: cat-file --batch-check              0.06(0.06+0.00)   0.08(0.07+0.00) +33.3%
1006.3: cat-file --batch-check with atoms   0.06(0.04+0.01)   0.06(0.06+0.00) +0.0%
1006.4: cat-file --batch                    0.49(0.47+0.02)   0.48(0.47+0.01) -2.0%
1006.5: cat-file --batch with atoms         0.48(0.44+0.03)   0.47(0.46+0.01) -2.1%

In this test, git cat-file --batch is about 2% faster, while
git cat-file --batch-check is still 33.3% slower.

In absolute terms, however, the slowdown of git cat-file --batch-check
is not very big.

upstream/master (225bc32a98):

$ hyperfine --warmup=10 "~/git/bin-wrappers/git cat-file --batch-check --batch-all-objects"
Benchmark #1: ~/git/bin-wrappers/git cat-file --batch-check --batch-all-objects
 Time (mean ± σ):     596.2 ms ±   5.7 ms    [User: 563.0 ms, System: 32.5 ms]
 Range (min … max):   586.9 ms … 607.9 ms    10 runs

cat-file-reuse-ref-filter-logic (709a0c5c12):

$ hyperfine --warmup=10 "~/git/bin-wrappers/git cat-file --batch-check --batch-all-objects"
Benchmark #1: ~/git/bin-wrappers/git cat-file --batch-check --batch-all-objects
 Time (mean ± σ):     601.3 ms ±   5.8 ms    [User: 566.9 ms, System: 33.9 ms]
 Range (min … max):   596.7 ms … 613.3 ms    10 runs

The mean execution times of git cat-file --batch-check differ by only
about 5 ms (596.2 ms versus 601.3 ms).
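
To put the two measurements side by side, here is a small sketch of the
relative-change arithmetic. The helper name percent_change is only an
illustration and is not part of git or of the perf test scripts.

```c
#include <assert.h>

/*
 * Relative change of `after` against `before`, in percent.
 * Hypothetical helper, used only to redo the arithmetic in this mail.
 */
double percent_change(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

With the perf test values, percent_change(0.06, 0.08) is about 33.3, while
with the hyperfine means, percent_change(596.2, 601.3) is about 0.86. The
perf test prints times with only two decimals, so a 0.02 s step on a 0.06 s
run probably exaggerates the relative gap.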

Now compare the execution time of git cat-file --batch:

upstream/master (225bc32a98):

$ time ~/git/bin-wrappers/git cat-file --batch --batch-all-objects >/dev/null
/home/adl/git/bin-wrappers/git cat-file --batch --batch-all-objects >  24.61s user 0.30s system 99% cpu 24.908 total

cat-file-reuse-ref-filter-logic (709a0c5c12):

$ time ~/git/bin-wrappers/git cat-file --batch --batch-all-objects >/dev/null
cat-file --batch --batch-all-objects > /dev/null  25.10s user 0.30s system 99% cpu 25.417 total

The total execution time differs by about 0.5 seconds between the two
runs (24.908 s versus 25.417 s). Intuition tells me that the performance of
git cat-file --batch matters more than that of --batch-check.

In fact, the original git cat-file code adds the obtained object data
directly to the output buffer. With the ref-filter logic, the object data
must first be copied into intermediate data (atom_value) and only then
into the output buffer. At present we cannot easily eliminate the
intermediate data, because git for-each-ref --sort depends on it heavily,
but we can reduce the overhead of copying and allocating memory as much
as possible.

I had an idea that I have not implemented yet: delayed evaluation of part
of the data. More specifically, the specific output content would be formed
only when the data is about to be added to the output buffer. This may be
a way to bypass the intermediate data.
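
To make the difference concrete, here is a minimal sketch that contrasts
the copy through intermediate data with a delayed variant. It is not the
ref-filter code: struct out_buf, struct eager_value, struct lazy_value and
the emit_* functions are hypothetical stand-ins, and the bytes_copied
counter exists only to make the extra copy visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable output buffer; a stand-in, not git's strbuf. */
struct out_buf {
	char *data;
	size_t len, cap;
	size_t bytes_copied; /* bytes memcpy'd on the way to the output */
};

void out_buf_add(struct out_buf *out, const char *src, size_t n)
{
	if (!n)
		return;
	if (out->len + n > out->cap) {
		size_t cap = out->cap ? out->cap : 64;
		while (cap < out->len + n)
			cap *= 2;
		out->data = realloc(out->data, cap);
		assert(out->data);
		out->cap = cap;
	}
	memcpy(out->data + out->len, src, n);
	out->len += n;
	out->bytes_copied += n;
}

/* Eager: the intermediate value owns a private copy of the object data. */
struct eager_value {
	char *copy;
	size_t len;
};

void emit_eager(struct out_buf *out, const char *obj, size_t n)
{
	struct eager_value v;

	v.copy = malloc(n ? n : 1);
	assert(v.copy);
	memcpy(v.copy, obj, n);          /* copy 1: object -> intermediate */
	v.len = n;
	out->bytes_copied += n;
	out_buf_add(out, v.copy, v.len); /* copy 2: intermediate -> output */
	free(v.copy);
}

/* Delayed: the value only borrows the object data until output time. */
struct lazy_value {
	const char *src;
	size_t len;
};

void emit_lazy(struct out_buf *out, const char *obj, size_t n)
{
	struct lazy_value v = { obj, n };

	out_buf_add(out, v.src, v.len);  /* single copy: object -> output */
}
```

For an object of n bytes, emit_eager() copies 2 * n bytes and allocates
once per value, while emit_lazy() copies n bytes. The price is that the
borrowed pointer must stay valid until output time, which is presumably the
hard part wherever values must be kept around, such as for sorting.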

To be optimistic, I think this patch series can be merged given the
current performance of git cat-file --batch. Of course, it still needs
more suggestions from reviewers.

Thanks.
--
ZheNing Hu



