[JGIT PATCH 00/18] Misc. performance tweaks

Cloning linux-2.6.git through JGit was painful at best.  I found
and fixed some small bottlenecks after a day of profiling and
experimentation, but we're still slower than C git.

With this series I managed to cut the time for "git clone --bare"
over git://, using a "jgit daemon" server and a "C git" client, to
about a third.  The client is C git in every run, so any difference
between jgit and "C git" is on the server side.

  before:  7m42.488s
  after :  2m33.882s
  C git :  1m26.158s     ("git daemon" server)
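
As a rough check of the figures above, the following sketch converts
the three timings to milliseconds and derives the speedup and the
remaining gap.  The class and method names are made up for this
illustration; nothing here is part of JGit.

```java
public class CloneTimings {
    // Parse a `time`-style value such as "7m42.488s" into milliseconds.
    // Assumes the exact "<minutes>m<seconds>s" layout used in the table.
    static long parseMillis(String s) {
        int m = s.indexOf('m');
        long minutes = Long.parseLong(s.substring(0, m));
        double seconds = Double.parseDouble(s.substring(m + 1, s.length() - 1));
        return minutes * 60_000L + Math.round(seconds * 1000.0);
    }

    // Ratio of two durations, e.g. how many times faster "after" is than "before".
    static double ratio(long slowerMillis, long fasterMillis) {
        return (double) slowerMillis / fasterMillis;
    }

    public static void main(String[] args) {
        long before = parseMillis("7m42.488s"); // jgit daemon, before the series
        long after = parseMillis("2m33.882s");  // jgit daemon, after the series
        long cgit = parseMillis("1m26.158s");   // git daemon

        System.out.printf("speedup from the series: %.2fx%n", ratio(before, after));
        System.out.printf("still slower than C git: %.2fx%n", ratio(after, cgit));
        System.out.printf("remaining gap: %.3fs%n", (after - cgit) / 1000.0);
    }
}
```

That works out to roughly a 3x improvement, with about 68 seconds
still separating the jgit server from the C git server.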


So I'm still seeing a major bottleneck that I can't quite fix.

Object enumeration (aka "Counting ...") takes too long, because we
spend a huge amount of time unpacking delta chains for trees so we
can enumerate their referenced items.

Our UnpackedObjectCache gets a <4% hit ratio when doing the trees
for linux-2.6.git.  Increasing the cache size doesn't give a
noticeable improvement in performance.

I tried rewriting UnpackedObjectCache to permit multiple objects
per hash bucket.  Even with that (and the maximum chain length
per bucket not exceeding 4 items) our hit ratio was still <5%,
so I tossed that implementation out.
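
For readers who want to picture that experiment, here is a minimal
sketch of a cache that permits several entries per hash bucket, caps
each chain at 4 items, and counts hits and misses.  It is not the
implementation that was tried or the real UnpackedObjectCache: the
class name, the use of a plain long key (standing in for whatever
identifies an object in a pack), and the eviction of the oldest chain
entry are all assumptions made for illustration.

```java
public class BucketedCache {
    // Cap taken from the text: at most 4 items per bucket.
    static final int MAX_CHAIN = 4;

    private static final class Entry {
        final long key;
        final byte[] data;
        Entry next;

        Entry(long key, byte[] data, Entry next) {
            this.key = key;
            this.data = data;
            this.next = next;
        }
    }

    private final Entry[] buckets;
    long hits;
    long misses;

    BucketedCache(int bucketCount) {
        buckets = new Entry[bucketCount];
    }

    private int indexFor(long key) {
        int h = (int) (key ^ (key >>> 32));
        return (h & 0x7fffffff) % buckets.length;
    }

    // Look up a key, recording a hit or a miss.
    byte[] get(long key) {
        for (Entry e = buckets[indexFor(key)]; e != null; e = e.next) {
            if (e.key == key) {
                hits++;
                return e.data;
            }
        }
        misses++;
        return null;
    }

    // Insert at the head of the chain, then trim the chain.
    void put(long key, byte[] data) {
        int i = indexFor(key);
        Entry head = new Entry(key, data, buckets[i]);
        buckets[i] = head;
        int length = 1;
        Entry prev = head;
        while (prev.next != null) {
            if (prev.next.key == key || length == MAX_CHAIN) {
                // Drop a stale duplicate, or the oldest entries once the chain is full.
                prev.next = prev.next.next;
            } else {
                prev = prev.next;
                length++;
            }
        }
    }

    // Fraction of lookups that were hits; 0 if there were no lookups.
    double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        BucketedCache cache = new BucketedCache(1); // one bucket forces chaining
        for (long k = 1; k <= 5; k++) {
            cache.put(k, new byte[] { (byte) k });
        }
        cache.get(1); // miss: evicted when the fifth entry arrived
        cache.get(5); // hit
        System.out.println("hit ratio: " + cache.hitRatio());
    }
}
```

Counting hits and misses inside get() is what makes a figure like
"<5%" measurable; the chaining only helps if the same objects are
requested again before they are evicted.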

"jgit rev-list --objects" is about 1m slower than "git rev-list
--objects".  That accounts for most of the time difference I noted
above between jgit and C git on the server side.

So with this series, we're better.  It's actually almost tolerable
to clone linux-2.6 through a jgit-backed server.


Shawn O. Pearce (23):
  Improve hit performance on the UnpackedObjectCache
  Add MutableObjectId.clear() to set the id to zeroId
  Allow TreeWalk callers to pass a MutableObjectId to get the current
    id
  Switch ObjectWalk to use the new MutableObjectId form in TreeWalk
  Change walker based fetch to use TreeWalk's MutableObjectId accessor
  Reduce garbage allocation when using TreeWalk
  Switch ObjectWalk to use a naked CanonicalTreeParser because its
    faster
  Remove the unused PackFile.get(ObjectId) form
  Remove getId from ObjectLoader API as its unnecessary overhead
  Make mmap mode more reliable by forcing GC at the correct spot
  Rewrite WindowCache to use a hash table
  Change ByteArrayWindow to read content outside of WindowCache's lock
  Dispose of RevCommit buffers when they aren't used in PackWriter
  Don't unpack delta chains while writing a pack from a pack v1 index
  Don't unpack delta chains while converting delta to whole object
  Defer parsing of the ObjectId while walking a PackIndex Iterator
  Only do one getCachedBytes per whole object written
  Correctly use a long for the offsets within a generated pack
  Allow more direct access to determine isWritten
  Move "wantWrite" field of ObjectToPack into the flags field
  Use an ArrayList for the reuseLoader collection in PackWriter
  Don't cut off existing delta chains if we are reusing deltas
  Correctly honor the thin parameter to PackWriter.writePack

 .../jgit/pgm/opt/AbstractTreeIteratorHandler.java  |    6 +-
 .../tst/org/spearce/jgit/lib/PackIndexTest.java    |    4 +-
 .../tst/org/spearce/jgit/lib/PackWriterTest.java   |   14 +-
 .../tst/org/spearce/jgit/lib/T0004_PackReader.java |    4 +-
 .../jgit/errors/CorruptObjectException.java        |   12 +
 .../src/org/spearce/jgit/lib/ByteArrayWindow.java  |   31 ++
 .../src/org/spearce/jgit/lib/ByteBufferWindow.java |   17 +
 .../src/org/spearce/jgit/lib/ByteWindow.java       |   20 ++-
 .../src/org/spearce/jgit/lib/Constants.java        |    2 +-
 .../spearce/jgit/lib/DeltaPackedObjectLoader.java  |    3 +-
 .../src/org/spearce/jgit/lib/MutableObjectId.java  |    9 +
 .../src/org/spearce/jgit/lib/ObjectLoader.java     |   38 ---
 .../src/org/spearce/jgit/lib/PackFile.java         |   49 +---
 .../src/org/spearce/jgit/lib/PackIndex.java        |   48 ++--
 .../src/org/spearce/jgit/lib/PackIndexV1.java      |   20 +-
 .../src/org/spearce/jgit/lib/PackIndexV2.java      |   27 +-
 .../src/org/spearce/jgit/lib/PackWriter.java       |   63 +++--
 .../src/org/spearce/jgit/lib/Repository.java       |   29 +--
 .../org/spearce/jgit/lib/UnpackedObjectCache.java  |   21 +-
 .../org/spearce/jgit/lib/UnpackedObjectLoader.java |   12 +-
 .../spearce/jgit/lib/WholePackedObjectLoader.java  |    3 +-
 .../src/org/spearce/jgit/lib/WindowCache.java      |  323 ++++++++++++--------
 .../src/org/spearce/jgit/lib/WindowCursor.java     |   16 +
 .../src/org/spearce/jgit/lib/WindowedFile.java     |   61 +++--
 .../src/org/spearce/jgit/revwalk/ObjectWalk.java   |   51 ++--
 .../src/org/spearce/jgit/revwalk/RevWalk.java      |    8 +-
 .../spearce/jgit/transport/PackedObjectInfo.java   |    2 +-
 .../src/org/spearce/jgit/transport/UploadPack.java |    1 +
 .../jgit/transport/WalkFetchConnection.java        |   48 ++-
 .../jgit/treewalk/AbstractTreeIterator.java        |   48 +++
 .../spearce/jgit/treewalk/CanonicalTreeParser.java |   85 +++++-
 .../src/org/spearce/jgit/treewalk/TreeWalk.java    |   88 +++++-
 .../spearce/jgit/util/CountingOutputStream.java    |    5 +-
 33 files changed, 752 insertions(+), 416 deletions(-)
