Re: Why Git is so fast

Jakub Narebski <jnareb@xxxxxxxxx> wrote:
> Let's rephrase the question a bit then: what low-level operations were
> needed for good performance in JGit?

Aside from the message I just posted:

- Avoid String, it's too expensive most of the time.  Stick with
  byte[], and better, stick with data that is a triplet of (byte[],
  int start, int end) to define a region of data.  Yes, it's annoying,
  as it's 3 values you need to pass around instead of just 1, but
  it makes a big difference in running time.

- Avoid allocating byte[] for SHA-1s, instead we convert to 5 ints,
  which can be inlined into an object allocation.

- Subclass instead of containing references.  We extend ObjectId to
  attach application data, rather than contain a reference to an
  ObjectId.  Classical Java programming techniques would say this
  is a violation of encapsulation.  But it gets us the same memory
  impact that C Git gets by saying:

    struct appdata {
      unsigned char sha1[20];
      ....
    };

- We're hurting dearly for not having more efficient access to the
  pack-*.pack file data.  mmap in Java is crap.  We implement our
  own page buffer, reading in blocks of 8192 bytes at a time and
  holding them in our own cache.

  Really, we should write our own mmap library as an optional JNI
  thing, and tie it into libz so we can efficiently run inflate()
  off the pack data directly.

- We're hurting dearly for not having more efficient access to the
  pack-*.idx files.  Again, with no mmap we read the entire bloody
  index into memory.  Since you won't touch most of it, we keep it
  in a large byte[].  But since you are searching with an ObjectId
  (5 ints), we pay a conversion price on every search step, where
  we have to copy from the large byte[] to 5 local variable ints
  and then compare to the ObjectId.  It's an overhead C Git doesn't
  have to deal with.
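
To make the 5-int and subclassing points concrete, here is a minimal
sketch, not JGit's actual code.  The names (ObjectIdSketch, Id,
FlaggedId, find) are made up for illustration, and the table is
assumed to be a flat run of sorted 20-byte ids in one byte[], which
is simpler than the real pack-*.idx layout.  It shows the id held as
five int fields, application data attached by subclassing, and the
per-step conversion from byte[] to ints during a search.

```java
class ObjectIdSketch {
    static final int ID_LEN = 20;

    /** A SHA-1 stored as five int fields instead of a separate byte[20]. */
    static class Id {
        final int w1, w2, w3, w4, w5;

        Id(byte[] raw, int off) {
            w1 = readInt(raw, off);
            w2 = readInt(raw, off + 4);
            w3 = readInt(raw, off + 8);
            w4 = readInt(raw, off + 12);
            w5 = readInt(raw, off + 16);
        }

        /** Compare with the 20 raw bytes starting at raw[off], as unsigned big-endian. */
        int compareTo(byte[] raw, int off) {
            // This is the conversion price: every step decodes ints from the byte[].
            int c = Integer.compareUnsigned(w1, readInt(raw, off));
            if (c != 0) return c;
            c = Integer.compareUnsigned(w2, readInt(raw, off + 4));
            if (c != 0) return c;
            c = Integer.compareUnsigned(w3, readInt(raw, off + 8));
            if (c != 0) return c;
            c = Integer.compareUnsigned(w4, readInt(raw, off + 12));
            if (c != 0) return c;
            return Integer.compareUnsigned(w5, readInt(raw, off + 16));
        }
    }

    /** Application data attached by subclassing; the flags field is hypothetical. */
    static class FlaggedId extends Id {
        int flags;

        FlaggedId(byte[] raw, int off) {
            super(raw, off);
        }
    }

    /** Decode one big-endian int from four bytes. */
    static int readInt(byte[] b, int o) {
        return (b[o] & 0xff) << 24 | (b[o + 1] & 0xff) << 16
             | (b[o + 2] & 0xff) << 8 | (b[o + 3] & 0xff);
    }

    /** Binary search over sorted 20-byte ids; returns the entry number or -1. */
    static int find(byte[] table, Id id) {
        int lo = 0, hi = table.length / ID_LEN;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            int c = id.compareTo(table, mid * ID_LEN);
            if (c == 0) return mid;
            if (c < 0) hi = mid; else lo = mid + 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] table = new byte[3 * ID_LEN];   // three ids, sorted by first byte
        table[0] = 0x10;
        table[ID_LEN] = (byte) 0x80;
        table[2 * ID_LEN] = (byte) 0xf0;
        FlaggedId id = new FlaggedId(table, ID_LEN);
        id.flags = 1;
        System.out.println(find(table, id));
    }
}
```

A FlaggedId costs one object allocation in total, where holding a
reference to a separate id object (itself wrapping a byte[]) would
cost three.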

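The page buffer mentioned for the pack-*.pack data can be sketched
roughly as below.  This is only an illustration under assumptions,
not JGit's implementation: the class name BlockCacheSketch, the LRU
eviction policy, and the diskReads counter are all invented here.
Only the 8192-byte block size comes from the description above.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.util.LinkedHashMap;
import java.util.Map;

class BlockCacheSketch {
    static final int BLOCK_SIZE = 8192;

    private final RandomAccessFile file;
    private final Map<Long, byte[]> blocks;
    int diskReads;   // counts real reads, for illustration only

    BlockCacheSketch(RandomAccessFile file, final int maxBlocks) {
        this.file = file;
        // An access-ordered LinkedHashMap gives simple LRU eviction.
        this.blocks = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > maxBlocks;
            }
        };
    }

    /** Return the unsigned byte at a file position, loading its block if needed. */
    int byteAt(long pos) throws IOException {
        if (pos < 0 || pos >= file.length()) {
            throw new IndexOutOfBoundsException("pos " + pos);
        }
        long blockId = pos / BLOCK_SIZE;
        byte[] block = blocks.get(blockId);
        if (block == null) {
            block = load(blockId);
            blocks.put(blockId, block);
        }
        return block[(int) (pos % BLOCK_SIZE)] & 0xff;
    }

    /** Read one block from disk; the last block of the file may be short. */
    private byte[] load(long blockId) throws IOException {
        long start = blockId * BLOCK_SIZE;
        int len = (int) Math.min(BLOCK_SIZE, file.length() - start);
        byte[] block = new byte[len];
        file.seek(start);
        file.readFully(block);
        diskReads++;
        return block;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("blockcache", ".bin");
        tmp.deleteOnExit();
        byte[] data = new byte[20000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 251);
        }
        Files.write(tmp.toPath(), data);
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "r")) {
            BlockCacheSketch cache = new BlockCacheSketch(raf, 2);
            System.out.println(cache.byteAt(8192) + " after " + cache.diskReads + " read(s)");
        }
    }
}
```

Every cached block is a copy living on the Java heap, which is the
cost a working mmap would avoid.
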
Anyway.

I'm still just amazed at how well JGit runs given these limitations.
I guess that's Moore's Law for you.  10 years ago, JGit wouldn't
have been practical.

-- 
Shawn.