Re: [JGIT] Questions about binary right shifting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Yann Simon <yann.simon.fr@xxxxxxxxx> wrote:
> I can see in the code that the signed right shifting is used.
> Could it be a problem? Or do we manipulate only positive numbers?

Heh.  We shouldn't ever be dealing with a case where there are more
than 1 billion files in the index, and thus we should never see
the expression "low + high" wrap negative, and then try to divide
by 2 here.

Binary search on a 1 billion file index would hurt, badly.  I'm not
sure what project even has 1 billion source files to checkout
on disk.  Isn't that easily a 4 TB filesystem just for the ext2
inodes of the working tree files?  :-)

Oh, and we can't even have an index with 1 billion entries in it
anyway.  DirCache.readFrom allocates a byte[] of entryCnt * 62.
So our largest number of files permitted is 34,636,833, due to
the limit on byte[] maxing out at 2 GB.  Thus high below cannot
ever be larger than 34 million, and we can't wrap.

This is also true in the Tree class, where we have the entire tree
data stream in a byte[].  Each entry needs at least 23 bytes, but
more if you include some sort of unique file name.  That really puts
a limit on the number of unique names to the point that we cannot
have entries.length so large that the binary search would overflow.

Anyway, if you tie this up into a patch with a message I'll apply,
but I don't care enough to write the message myself given how
unlikely this is.  We'd have to overhaul other code anyway before
we start to run into problems with these search functions.


> diff --git a/org.spearce.jgit/src/org/spearce/jgit/dircache/DirCache.java b/org.spearce.jgit/src/org/spearce/jgit/dircache/DirCache.java
> index 8eb4022..d70eca0 100644
> --- a/org.spearce.jgit/src/org/spearce/jgit/dircache/DirCache.java
> +++ b/org.spearce.jgit/src/org/spearce/jgit/dircache/DirCache.java
> @@ -577,7 +577,7 @@ int findEntry(final byte[] p, final int pLen) {
>  		int low = 0;
>  		int high = entryCnt;
>  		do {
> -			int mid = (low + high) >> 1;
> +			int mid = (low + high) >>> 1;
>  			final int cmp = cmp(p, pLen, sortedEntries[mid]);
>  			if (cmp < 0)
>  				high = mid;
> 
> And could we used right shifting to optimize a division per 2?
> 
> diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/Tree.java b/org.spearce.jgit/src/org/spearce/jgit/lib/Tree.java
> index 0ecd04d..ff9e666 100644
> --- a/org.spearce.jgit/src/org/spearce/jgit/lib/Tree.java
> +++ b/org.spearce.jgit/src/org/spearce/jgit/lib/Tree.java
> @@ -136,7 +136,7 @@ private static final int binarySearch(final TreeEntry[] entries,
>  		int high = entries.length;
>  		int low = 0;
>  		do {
> -			final int mid = (low + high) / 2;
> +			final int mid = (low + high) >>> 1;
>  			final int cmp = compareNames(entries[mid].getNameUTF8(), nameUTF8,
>  					nameStart, nameEnd, TreeEntry.lastChar(entries[mid]), nameUTF8last);
>  			if (cmp < 0)
> 
> Yann

-- 
Shawn.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux