Re: Object hash (was: Re: [ANNOUNCE] git-rev-size: calculate sizes of repository)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> Second, since you call hash_put() once per object, hash->nr cannot grow 
> too big, because grow_hash() doubles hash->alloc. And I call grow_hash() 
> once the hash map is half-full; Somebody once told me that would be the 
> optimal growing strategy.

Optimal growing mainly means to be O(n) (amortized) after n
inserts. That translates to at least _doubling_ (factor 2 or more) the
capacity once you're too full.

Assume doubling at a percentage full. Assume realloc(s) takes O(s)
(where s = number of bytes). Assume we start with 1 element.

We realloc() then when we've got 1 element, then at 2, 4, 8 etc. The
size of the realloc() at each point will also be 1, 2, 4, 8
etc. However, this cost of O(s) can be amortized over the number of
elements. So the work done _per insert_ is still a constant (amortized
again).

Ascilly:

   x x x x x x x x x x ...  (each insert)
     R   R       R     ...  (each realloc)
   1 2 0 4 0 0 0 8 0 0 ...  (cost of those realloc())

This has also to do with the infinite series of the sum(k>0) of 2^-k
being a constant.

-- 
Rutger Nijlunsing ---------------------------------- eludias ed dse.nl
never attribute to a conspiracy which can be explained by incompetence
----------------------------------------------------------------------
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]