Re: [PATCH 9/9] clone: run check_everything_connected

On Thu, Mar 28, 2013 at 07:40:51AM +0700, Duy Nguyen wrote:
> Maybe we could do it in index-pack to save some (wall) time. I haven't
> tried but I think it might work. The problem is to make sure the pack
> contains objects for all sha1 references in the pack. By that
> description, we don't need to do standard DAG traversal. We could
> extract sha-1 references in index-pack as we uncompress objects and
> put all "want" sha-1 in a hash table. At the end of index-pack, we
> check if any sha-1 in the hash table still points to non-existing
> object.
> 
> This way, at least we don't need to uncompress all objects again in
> rev-list. We could parse+hash in both phases in index-pack. The first
> phase (parse_pack_objects) is usually I/O bound, we could hide some
> cost there. The second phase is multithreaded, all the better.
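
A minimal sketch of the check described in the quoted text, assuming a toy model in which each object is just an id plus the list of ids it references. The names (find_missing, pack) are hypothetical; real index-pack would extract the referenced sha-1s from commit, tree and tag payloads while inflating them.

```python
def find_missing(objects):
    """Return ids that are referenced by some object but not present."""
    have = set()   # ids of objects seen in the pack
    want = set()   # ids referenced by objects in the pack
    for oid, refs in objects:
        have.add(oid)
        want.update(refs)
    # Anything still wanted at the end points to a non-existing object.
    return want - have


# Toy pack: commit c1 -> tree t1 -> blobs b1, b2; b2 is absent.
pack = [
    ("c1", ["t1"]),
    ("t1", ["b1", "b2"]),
    ("b1", []),
]
print(sorted(find_missing(pack)))  # ['b2']
```

No DAG traversal is needed here: one pass over the objects in any order is enough, which is why the check could piggyback on the passes index-pack already makes.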

It looks like what I described above is exactly what index-pack
--strict does, except that it holds the lock longer and has more
abstraction layers that slow things down. On linux-2.6 with 3 threads:

$ rev-list --all --objects --quiet (aka check_everything_connected)
34.26user 0.22system 0:34.56elapsed 99%CPU (0avgtext+0avgdata 2550528maxresident)k
0inputs+0outputs (0major+208569minor)pagefaults 0swaps

$ index-pack --stdin
214.57user 8.38system 1:31.82elapsed 242%CPU (0avgtext+0avgdata 1357328maxresident)k
8inputs+1421016outputs (0major+1222537minor)pagefaults 0swaps

$ index-pack --stdin --strict
297.36user 13.77system 2:11.82elapsed 236%CPU (0avgtext+0avgdata 1875040maxresident)k
0inputs+1421016outputs (0major+1308718minor)pagefaults 0swaps

$ index-pack --stdin --connectivity
231.09user 7.42system 1:37.39elapsed 244%CPU (0avgtext+0avgdata 2080816maxresident)k
0inputs+1421016outputs (0major+540069minor)pagefaults 0swaps

The last one avoids holding locks by duplicating the object hash table
per thread. As you can see, it consumes much more memory than --stdin
(2080816k vs 1357328k max resident). In return, it adds only about 1/3
of the rev-list time.
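
One way to read the per-thread duplication, sketched under the same toy model of (id, referenced ids) pairs: each worker fills its own tables without taking any lock, and the tables are merged once at the end. This is an illustration only, not the code behind the experimental --connectivity run; scan_chunk and find_missing_parallel are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_chunk(chunk):
    """Build private have/want tables for one worker; no shared state."""
    have, want = set(), set()
    for oid, refs in chunk:
        have.add(oid)
        want.update(refs)
    return have, want


def find_missing_parallel(objects, nr_threads=3):
    # Split the objects round-robin so each worker gets its own chunk.
    chunks = [objects[i::nr_threads] for i in range(nr_threads)]
    with ThreadPoolExecutor(max_workers=nr_threads) as pool:
        results = list(pool.map(scan_chunk, chunks))
    # Merge the per-thread tables single-threaded at the end.
    have, want = set(), set()
    for h, w in results:
        have |= h
        want |= w
    return want - have


objects = [("c1", ["t1"]), ("t1", ["b1", "b2"]), ("b1", [])]
print(sorted(find_missing_parallel(objects)))  # ['b2']
```

The trade-off matches the numbers above: every thread holds its own table, so memory grows with the thread count, but no lock is needed while objects are being processed.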

Maybe you should check which one is cheaper for the clone case:
check_everything_connected() or index-pack --strict.
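
For a rough comparison, the elapsed times above can be added up as follows. This assumes that the clone case costs index-pack --stdin followed by a separate rev-list run, that the wall times simply add, and that these single measurements are representative; none of that is verified here.

```python
def to_seconds(elapsed):
    """Convert a time(1) 'm:ss.ss' elapsed value to seconds."""
    minutes, seconds = elapsed.split(":")
    return int(minutes) * 60 + float(seconds)


# Elapsed (wall) times copied from the measurements above.
elapsed = {
    "rev-list": to_seconds("0:34.56"),
    "index-pack --stdin": to_seconds("1:31.82"),
    "index-pack --stdin --strict": to_seconds("2:11.82"),
    "index-pack --stdin --connectivity": to_seconds("1:37.39"),
}

# Assumed clone cost: plain index-pack, then check_everything_connected.
two_step = elapsed["index-pack --stdin"] + elapsed["rev-list"]

print(f"index-pack + rev-list: {two_step:.2f}s")
print(f"--strict:              {elapsed['index-pack --stdin --strict']:.2f}s")
print(f"--connectivity:        {elapsed['index-pack --stdin --connectivity']:.2f}s")
```

On these figures the two-step approach (126.38s) and --strict (131.82s) are close in wall time, while --connectivity (97.39s) is ahead of both.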
--
Duy