Re: [PATCH 6/7] pack-objects: allow for early delta deflating

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Nicolas Pitre wrote:
When the delta data is cached in memory until it is written to a pack
file on disk, it is best to compress it right away in find_deltas() for
the following reasons:

  - we have to compress that data anyway;

  - this allows for caching more deltas with the same cache size limit;

  - compression is potentially threaded.

This last point is especially relevant for SMP run time.  For example,
repacking the Linux repo on a quad core processor using 4 threads with
all default settings produce the following results before this change:

	real    2m27.929s
	user    4m36.492s
	sys     0m3.091s

And with this change applied:

	real    2m13.787s
	user    4m37.486s
	sys     0m3.159s

So the actual execution time stayed more or less the same but the
wall clock time is shorter.

This is however not a good thing to do when generating a pack for
network transmission.  In that case, the network is most likely to
throttle the data throughput, so it is best to make find_deltas()
faster in order to start writing data ASAP since we can afford
spending more time between writes to compress the data
at that point.

[...]

+ /*
+		 * If we decided to cache the delta data, then it is best
+		 * to compress it right away.  First because we have to do
+		 * it anyway, and doing it here while we're threaded will
+		 * save a lot of time in the non threaded write phase,
+		 * as well as allow for caching more deltas within
+		 * the same cache size limit.
+		 * ...
+		 * But only if not writing to stdout, since in that case
+		 * the network is most likely throttling writes anyway,
+		 * and therefore it is best to go to the write phase ASAP
+		 * instead, as we can afford spending more time compressing
+		 * between writes at that moment.
+		 */
+		if (entry->delta_data && !pack_to_stdout) {
+			entry->z_delta_size = do_compress(&entry->delta_data,
+							  entry->delta_size);
+			cache_lock();
+			delta_cache_size -= entry->delta_size;
+			delta_cache_size += entry->z_delta_size;
+			cache_unlock();
+		}
+
 		/* if we made n a delta, and if n is already at max
 		 * depth, leaving it in the window is pointless.  we
 		 * should evict it first.

Although I like the idea of changing the behavior if the output is likely to be throttled, I do not like the test for that condition being "is it going to stdout". This is something better suited to a command line argument.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux