RE: git --archive

> "brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx> writes:
>
>> Maybe they can technically be stored in any order, but people don't 
>> want git archive to produce non-deterministic archives...
>> ...  I feel like it would be very difficult to achieve the speedups 
>> you want and still produce a deterministic archive.
>
> I am not going to work on it myself, but I think the only possible parallelism would come from making the reading for F(n+1) and subsequent objects overlap writing of F(n), given a deterministic order of files in the resulting archive.  When we decide which file should come first, and learn that it is F(0), it probably comes from the tree object at the root level, and it is very likely that we would already know what F(1) and F(2) are by that time, so it should be possible to dispatch reading and applying content filtering on F(1) and keep the result in core while we are still writing F(0) out.
>
> Thanks.

Yes. But even before any changes to the actual tree traversal, which collects the objects one by one as it does today, a "simple" parallelized, recursive walk over all objects should help. Such a walk would pseudo-randomly read a fraction of the data (mostly directories, but also files) so that all the (externally) cached inode metadata gets updated. As long as this stage is highly parallelized, its cost (in time) would be recovered by a much faster single-threaded tree recursion, exactly as it exists currently.
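
To make the warm-up idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: it treats the data as a plain directory tree on a filesystem, and the names warm_tree, workers and sample_bytes are hypothetical, not anything that exists in git. It walks the tree level by level with a thread pool, stats every entry and reads a small prefix of each file, and discards the data.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def warm_tree(root, workers=16, sample_bytes=4096):
    """Touch every entry under `root` in parallel to warm metadata caches.

    Hypothetical helper: returns the number of directory entries visited.
    Nothing is kept; the only goal is to make a later serial walk cheaper.
    """

    def visit(path):
        subdirs, count = [], 0
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    count += 1
                    try:
                        if entry.is_dir(follow_symlinks=False):
                            subdirs.append(entry.path)
                        elif entry.is_file(follow_symlinks=False):
                            entry.stat(follow_symlinks=False)
                            # Read only a small prefix of the file.
                            with open(entry.path, "rb") as f:
                                f.read(sample_bytes)
                    except OSError:
                        continue  # unreadable entry: skip it
        except OSError:
            pass  # unreadable directory: skip it
        return subdirs, count

    total = 0
    frontier = [root]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Process one directory level at a time; directories within a
        # level are visited concurrently.
        while frontier:
            next_frontier = []
            for subdirs, count in pool.map(visit, frontier):
                next_frontier.extend(subdirs)
                total += count
            frontier = next_frontier
    return total
```

The order in which entries are touched here does not matter, because the later single-threaded traversal still decides the order of the archive.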

That is not to say that the method above wouldn't be a further significant improvement 😊
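
For comparison, the overlap described in the quoted message (reading and filtering F(n+1) and later files while F(n) is being written) could look roughly like the sketch below. Again this is an assumption-laden illustration, not git's code: archive_in_order, read_and_filter, write and lookahead are hypothetical names, and the callbacks stand in for whatever actually reads an object and appends it to the archive.

```python
import itertools
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def archive_in_order(names, read_and_filter, write, lookahead=2):
    """Write files strictly in the order given by `names`.

    Reads run ahead on worker threads; writes happen on the calling
    thread only, so the output order stays deterministic.
    """
    if lookahead < 1:
        raise ValueError("lookahead must be at least 1")
    it = iter(names)
    pending = deque()
    with ThreadPoolExecutor(max_workers=lookahead) as pool:
        # Start reading the first `lookahead` files.
        for name in itertools.islice(it, lookahead):
            pending.append((name, pool.submit(read_and_filter, name)))
        while pending:
            name, future = pending.popleft()
            data = future.result()  # waits only if this read is unfinished
            # Dispatch the next read before writing, so that it overlaps
            # with the write of the current file.
            for nxt in itertools.islice(it, 1):
                pending.append((nxt, pool.submit(read_and_filter, nxt)))
            write(name, data)
```

At most `lookahead` results are held in core at any time, which bounds memory use at the price of limiting how far the reads can run ahead.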



