Re: [PATCH v2 2/3] fast-export: improve speed by skipping blobs

Jeff King <peff@xxxxxxxx> writes:

> On Sun, May 05, 2013 at 05:38:53PM -0500, Felipe Contreras wrote:
>
>> We don't care about blobs, or any object other than commits, but in
>> order to find the type of object, we are parsing the whole thing, which
>> is slow, specially in big repositories with lots of big files.
>
> I did a double-take on reading this subject line and first paragraph,
> thinking "surely fast-export needs to actually output blobs?".
>
> Reading the patch, I see that this is only about not bothering to load
> blob marks from --import-marks. It might be nice to mention that in the
> commit message, which is otherwise quite confusing.

I had the same reaction first, but not writing the blob _objects_
out to the output stream would not make any sense, so it was fairly
easy to guess what the author wanted to say ;-).

> I'm also not sure why your claim "we don't care about blobs" is true,
> because naively we would want future runs of fast-export to avoid having
> to write out the whole blob content when mentioning the blob again.

The existing documentation is fairly clear that marks for objects
other than commits are not exported, and the import-marks codepath
discards anything but commits, so there is no mechanism for the
existing fast-export users to leave blob marks in the marks file for
later runs of fast-export to take advantage of.  The second
invocation cannot refer to such a blob in the first place.
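
To make the above concrete, here is a rough sketch (in Python, not the
actual C code in fast-export) of an import-marks reader that keeps only
commits. It assumes the marks file has one ":<mark> <object id>" entry
per line; load_commit_marks(), the object_type callback and the short
fake object ids are made up for illustration. The callback stands in for
whatever cheap type lookup is available, so that non-commits can be
skipped without reading their contents.

```python
from typing import Callable, Dict, Iterable


def load_commit_marks(
    lines: Iterable[str],
    object_type: Callable[[str], str],
) -> Dict[int, str]:
    """Return {mark number: object id} for entries that name commits."""
    marks: Dict[int, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        mark, object_id = line.split(None, 1)
        if not mark.startswith(":"):
            raise ValueError(f"corrupt mark line: {raw!r}")
        # Ask only for the type; blobs, trees and tags are dropped
        # here without ever loading their contents.
        if object_type(object_id) != "commit":
            continue
        marks[int(mark[1:])] = object_id
    return marks


# Hypothetical data: a tiny marks file and a fake type table.
FAKE_TYPES = {"1111": "commit", "2222": "blob", "3333": "commit"}
MARKS_FILE = [":1 1111\n", ":2 2222\n", ":3 3333\n"]

kept = load_commit_marks(MARKS_FILE, FAKE_TYPES.get)
```

With the fake data, mark :2 names a blob and is dropped, so a later
run has no way to refer to it.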

The story is different on the fast-import side, where we do say we
dump the full table and a later run can depend on these marks.

By discarding marks on blobs, we may be robbing some optimization
possibilities, and by discarding marks on tags, we may be robbing
some features, from users of fast-export; we might want to add an
option "--use-object-marks={blob,commit,tag}" or something to both
fast-export and fast-import, so that the former can optionally write
marks for non-commits out, and the latter can omit non-commit marks
if the user does not need them. But that is a separate issue.
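
As a sketch of what such an option might look like (the option does not
exist, and everything here is hypothetical), the Python below assumes
its value is a comma-separated subset of blob, commit and tag, and that
each mark entry carries its object type. The names parse_object_types()
and select_marks() are invented for this example.

```python
VALID_TYPES = frozenset({"blob", "commit", "tag"})


def parse_object_types(value):
    """Parse a value such as "commit,tag" into a set of type names."""
    types = {t.strip() for t in value.split(",") if t.strip()}
    unknown = types - VALID_TYPES
    if unknown:
        raise ValueError(
            "unknown object type(s): " + ", ".join(sorted(unknown))
        )
    return types


def select_marks(entries, allowed):
    """Keep (mark, object_id) pairs whose type is in the allowed set.

    Each entry is a (mark, object_id, object_type) tuple.
    """
    return [(mark, oid) for mark, oid, otype in entries if otype in allowed]


# Hypothetical entries with fake object ids.
ENTRIES = [(1, "1111", "commit"), (2, "2222", "blob"), (3, "3333", "tag")]

selected = select_marks(ENTRIES, parse_object_types("commit,tag"))
```

An exporter could apply the same filter when writing marks, and an
importer when reading them, so both sides agree on which types survive.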