Re: [PATCH 4/4] fast-import: only store commit objects

On 05/06/2013 12:32 PM, Thomas Rast wrote:
> Michael Haggerty <mhagger@xxxxxxxxxxxx> writes:
> 
>> On 05/03/2013 08:23 PM, Felipe Contreras wrote:
>>> On Fri, May 3, 2013 at 12:56 PM, Thomas Rast <trast@xxxxxxxxxxx> wrote:
>>>> Felipe Contreras <felipe.contreras@xxxxxxxxx> writes:
>>>
>>>> How do we know that this doesn't break any users of fast-import?  Your
>>>> comment isn't very reassuring:
>>>>
>>>>> the vast majority of them will never be used again
>>>>
>>>> So what's with the minority?
>>>
>>> Actually I don't think there's any minority. If the client program
>>> doesn't store blobs, the blob marks are not used anyway. So there's no
>>> change.
>>
>> I haven't been following this conversation in detail, but your proposed
>> change sounds like something that would break cvs2git [1].  Let me
>> explain what cvs2git does and why:
>>
>> CVS stores all of the revisions of a single file in a single filename,v
>> file in rcsfile(5) format.  The revisions are stored as deltas ordered
>> so that a single revision can be reconstructed from a single serial read
>> of the file.
>>
>> cvs2git reads each of these files once, reconstructing *all* of the
>> revisions for a file in a single go.  It then pours them into a
>> git-fast-import stream as blobs and sets a mark on each blob.
>>
>> Only much later in the conversion does it have enough information to
>> reconstruct tree-wide commits.  At that time it outputs git-fast-import
>> data (to a second file) defining the git commits and their ancestry.
>> The contents are defined by referring to the marks of blobs from the
>> first git-fast-import stream file.
>>
>> This strategy speeds up the conversion *enormously*.
>>
>> So if I understand correctly that you are proposing to stop allowing
>> marks on blob objects to be set and/or referred to later, then I object
>> vociferously.
> 
> The proposed patch wants to stop writing marks (in --export-marks) for
> anything but commits.  Does cvs2git depend on that?  I.e., are you using
> two separate fast-import processes for the blob and tree/commit phases
> you describe above?

Yes, it can be handy to start loading the first "blobfile" in parallel
with the later stages of the conversion, before the second "dumpfile" is
ready.  In that case the user needs to pass --export-marks to the first
fast-import process to export marks on blobs so that the marks can be
passed to the second fast-import via --import-marks.
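
To make the two-stream workflow concrete, here is a minimal Python sketch of what the two files might contain. It is not cvs2git's actual code: the helper names, the file path, the committer identity, and the mark numbers are all made up for illustration. Only the fast-import commands themselves (blob, mark, data, commit, committer, M) follow the documented stream format.

```python
# Illustrative sketch only; not taken from cvs2git.

def blob_record(mark, content):
    """Return one fast-import 'blob' command that sets mark :<mark>."""
    return b"blob\nmark :%d\ndata %d\n%s\n" % (mark, len(content), content)


def commit_record(ref, mark, committer, timestamp, message, files):
    """Return one fast-import 'commit' command.

    `files` maps a path to the mark of a blob that was defined earlier,
    possibly in a different stream loaded by a different process.
    """
    out = [
        b"commit %s\n" % ref.encode(),
        b"mark :%d\n" % mark,
        b"committer %s %d +0000\n" % (committer.encode(), timestamp),
        b"data %d\n%s\n" % (len(message), message),
    ]
    for path, blob_mark in sorted(files.items()):
        # The file content is given only as a mark reference, not inline.
        out.append(b"M 100644 :%d %s\n" % (blob_mark, path.encode()))
    out.append(b"\n")
    return b"".join(out)


# First stream ("blobfile"): every revision of a file, each with a mark.
blobfile = blob_record(1, b"hello\n") + blob_record(2, b"hello, world\n")

# Second stream ("dumpfile"), written much later: commits whose contents
# refer back to the blob marks from the first stream.
dumpfile = commit_record(
    "refs/heads/master",
    3,
    "A U Thor <author@example.com>",  # hypothetical identity
    1367836800,
    b"Add greeting\n",
    {"greeting.txt": 2},
)
```

If the two streams are fed to separate fast-import processes, the blob marks have to survive between them, for example (file names hypothetical):

    git fast-import --export-marks=blob-marks.txt <blobfile
    git fast-import --import-marks=blob-marks.txt <dumpfile

With a marks file that lists only commits, the reference to :2 in the second stream could not be resolved.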

So the proposed change would break a documented use of cvs2git.

Making the export of blob marks optional would of course be OK, as long
as the default is to export them.

Michael


-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/