Re: [PATCH 4/5] fast-export: Introduce --inline-blobs

Hi,

Jonathan Nieder writes:
> Junio C Hamano wrote:
> > Ramkumar Ramachandra <artagnon@xxxxxxxxx> writes:
> 
> >> Introduce a new command-line option --inline-blobs that always inlines
> >> blobs instead of referring to them via marks or their original SHA-1
> >> hash.
> [...]
> > Hmm, this smells somewhat fishy.
> >
> > Wasn't G-F-I designed to be a common stream format for other SCMs to
> > generate streams, so that importers and exporters can be written once for
> > each SCM to interoperate?
> 
> Here is one way to sell it:
> 
> 	With the inline blobs feature, fast-import backends have to
> 	maintain less state.  Using it should speed up exporting.
> 
> 	This is made optional because ...
> 
> I haven't thought through whether it ought to be optional or measured
> the effect on import performance.

It greatly simplifies other fast-import backends, because persisting
blobs can be complicated and expensive. I was thinking of making
svn-fe support both inlined blobs and blobs referenced by marks. When
inlined blobs make the cheap path possible, why not offer them as an
option? The filter we develop later can handle older fast-import
streams that don't have inlined blobs.
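
To make the two forms concrete, here is a rough Python sketch of how a
filemodify line looks in each case. The function names are made up for
illustration and are not part of fast-export or svn-fe; only the stream
syntax ("M <mode> :<mark> <path>" versus "M <mode> inline <path>"
followed by a data command) comes from the fast-import format.

```python
def filemodify_by_mark(mode: str, mark: int, path: str) -> bytes:
    """Emit a filemodify line that refers to a previously sent blob by mark.

    The consumer must remember the blob sent earlier under ':<mark>'.
    (Hypothetical helper, for illustration only.)
    """
    return f"M {mode} :{mark} {path}\n".encode()


def filemodify_inline(mode: str, path: str, content: bytes) -> bytes:
    """Emit a filemodify line that carries the blob content inline.

    The consumer needs no blob state: the content follows immediately
    in an exact-byte-count 'data' command. The trailing LF after the
    payload is the optional LF the format permits.
    (Hypothetical helper, for illustration only.)
    """
    header = f"M {mode} inline {path}\n".encode()
    return header + b"data %d\n" % len(content) + content + b"\n"
```

With marks, a backend has to keep every marked blob around until some
later commit refers to it; with the inline form it can consume the
content at the point of use and forget it.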

On a related note, does it make sense to version our fast-import
stream format? It's certainly going to keep evolving with time, and we
need backward compatibility.

> A separate question is what an svn fast-import backend should do with
> all those blobs that are not ready to be written to dump.  As a hack
> while prototyping, one can rely on the "current" fast-export output,
> even though that is not flexible or futureproof.  Longer term, the
> following sounds very interesting

Good point. The functionality to persist blobs that are referenced by
marks probably shouldn't be in svn-fe at all.

> > Just thinking aloud, but is it possible to write a filter that converts an
> > arbitrary G-F-I stream with referenced blobs into a G-F-I stream without
> > referenced blobs by inlining all the blobs?
> 
> to avoid complexity in the svn fast-import backend itself.
> (Complicating detail: such a filter would presumably take responsibility
> for --export-marks, so it might want a way to retrieve commit marks
> from its downstream.)

This filter will need to persist every blob for the entire lifetime of
the program. We can't possibly do that in memory, so we have to find
some way to persist them on disk and retrieve them very
quickly. Jonathan suggested using something like Tokyo Cabinet earlier;
I'll start working and see what I come up with.
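
As a starting point, such a filter might look roughly like the Python
sketch below. It is only an illustration under several assumptions:
sqlite3 stands in for a key-value store like Tokyo Cabinet, only the
exact-byte-count "data <n>" form is handled (not the delimited form),
blobs referenced by SHA-1 are passed through untouched, and
--export-marks is ignored entirely. BlobStore and inline_blobs are
hypothetical names.

```python
import sqlite3


class BlobStore:
    """Disk-backed mark -> blob map (sqlite3 as a stand-in key-value store)."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS blobs (mark INTEGER PRIMARY KEY, data BLOB)"
        )

    def put(self, mark, data):
        self.db.execute("INSERT OR REPLACE INTO blobs VALUES (?, ?)", (mark, data))

    def get(self, mark):
        row = self.db.execute(
            "SELECT data FROM blobs WHERE mark = ?", (mark,)
        ).fetchone()
        if row is None:
            raise KeyError(mark)
        return bytes(row[0])


def read_data(inp, header):
    """Read the payload of an exact-byte-count 'data <n>' command."""
    count = int(header.split()[1])
    return inp.read(count)


def inline_blobs(inp, out, store):
    """Copy a fast-import stream from inp to out, inlining marked blobs."""
    skip_lf = False  # True right after a dropped blob command
    while True:
        line = inp.readline()
        if not line:
            break
        if skip_lf:
            skip_lf = False
            if line == b"\n":
                continue  # optional LF that followed the dropped blob
        if line == b"blob\n":
            # Stash the blob instead of forwarding it.
            mark = None
            line = inp.readline()
            if line.startswith(b"mark :"):
                mark = int(line[6:].strip())
                line = inp.readline()
            payload = read_data(inp, line)
            if mark is not None:
                store.put(mark, payload)  # unmarked blobs are dropped here
            skip_lf = True
        elif line.startswith(b"data "):
            # Commit/tag message: forward the raw bytes unparsed.
            out.write(line)
            out.write(read_data(inp, line))
        elif line.startswith(b"M "):
            parts = line.rstrip(b"\n").split(b" ", 3)
            if len(parts) == 4 and parts[2].startswith(b":"):
                payload = store.get(int(parts[2][1:]))
                out.write(b"M " + parts[1] + b" inline " + parts[3] + b"\n")
                out.write(b"data %d\n" % len(payload))
                out.write(payload)
                out.write(b"\n")  # optional LF after the data payload
            else:
                out.write(line)  # already inline, or referenced by SHA-1
        else:
            out.write(line)
```

The blobs table grows for the whole run, which is exactly the cost
described above; a real filter would also need to deal with marks
handed back from downstream.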

-- Ram