Re: [PATCH 2/5] fast-export: Introduce --inline-blobs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

Ramkumar Ramachandra wrote:

> Introduce a new command-line option --inline-blobs that always inlines
> blobs instead of referring to them via marks or their original SHA-1
> hash.

Could you expand on this?  What does it mean to inline a blob (is it the
same thing as the reference manual describes as using the inline data
format with the filemodify command)?  Why do we want to always do
that?  How would a person or script choose whether to use this option?
Are there any downsides?

I ask because I would be happy to use something like this. ;-)  Thanks
for working on it.

> --- a/Documentation/git-fast-export.txt
> +++ b/Documentation/git-fast-export.txt
> @@ -90,6 +90,11 @@ marks the same across runs.
>  	resulting stream can only be used by a repository which
>  	already contains the necessary objects.
>  
> +--inline-blobs::
> +	Inline all blobs, instead of referring to them via marks or
> +	their original SHA-1 hash.  This is useful to parsers, as they
> +	don't need to persist blobs.

This explanation leaves something out, I think.  If it is useful to
parsers, that means it simplifies the syntax, so a natural conclusion
would be that parsers do not need to learn about the non-inline
syntax.  But I think the last time[1] this came up, we decided that
one should not encourage that, because it moves away from a world in
which "git fast-export", "hg fast-export", and svn-fe use one standard
format and can be used interchangeably.

[1] http://thread.gmane.org/gmane.comp.version-control.git/165237/focus=165289

Perhaps it would be better to say something to the effect that "This
can decrease the memory footprint and complexity of the work some
fast-import backends have to do"?  In other words, it's just an
optimization.

To that end, if the same blob keeps on coming up again and again, I'd
be tempted to allow making a mark for it in the future, even when
--inline-blobs is specified.  In other words, I'd prefer (unless
real-world considerations prevent it) for --inline-blobs to be a hint
or a permission instead of something binding.

> --- a/builtin/fast-export.c
> +++ b/builtin/fast-export.c
[...]
> @@ -118,7 +119,7 @@ static void handle_object(const unsigned char *sha1)
>  	char *buf;
>  	struct object *object;
>  
> -	if (no_data)
> +	if (no_data || inline_blobs)
>  		return;

Maybe inline_blobs should imply no_data internally?

>  
>  	if (is_null_sha1(sha1))
> @@ -218,7 +219,23 @@ static void show_filemodify(struct diff_queue_struct *q,
>  			if (no_data || S_ISGITLINK(spec->mode))
>  				printf("M %06o %s %s\n", spec->mode,
>  				       sha1_to_hex(spec->sha1), spec->path);
> -			else {
> +			else if (inline_blobs) {

If so, this could be something like

	int inline_me = inline_blobs && !S_ISGITLINK(spec->mode);
	...

	if (no_data || S_ISGITLINK(spec->mode)) {
		const char *dataref =
			inline_me ? "inline" : sha1_to_hex(spec->sha1);
		printf("M %06o %s %s\n", spec->mode, dataref, spec->path);
	} else {
		struct object *object = lookup_object(spec->sha1);
		printf("M %06o :%d %s\n", spec->mode,
		       get_object_mark(object), spec->path);
	}

	if (inline_blob && export_data(spec->data, spec->size))
		die_errno("Could not write blob '%s'",
				sha1_to_hex(spec->sha1));

> --- a/contrib/svn-fe/.gitignore
> +++ b/contrib/svn-fe/.gitignore
[...]
> --- a/contrib/svn-fe/Makefile
> +++ b/contrib/svn-fe/Makefile
[...]
> --- /dev/null
> +++ b/contrib/svn-fe/svn-fi.c
[...]
> --- /dev/null
> +++ b/contrib/svn-fe/svn-fi.txt

Snuck in from another patch?

Hope that helps,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]