Re: [RFC/PATCH 0/7] user-configurable git-archive output formats

On Sat, Jun 18, 2011 at 04:52:02PM +0200, René Scharfe wrote:

> >> The gzip path is not configurable at all. Probably it should read the
> >> path and arguments from the config file. In fact, we could even allow
> >> arbitrary config like:
> >>
> >>   [tarfilter "tgz"]
> >>     command = gzip -c
> >>     extension = tgz
> >>     extension = tar.gz
> 
> Configuration options whose values are appended instead of overwritten
> by duplicate definitions are a new concept for git, I think.  Perhaps
> it's not a big thing, but I think it's better avoided.
> 
> The only (stupid) practical shortcoming I can think of is this, though:
> You can't remove anything from the list of supported extensions in a
> user config if the system config already contains e.g. tgz and tar.gz.

Yeah, I have mixed feelings on that.

As Jakub pointed out, we already have multi-valued options in several
places. I don't know that removal is that big a deal in this instance.
If we did want to support it, I think it would make more sense to have a
generic solution at the config level, like:

  [some-section]
    multivalue = foo
    multivalue = bar
    !multivalue
    multivalue = baz
    multivalue = whee

at which point the value is ("baz", "whee"). That matches what we do on
the command line, where:

  git foo --multivalue=foo --multivalue=bar --no-multivalue \
          --multivalue=baz --multivalue=whee

handles the same issue in a similar way.
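
To make the reset semantics concrete, here is a minimal sketch in Python
(not git code). The RESET marker and resolve_multivalue() are
hypothetical names standing in for a bare "!multivalue" line and for
whatever the config parser would do with it:

```python
# Hypothetical marker standing in for a bare "!multivalue" config line
# (or "--no-multivalue" on the command line).
RESET = object()


def resolve_multivalue(entries):
    """Fold a sequence of values and reset markers into the final list.

    Each plain value is appended; a RESET discards everything seen so far.
    """
    values = []
    for entry in entries:
        if entry is RESET:
            values = []
        else:
            values.append(entry)
    return values


# Mirrors the example above: foo, bar, reset, baz, whee.
print(resolve_multivalue(["foo", "bar", RESET, "baz", "whee"]))
# prints ['baz', 'whee']
```

Because the reset is just another entry in the stream, a user config
read after the system config could start with a reset and then list only
the values it wants.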

The other option, of course, is having a single value with list
semantics. But then you have to invent separator syntax. In this
instance whitespace would probably be fine, but I'd rather that each new
multi-valued option did not invent its own syntax, and in the general
case you may need to handle quoting. Plus you may need some kind of
append syntax. For example, if we support "tgz" and "tar.gz" internally,
how do you say 'add "pax.gz"' to that list without reiterating the whole
list?

> The pax format is identical to the ustar format, which --format=tar
> produces.  The other major format that comes to mind is cpio.  The
> (never merged) predecessor of tar-tree actually used that format.

Thanks, cpio is probably the most likely example.

> Since then I have been waiting for users to request being able to export
> using cpio format (which is simpler and slightly smaller than tar), but
> that never happened.  It seems the existence of the pax format really
> has pacified the tar vs. cpio war of old.

Fair enough. I haven't heard anybody clamoring for it either. I just
didn't want to paint us into a corner. Since it seems like the most
likely format and nobody really wants it, it's perhaps not worth
worrying about.

> I'm not sure "filter" is a good name, though.  We have core.pager, which
> is technically a filter as well, but for a specific purpose.

Yeah, any name would have to be "archive filter" or similar. But I would
think being under the "tar" section would be enough to disambiguate it.

> And we have the tar.umask setting as a precedent for format-specific
> config options.  So how about tar.<extension>.compressor?
> 
> 	[tar "tgz"]
> 		compressor = gzip -cn
> 	[tar "tar.gz"]
> 		compressor = gzip -cn
> 	[tar "tar.bz2"]
> 		compressor = bzip2 -c

My two complaints are:

  1. The user has to repeat themselves in describing the command for
     multiple extensions. In practice, that's probably not a big deal,
     though.

  2. The namespace for user-defined extensions is the same as the
     namespace for tar options. I guess we can disambiguate based on the
     number of dots (so, e.g., I know that "tar.umask" is not the umask
     extension, because it doesn't have a third component). It does
     limit us a little bit for adding future options.

     I don't know if it's worth caring about. We have the same problem
     with the diff.* namespace (e.g., diff.color.* exists, but is not a
     userdiff driver). In that case, besides the code being a little
     careful to be tolerant of the clash, I don't think it has been a
     problem.
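
The "count the dots" disambiguation could look roughly like the sketch
below. It is Python for illustration only, not how git's config code is
written; classify_tar_key() is a hypothetical helper. It assumes the
usual key layout where the first component is the section, the last is
the variable, and anything in between (dots included, as in "tar.gz") is
the subsection:

```python
def classify_tar_key(key):
    """Classify a config key in the tar.* namespace.

    Returns ("option", name) for two-component keys such as tar.umask,
    ("extension", ext, variable) for keys with a subsection such as
    tar.tar.gz.compressor, and None for keys outside the tar section.
    """
    parts = key.split(".")
    if len(parts) < 2 or parts[0] != "tar":
        return None
    if len(parts) == 2:
        # No subsection: a plain tar option.
        return ("option", parts[1])
    # Everything between the section and the variable is the extension.
    return ("extension", ".".join(parts[1:-1]), parts[-1])


print(classify_tar_key("tar.umask"))
# prints ('option', 'umask')
print(classify_tar_key("tar.tar.gz.compressor"))
# prints ('extension', 'tar.gz', 'compressor')
```

The limitation mentioned above shows up here too: any future
three-component tar option would be read as a user-defined extension.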

> We don't need a compressionlevels option here because we can simply
> assume that the compressor commands do support them.

But we discussed elsewhere the concept of a tar-to-7z filter. I'm not
sure I'd call that a "compressor" as much as a filter. And it wouldn't
want the compression-level options as-is (or maybe it would; I don't use
it, but skimming the manpage, it looks like you would want to convert -5
into "-mx=5", so maybe you would want a wrapper script anyway).
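
The option translation such a wrapper would need is small. A sketch in
Python, covering only the -N to "-mx=N" rewrite mentioned above;
translate_level() is a hypothetical name, and the rest of the 7z
invocation is deliberately left out because I haven't checked it:

```python
import re


def translate_level(args):
    """Rewrite gzip-style level flags (-0 .. -9) as "-mx=N".

    Arguments that are not a single-digit level flag pass through
    unchanged.
    """
    translated = []
    for arg in args:
        match = re.fullmatch(r"-([0-9])", arg)
        if match:
            translated.append("-mx=" + match.group(1))
        else:
            translated.append(arg)
    return translated


print(translate_level(["-5"]))
# prints ['-mx=5']
```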

> (Side note: this is not fully true for bzip2, as it doesn't support
> -0, but I don't think this is worth special consideration in our code,
> as long as errors of the filter are displayed properly.)

Yeah, I think that can be ignored. bzip2 can take care of complaining
itself.

> And we can also add a config option to restrict the formats creatable by
> upload-archive, to address concerns over DoS attacks with expensive
> compressors:
> 
> 	[archive]
> 		remoteFormats = tar zip tgz tar.gz

Right. It does have the ad-hoc list syntax I complained about above,
though.
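
For comparison, the check implied by a whitespace-separated
remoteFormats value is sketched below in Python. The function name is
hypothetical, and it assumes format names never contain whitespace,
which is exactly the kind of per-option rule I'd rather not keep
inventing:

```python
def remote_format_allowed(config_value, requested):
    """Return True if `requested` appears in a whitespace-separated list.

    `config_value` is the raw string from the config, e.g. the value of
    archive.remoteFormats in the example above.
    """
    allowed = config_value.split()
    return requested in allowed


print(remote_format_allowed("tar zip tgz tar.gz", "tgz"))
# prints True
print(remote_format_allowed("tar zip tgz tar.gz", "tar.bz2"))
# prints False
```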

-Peff