Re: [PATCH v1] convert: add support for 'encoding' attribute

Lars Schneider <larsxschneider@xxxxxxxxx> · Wed, 3 Jan 2018 21:45:13 +0100

On 03 Jan 2018, at 20:15, Junio C Hamano <gitster@xxxxxxxxx> wrote:

> Torsten Bögershausen <tboegi@xxxxxx> writes:
> 
>> Maybe.
>> Originally utf8.c was about encoding and all kinds of UTF-8 related stuff.
>> In particular, it didn't know anything about strbuf.
>> I don't know why strbuf.h and other functions were added here.
>> 
>> I once moved them into strbuf.c without any problems, but never sent out
>> a patch, because of possible merge conflicts with ongoing patches.
>> 
>> In any case, if it is about strbuf, I would try to put it into strbuf.c.
> 
> Please don't.
> 
> Code that happens to use strbuf as a container and is about
> manipulating the contents is quite different from code about
> strbuf.  The latter is to enhance and extend how the strbuf as a
> container behaves.  An operation about character encoding for a
> string that happens to be stored in strbuf is more about the
> encoding, and much much less about strbuf.
> 
> convert.c is about massaging contents coming from the outside world
> into a shape stored in Git and the other way around, and there are
> multiple ways the contents are massaged.  EOL convention may be
> adjusted, characters may be reencoded, end-user defined conversion
> may be applied.  Some of these operations may use helpers specific
> for the task from other more library-ish files, like checking if a
> string looks like it is encoded in UTF-8 from utf8.[ch].

Agreed. I did that in v2. See these patches:

https://public-inbox.org/git/20171229152222.39680-3-lars.schneider@xxxxxxxxxxxx/
https://public-inbox.org/git/20171229152222.39680-4-lars.schneider@xxxxxxxxxxxx/
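
To illustrate the distinction with a rough sketch (this is not the code
from the patches above; `struct buf`, `buf_add`, `bytes_look_like_utf8`
and `buf_looks_like_utf8` are hypothetical names, and the container is a
toy stand-in rather than Git's strbuf): the first function is about the
container, the other two are about the encoding and merely receive their
input in a container.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a strbuf-like container (NOT Git's strbuf). */
struct buf {
	char *data;
	size_t len, alloc;
};

/*
 * "About the container": grows the storage and appends bytes.
 * Knows nothing about what the bytes mean. Returns 0 on success.
 */
static int buf_add(struct buf *b, const void *src, size_t n)
{
	if (b->len + n + 1 > b->alloc) {
		size_t want = (b->len + n + 1) * 2;
		char *p = realloc(b->data, want);
		if (!p)
			return -1;
		b->data = p;
		b->alloc = want;
	}
	memcpy(b->data + b->len, src, n);
	b->len += n;
	b->data[b->len] = '\0';
	return 0;
}

/*
 * "About the encoding": a structural UTF-8 check on raw bytes.
 * Simplified on purpose: it verifies lead/continuation byte patterns
 * only and does not reject overlong forms or surrogates.
 */
static int bytes_look_like_utf8(const unsigned char *s, size_t n)
{
	size_t i = 0;

	while (i < n) {
		unsigned char c = s[i];
		size_t follow, k;

		if (c < 0x80)
			follow = 0;
		else if ((c & 0xE0) == 0xC0)
			follow = 1;
		else if ((c & 0xF0) == 0xE0)
			follow = 2;
		else if ((c & 0xF8) == 0xF0)
			follow = 3;
		else
			return 0; /* stray continuation or invalid lead byte */

		if (follow > n - i - 1)
			return 0; /* sequence truncated at end of input */
		for (k = 1; k <= follow; k++)
			if ((s[i + k] & 0xC0) != 0x80)
				return 0;
		i += 1 + follow;
	}
	return 1;
}

/* Thin adapter: the container is only the transport for the bytes. */
static int buf_looks_like_utf8(const struct buf *b)
{
	return bytes_look_like_utf8((const unsigned char *)b->data, b->len);
}
```

In this sketch only `buf_add` would belong next to the container code;
the two checks would live with the encoding helpers, even though one of
them takes the container as its argument.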

- Lars