Re: [PATCH v11 08/10] convert: advise canonical UTF encoding names

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On 09 Mar 2018, at 20:11, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> 
> lars.schneider@xxxxxxxxxxxx writes:
> 
>> From: Lars Schneider <larsxschneider@xxxxxxxxx>
>> 
>> The canonical name of an UTF encoding has the format UTF, dash, number,
>> and an optionally byte order in upper case (e.g. UTF-8 or UTF-16BE).
>> Some iconv versions support alternative names without a dash or with
>> lower case characters.
>> 
>> To avoid problems between different iconv version always suggest the
>> canonical UTF names in advise messages.
>> 
>> Signed-off-by: Lars Schneider <larsxschneider@xxxxxxxxx>
>> ---
> 
> I think it is probably better to squash this to earlier step,
> i.e. jumping straight to the endgame solution.

ok!


>> diff --git a/convert.c b/convert.c
>> index b80d666a6b..9a3ae7cce1 100644
>> --- a/convert.c
>> +++ b/convert.c
>> @@ -279,12 +279,20 @@ static int validate_encoding(const char *path, const char *enc,
>> 				"BOM is prohibited in '%s' if encoded as %s");
>> 			/*
>> 			 * This advice is shown for UTF-??BE and UTF-??LE encodings.
>> +			 * We cut off the last two characters of the encoding name
>> +			 # to generate the encoding name suitable for BOMs.
>> 			 */
> 
> I somehow thought that I saw "s/#/*/" in somebody's response during
> the previous round?

Oops. Will fix!


>> 			const char *advise_msg = _(
>> 				"The file '%s' contains a byte order "
>> -				"mark (BOM). Please use %.6s as "
>> +				"mark (BOM). Please use UTF-%s as "
>> 				"working-tree-encoding.");
>> -			advise(advise_msg, path, enc);
>> +			const char *stripped = "";
>> +			char *upper = xstrdup_toupper(enc);
>> +			upper[strlen(upper)-2] = '\0';
>> +			if (!skip_prefix(upper, "UTF-", &stripped))
>> +				skip_prefix(stripped, "UTF", &stripped);
>> +			advise(advise_msg, path, stripped);
>> +			free(upper);
> 
> If this codepath is ever entered with "enc" that does not begin with
> "UTF" (e.g. "Shift_JIS", which is impossible in the current code,
> but I'll talk about future-proofing here), then neither of these
> skip_prefix will trigger, and then you'd end up suggesting to use
> "UTF-" that is nonsense.  Perhaps initialize stripped to NULL and
> force advise to segv to catch such a programmer error?

Agreed!


Thanks,
Lars





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux