Re: [PATCH/RFC 1/1] Auto diff of UTF-16 files in UTF-8


 



On Mon, Feb 26, 2018 at 06:27:06PM +0100, tboegi@xxxxxx wrote:

> @@ -3611,8 +3615,25 @@ int diff_populate_filespec(struct diff_filespec *s, unsigned int flags)
>  			s->size = size;
>  			s->should_free = 1;
>  		}
> -	}
> -	else {
> +		if (!s->binary && buffer_is_binary(s->data, s->size) &&
> +		    buffer_has_utf16_bom(s->data, s->size)) {
> +			int outsz = 0;
> +			char *outbuf;
> +			outbuf = reencode_string_len(s->data, (int)s->size,
> +						     "UTF-8", "UTF-16", &outsz);
> +			if (outbuf) {
> +				if (s->should_free)
> +					free(s->data);
> +				if (s->should_munmap)
> +					munmap(s->data, s->size);
> +				s->should_munmap = 0;
> +				s->data = outbuf;
> +				s->size = outsz;
> +				s->reencoded_from_utf16 = 1;
> +				s->should_free = 1;
> +			}
> +		}
> +	} else {

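I don't think it makes sense to do the conversion deep inside
diff_populate_filespec(), because it will kick in far more often than
you'd want. For example, I think "format-patch | am" would stop working
with this patch: the generated patch would describe the re-encoded UTF-8
content rather than the UTF-16 bytes actually stored in the blob.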

I think you'd want to hook this in at the same level as fill_textconv().
In fact, one way to do it would be to have the get_textconv() stack just
fill in a special driver that does the auto-detection. This is similar
to my earlier patch, but it avoids overriding the user-facing config for
non-textconv things (and naturally any actual configured textconv filter
would override the auto-detection).

-Peff
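
The layering suggested above could be sketched roughly as follows. This is
only an illustration under stated assumptions, not code from git or from
either patch: struct conv_driver, auto_utf16_driver, has_utf16_bom(),
pick_conv_driver() and the allow_textconv flag are all hypothetical names.
The point it shows is the ordering: callers that need the raw bytes skip
conversion entirely, a configured driver wins, and BOM auto-detection is
only a fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a textconv driver; not git's real struct. */
struct conv_driver {
	const char *name;
	const char *from_encoding; /* NULL when an external filter does the work */
};

/* Hypothetical built-in driver, used only when nothing is configured. */
static const struct conv_driver auto_utf16_driver = { "auto-utf16", "UTF-16" };

/* Return 1 if the buffer starts with a UTF-16 BOM (FE FF or FF FE). */
static int has_utf16_bom(const char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	if (len < 2)
		return 0;
	return !memcmp(buf, "\xfe\xff", 2) || !memcmp(buf, "\xff\xfe", 2);
}

/*
 * Pick the driver for display-only diffs.
 * Returns NULL when the raw bytes should be diffed as-is.
 */
static const struct conv_driver *pick_conv_driver(const struct conv_driver *configured,
						   const char *buf, size_t len,
						   int allow_textconv)
{
	if (!allow_textconv)
		return NULL; /* e.g. generating a patch meant to be applied */
	if (configured)
		return configured; /* user-configured filter overrides detection */
	if (has_utf16_bom(buf, len))
		return &auto_utf16_driver;
	return NULL;
}
```

With this shape, a caller producing a patch for "am" would pass
allow_textconv = 0 and always see the stored bytes, while a caller that
only displays a diff would pass 1 and get the auto-detected driver for a
BOM-prefixed buffer when no driver is configured.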


