Re: [PATCH 1/1] convert: tighten the safe autocrlf handling

On Fri, Nov 24, 2017 at 11:14 AM,  <tboegi@xxxxxx> wrote:
> When a text file had been committed with CRLF and the file is committed
> again, the CRLF are kept if .gitattributes has "text=auto".
> This is done by analyzing the content of the blob stored in the index:
> If a '\r' is found, Git assumes that the blob was committed with CRLF.
>
> The simple search for a '\r' does not always work as expected:
> A file is encoded in UTF-16 with CRLF and committed. Git treats it as binary.
> Now the content is converted into UTF-8. At the next commit Git treats the
> file as text; the CRLF should be converted into LF, but isn't.
>
> Solution:
> Replace has_cr_in_index() with has_crlf_in_index(). When no '\r' is found,
> 0 is returned directly; this is the most common case.
> If a '\r' is found, the content is analyzed more deeply.
>
> Signed-off-by: Torsten Bögershausen <tboegi@xxxxxx>
> ---
> diff --git a/convert.c b/convert.c
> @@ -220,18 +220,27 @@ static void check_safe_crlf(const char *path, enum crlf_action crlf_action,
> -static int has_cr_in_index(const struct index_state *istate, const char *path)
> +static int has_crlf_in_index(const struct index_state *istate, const char *path)
>  {
>         unsigned long sz;
>         void *data;
> -       int has_cr;
> +       const char *crp;
> +       int has_crlf = 0;
>
>         data = read_blob_data_from_index(istate, path, &sz);
>         if (!data)
>                 return 0;
> -       has_cr = memchr(data, '\r', sz) != NULL;
> +
> +       crp = memchr(data, '\r', sz);
> +       if (crp && (crp[1] == '\n')) {

If I understand correctly, this isn't a NUL-terminated string and it
might be a binary blob, so if the lone CR in a file resides at the end
of the file, won't this try looking for LF out-of-bounds? I would have
expected the conditional to be:

    if (crp && crp - data + 1 < sz && crp[1] == '\n') {

or any equivalent variation.
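
For illustration, here is a minimal standalone sketch of such a variation.
The helper name first_cr_is_crlf() is hypothetical and not part of
convert.c; it also assumes the buffer is addressed through a char pointer,
since "data" in the patch is a void pointer and the offset has to be
computed on a char pointer in portable C.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from convert.c: return 1 if the first
 * '\r' in buf[0..sz) is immediately followed by '\n', 0 otherwise.
 * The buffer is not assumed to be NUL-terminated.
 */
static int first_cr_is_crlf(const void *buf, size_t sz)
{
	const char *start = buf;
	const char *crp;

	if (!sz)
		return 0;

	crp = memchr(start, '\r', sz);
	if (!crp)
		return 0;

	/*
	 * crp - start is the offset of the CR. Only read crp[1] when
	 * that next byte still lies inside the buffer.
	 */
	if ((size_t)(crp - start) + 1 < sz && crp[1] == '\n')
		return 1;

	return 0;
}
```

With this check, a buffer whose only CR is its last byte returns 0
without touching the byte past the end.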

> +               unsigned int ret_stats;
> +               ret_stats = gather_convert_stats(data, sz);
> +               if (!(ret_stats & CONVERT_STAT_BITS_BIN) &&
> +                   (ret_stats & CONVERT_STAT_BITS_TXT_CRLF))
> +                       has_crlf = 1;
> +       }
>         free(data);
> -       return has_cr;
> +       return has_crlf;
>  }



