Re: [PATCH] Add ls-files --eol-staged, --eol-worktree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Oct 17, 2015 at 4:18 PM, Torsten Bögershausen <tboegi@xxxxxx> wrote:
> Make it possible to show the line endings of files.
> Files which are staged and/or files in the working tree:
>
> git ls-files --eol-staged
> git ls-files --eol-worktree
>
> Both will show an output like this:
>
> empty    empty_file
> bin      binary_file_or_with_cr_handled_as_binary
> txt-crlf text_file_with_crlf
> txt-mix  text_file_with_crlf_and_lf
> txt-lf   text_file_with_lf
> txt      text_file_with_no_eol_at_all
>
> Implementation details:
> Make struct text_stat, is_binary() and gather_stats() from convert.c
> public, add a new function get_convert_stats_ascii() and use it
> in and use them in ls-files.

s/and use it in and use them in/and use them in/

> git ls-files --eol-staged will give a line like this:

"... line like this:" what?

> Signed-off-by: Torsten Bögershausen <tboegi@xxxxxx>
> ---
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -68,6 +70,11 @@ static void show_dir_entry(const char *tag, struct dir_entry *ent)
>                 return;
>
>         fputs(tag, stdout);
> +       if (show_eol_wt) {
> +               printf("%s ", get_convert_stats_ascii(ent->name,
> +                                                                                                                                                                       GET_CONVERT_STATS_ASCII_WT, 0));

Whitespace-damaged patch?

> +       }

Style: unnecessary braces

> +
>         write_name(ent->name);
> }
>
> @@ -170,6 +177,14 @@ static void show_ce_entry(const char *tag, const struct cache_entry *ce)
>                        find_unique_abbrev(ce->sha1,abbrev),
>                        ce_stage(ce));
>         }
> +       if (show_eol_staged) {
> +               printf("%s ",
> +                                        get_convert_stats_ascii(ce->name, GET_CONVERT_STATS_ASCII_BLOB, 0));
> +       }
> +       if (show_eol_wt) {
> +               printf("%s ", get_convert_stats_ascii(ce->name,GET_CONVERT_STATS_ASCII_WT,
> +                                                                                                                                                                       ce->ce_stat_data.sd_size));
> +       }

Style: unnecessary braces (both cases)

>         write_name(ce->name);
>         if (debug_mode) {
>                 const struct stat_data *sd = &ce->ce_stat_data;
> @@ -206,6 +221,10 @@ static void show_ru_info(void)
>                         printf("%s%06o %s %d\t", tag_resolve_undo, ui->mode[i],
>                                find_unique_abbrev(ui->sha1[i], abbrev),
>                                i + 1);
> +                       if (show_eol_wt) {
> +                               printf("%s ",
> +                                                        get_convert_stats_ascii(path, GET_CONVERT_STATS_ASCII_WT, 0));
> +                       }

Style: unnecessary braces

>                         write_name(path);
>                 }
>         }
> @@ -433,6 +452,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>                 OPT_BIT(0, "directory", &dir.flags,
>                         N_("show 'other' directories' names only"),
>                         DIR_SHOW_OTHER_DIRECTORIES),
> +               OPT_BOOL(0, "eol-staged", &show_eol_staged, N_("show line endings of the staged file")),
> +               OPT_BOOL(0, "eol-worktree", &show_eol_wt, N_("show line endings of the file in work tree")),
>                 OPT_NEGBIT(0, "empty-directory", &dir.flags,
>                         N_("don't show empty directories"),
>                         DIR_HIDE_EMPTY_DIRECTORIES),
> --- a/convert.c
> +++ b/convert.c
> @@ -95,6 +87,45 @@ static int is_binary(unsigned long size, struct text_stat *stats)
>         return 0;
> }
>
> +

Style: unnecessary blank line

> +const char *gather_stats_ascii(const char *data, unsigned long size)

Is this name too generic? This implementation is specialized for
summarizing EOL state, but it is conceivable that there may be other
meaningful textual representations of struct text_stat, as well, so
perhaps this name ought to better reflect the EOL-centric textual
representation of this implementation.

> +{
> +       struct text_stat stats;
> +       if (!data || !size)
> +               return("empty   ");

Would it make sense to distinguish between an empty file/blob and one
which was not found?

    if (!data)
        return "missing";
    if (!size)
        return "empy";

> +       gather_stats(data, size, &stats);
> +       if (is_binary(size, &stats))
> +               return("bin     ");

Style: unnecessary parentheses on 'return'

> +       else if (stats.cr != stats.crlf)

Style (nit): unnecessary 'else'

> +               return("bin     ");
> +       else if (stats.crlf && (stats.crlf == stats.lf))
> +               return("txt-crlf");
> +       else if (stats.crlf && stats.lf)
> +               return("txt-mix ");
> +       else if (stats.lf)
> +               return("txt-lf  ");
> +       else
> +               return("txt     ");
> +}

Is it a good idea for this otherwise general purpose function to be
conflating presentation with EOL classification? I would have expected
this function to return the raw classification string without trailing
whitespace ("empty", "bin", "txt-mix", etc.), and leave it up to the
caller to pad the string if alignment is desired.

> +const char *get_convert_stats_ascii(const char *path, int flags, size_t hint)

Is 'flags' an appropriate name? Presently, it's more of a "mode" than
a flag. If it is a flag, then you typically would use an unsigned
type.

If you can't foresee any additional "flags", and this really is a
"mode", then it might be clearer to split this into two functions --
one for blobs and one for files -- and callers can invoke whichever is
appropriate.

> +{
> +       const char *ret = "";

Unnecessary initialization since all  branches assign 'ret' unconditionally.

> +       if (flags & GET_CONVERT_STATS_ASCII_BLOB) {
> +               unsigned long sz;
> +               void *data = read_blob_data_from_cache(path, &sz);
> +               ret = gather_stats_ascii(data, sz);

gather_stats_ascii() deals gracefully with data== NULL, so it can
safely be outside the 'if (data)' conditional. Okay.

> +               if (data)
> +                       free(data);

POSIX says that free(NULL) is safe, so the conditional is unnecessary.

> +       } else if (flags & GET_CONVERT_STATS_ASCII_WT){

Style: space before {

> +               struct strbuf sb = STRBUF_INIT;
> +               strbuf_read_file(&sb, path, hint);

If gather_stats_ascii() should distinguish error and empty-file cases,
then checking the return value of strbuf_read_file() would be
appropriate.

> +               ret = gather_stats_ascii(sb.buf, sb.len);
> +               strbuf_release(&sb);
> +       }
> +       return ret;
> +}
> +
> --- a/convert.h
> +++ b/convert.h
> @@ -31,6 +31,20 @@ enum eol {
> +struct text_stat {
> +       /* NUL, CR, LF and CRLF counts */
> +       unsigned nul, cr, lf, crlf;
> +
> +       /* These are just approximations! */
> +       unsigned printable, nonprintable;
> +};
> +void gather_stats(const char *buf, unsigned long size, struct text_stat *stats);
> +int is_binary(unsigned long size, struct text_stat *stats);
> +const char *gather_stats_ascii(const char *buf, unsigned long size);
> +#define GET_CONVERT_STATS_ASCII_BLOB           (1<<0)
> +#define GET_CONVERT_STATS_ASCII_WT             (1<<1)
> +const char *get_convert_stats_ascii(const char *path, int isBlob, size_t hint);

s/isBlob/flags/ or s/isBlob/mode/ or something.

Having read the source code, I know what 'hint' is for, but its
purpose is not obvious when just reading this prototype. Also, do you
expect 'hint' to prove worthwhile or might it be a case of premature
optimization and interface complexity? (Genuine question.)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]