On Sun, Mar 17, 2024 at 08:27:50AM +0000, Mohit Marathe via GitGitGadget wrote: > From: Mohit Marathe <mohitmarathe@xxxxxxxxx> > > This update modifies the variable types that are used in calls to the > `utf8_strnwidth` and `utf8_strwidth` functions. This modification is > a proactive measure in anticipation of an upcoming interface change > to these functions in the subsequent patch. It would help to mention `utf8_str{,n}width()` in the commit subject so that it's immediately obvious which functions get touched by this commit. > Signed-off-by: Mohit Marathe <mohitmarathe@xxxxxxxxx> > --- > builtin/blame.c | 6 +++--- > builtin/branch.c | 2 +- > builtin/fetch.c | 2 +- > builtin/worktree.c | 4 ++-- > column.c | 2 +- > diff.c | 8 +++++--- > gettext.c | 2 +- > gettext.h | 2 +- > pretty.c | 4 ++-- > utf8.c | 4 ++-- > wt-status.c | 4 ++-- > 11 files changed, 21 insertions(+), 19 deletions(-) > > diff --git a/builtin/blame.c b/builtin/blame.c > index db1f56de61a..a72e2552305 100644 > --- a/builtin/blame.c > +++ b/builtin/blame.c > @@ -53,7 +53,7 @@ static const char *annotate_opt_usage[] = { > }; > > static int longest_file; We also end up assigning `longest_file = num`, where `num` became a `size_t` now. We should likely change this variable's type to `size_t` as well. > -static int longest_author; > +static size_t longest_author; > static int max_orig_digits; > static int max_digits; > static int max_score_digits; > @@ -529,7 +529,7 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > name = ci.author_mail.buf; > else > name = ci.author.buf; > - pad = longest_author - utf8_strwidth(name); > + pad = cast_size_t_to_int(longest_author - utf8_strwidth(name)); I feel like this cast here is somewhat dubious. Both `longest_author` and the return value of `utf8_strwidth()` are `size_t` now. So an alternative way to handle this would be to assign the return value to a temporary variable and then `if (longest_author > name_len) BUG(...)`. > printf(" (%s%*s %10s", > name, pad, "", > format_time(ci.author_time, > @@ -631,7 +631,7 @@ static void find_alignment(struct blame_scoreboard *sb, int *option) A few lines before this hunk we declare `longest_src_lines` and `longest_dst_lines` as `int`, but we assign `num` to them. We should fix those to become `size_t`, as well. This will bubble up so that we also have to fix `max_orig_digits` and `max_digits`. > for (e = sb->ent; e; e = e->next) { > struct blame_origin *suspect = e->suspect; > - int num; > + size_t num; > > if (compute_auto_abbrev) > auto_abbrev = update_auto_abbrev(auto_abbrev, suspect); > diff --git a/builtin/branch.c b/builtin/branch.c > index 8c2305ad2c8..321c3558f2d 100644 > --- a/builtin/branch.c > +++ b/builtin/branch.c > @@ -349,7 +349,7 @@ static int calc_maxwidth(struct ref_array *refs, int remote_bonus) > for (i = 0; i < refs->nr; i++) { > struct ref_array_item *it = refs->items[i]; > const char *desc = it->refname; > - int w; > + size_t w; This isn't sufficient as we end up assignign `max = w`, where `max` is declared as `int`. We would thus also have to change that variable and then change the return value of `calc_maxwidth()` itself to be `size_t`. Furthermore, we do `w += remote_bonus` in this function, where `remote_bonus` is of type `int`. Is it guaranteed that this variable is always positive? If so, it should likely become a `size_t`, too. Otherwise we'll need to have some safety checks that the result does not wrap. > skip_prefix(it->refname, "refs/heads/", &desc); > skip_prefix(it->refname, "refs/remotes/", &desc); > diff --git a/builtin/fetch.c b/builtin/fetch.c > index 46a793411a4..fee992c3c14 100644 > --- a/builtin/fetch.c > +++ b/builtin/fetch.c > @@ -705,7 +705,7 @@ static int refcol_width(const struct ref *ref_map, int compact_format) > max = max * 2 / 3; > > for (ref = ref_map; ref; ref = ref->next) { > - int rlen, llen = 0, len; > + size_t rlen, llen = 0, len; We assign `width = rlen` further down, but `width` is an `int`. It should likely be changed to become a `size_t` as well, which will require us to change the return value of this function. > if (ref->status == REF_STATUS_REJECT_SHALLOW || > !ref->peer_ref || > diff --git a/builtin/worktree.c b/builtin/worktree.c > index 9c76b62b02d..bdbf46fb658 100644 > --- a/builtin/worktree.c > +++ b/builtin/worktree.c > @@ -961,8 +961,8 @@ static void show_worktree_porcelain(struct worktree *wt, int line_terminator) > static void show_worktree(struct worktree *wt, int path_maxlen, int abbrev_len) > { > struct strbuf sb = STRBUF_INIT; > - int cur_path_len = strlen(wt->path); > - int path_adj = cur_path_len - utf8_strwidth(wt->path); > + size_t cur_path_len = strlen(wt->path); > + int path_adj = cast_size_t_to_int(cur_path_len - utf8_strwidth(wt->path)); > const char *reason; I was wondering whether there should be a sanity check here for whether `cur_path_len >= path_adj`. But given that both measure the length of `wt->path`, and given that `utf8_strwidth()` should only ever return a value smaller than the original length, I think this should be fine. > strbuf_addf(&sb, "%-*s ", 1 + path_maxlen + path_adj, wt->path); > diff --git a/column.c b/column.c > index 50bbccc92ee..ec874036de6 100644 > --- a/column.c > +++ b/column.c > @@ -22,7 +22,7 @@ struct column_data { > }; > > /* return length of 's' in letters, ANSI escapes stripped */ > -static int item_length(const char *s) > +static size_t item_length(const char *s) > { > return utf8_strnwidth(s, strlen(s), 1); > } The value of this function is assigned to `struct column_data::len`, which is an array of `int`. That type needs to change. > diff --git a/diff.c b/diff.c > index e50def45383..4faf151345a 100644 > --- a/diff.c > +++ b/diff.c > @@ -2629,7 +2629,8 @@ void print_stat_summary(FILE *fp, int files, > > static void show_stats(struct diffstat_t *data, struct diff_options *options) > { > - int i, len, add, del, adds = 0, dels = 0; > + int i, add, del, adds = 0, dels = 0; > + size_t len = 0; > uintmax_t max_change = 0, max_len = 0; It's somewhat weird that `max_len` is declared as `uintmax_t`. Should it become a `size_t`, as well? > int total_files = data->nr, count; > int width, name_width, graph_width, number_width = 0, bin_width = 0; > @@ -2780,7 +2781,8 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) > char *name = file->print_name; > uintmax_t added = file->added; > uintmax_t deleted = file->deleted; > - int name_len, padding; > + size_t name_len; We compare `name_len` with `name_width`. The latter should likely become a `size_t`, as well. > + int padding; > > if (!file->is_interesting && (added + deleted == 0)) > continue; > @@ -2809,7 +2811,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) > if (slash) > name = slash; > } > - padding = len - utf8_strwidth(name); > + padding = cast_size_t_to_int(len - utf8_strwidth(name)); I think we shouldn't abuse `cast_size_t_to_int()` like this but instead explicitly check whether `len >= len(name)`. > if (padding < 0) > padding = 0; > > diff --git a/gettext.c b/gettext.c > index 57facbc21ec..5a77b4f7202 100644 > --- a/gettext.c > +++ b/gettext.c > @@ -127,7 +127,7 @@ void git_setup_gettext(void) > } > > /* return the number of columns of string 's' in current locale */ > -int gettext_width(const char *s) > +size_t gettext_width(const char *s) We also need to adjust the callsite of this function. There's only a single one in "builtin/fetch.c::display_ref_update()". > { > static int is_utf8 = -1; > if (is_utf8 == -1) > diff --git a/gettext.h b/gettext.h > index 484cafa5628..f161a21b45c 100644 > --- a/gettext.h > +++ b/gettext.h > @@ -31,7 +31,7 @@ > #ifndef NO_GETTEXT > extern int git_gettext_enabled; > void git_setup_gettext(void); > -int gettext_width(const char *s); > +size_t gettext_width(const char *s); > #else > #define git_gettext_enabled (0) > static inline void git_setup_gettext(void) > diff --git a/pretty.c b/pretty.c > index bdbed4295aa..f03493c74b1 100644 > --- a/pretty.c > +++ b/pretty.c > @@ -1781,7 +1781,7 @@ static size_t format_and_pad_commit(struct strbuf *sb, /* in UTF-8 */ > > if (padding < 0) { > const char *start = strrchr(sb->buf, '\n'); > - int occupied; > + size_t occupied; > if (!start) > start = sb->buf; > occupied = utf8_strnwidth(start, strlen(start), 1); We're assigning `padding = (-padding) - occupied` next, where `padding` is of type `int`. We should add some safety checks here. > @@ -1802,7 +1802,7 @@ static size_t format_and_pad_commit(struct strbuf *sb, /* in UTF-8 */ > placeholder++; > total_consumed++; > } > - len = utf8_strnwidth(local_sb.buf, local_sb.len, 1); > + len = cast_size_t_to_int(utf8_strnwidth(local_sb.buf, local_sb.len, 1)); Why did you decide to use `cast_size_t_to_int()` instead of converting `len` to be of type `size_t`? > if (c->flush_type == flush_left_and_steal) { > const char *ch = sb->buf + sb->len - 1; > diff --git a/utf8.c b/utf8.c > index 6bfaefa28eb..8ccdf684e07 100644 > --- a/utf8.c > +++ b/utf8.c > @@ -466,7 +466,7 @@ int utf8_fprintf(FILE *stream, const char *format, ...) > > columns = fputs(buf.buf, stream); > if (0 <= columns) /* keep the error from the I/O */ > - columns = utf8_strwidth(buf.buf); > + columns = cast_size_t_to_int(utf8_strwidth(buf.buf)); > strbuf_release(&buf); > return columns; > } > @@ -806,7 +806,7 @@ void strbuf_utf8_align(struct strbuf *buf, align_type position, unsigned int wid > const char *s) > { > size_t slen = strlen(s); > - int display_len = utf8_strnwidth(s, slen, 0); > + size_t display_len = utf8_strnwidth(s, slen, 0); > int utf8_compensation = slen - display_len; > > if (display_len >= width) { A few lines further down we have: int left = (width - dispay_len) / 2; This should be changed to become a `size_t`. > diff --git a/wt-status.c b/wt-status.c > index 7108a92b52c..c847b4bb5ed 100644 > --- a/wt-status.c > +++ b/wt-status.c > @@ -325,7 +325,7 @@ static int maxwidth(const char *(*label)(int), int minval, int maxval) > > for (i = minval; i <= maxval; i++) { > const char *s = label(i); > - int len = s ? utf8_strwidth(s) : 0; > + size_t len = s ? utf8_strwidth(s) : 0; > if (len > result) > result = len; `result` is of type `int`, so we'd have to change both its type as well as the return type of `maxwidth()`. > } > @@ -341,7 +341,7 @@ static void wt_longstatus_print_unmerged_data(struct wt_status *s, > static char *padding; > static int label_width; > const char *one, *how; > - int len; > + size_t len; > > if (!padding) { > label_width = maxwidth(wt_status_unmerged_status_string, 1, 7); `len` gets passed to the formatter "%.*s". Quoting fprintf(3p): A field width, or precision, or both, may be indicated by an <asterisk> ('*'). In this case an argument of type int supplies the field width or precision. So this must be an `int` or otherwise bad things will happen depending on the size of those types on your platform. Patrick
Attachment:
signature.asc
Description: PGP signature