On 2025-01-06 at 19:08:52, Sören Krecker wrote: > Fix some compiler warings from msvw in add-patch.c for value truncation > form 64 bit to 32 bit integers.Change unsigned long to size_t for > correct variable size on linux and windows. > Add macro strtos for converting a string to size_t. > Test if convertion fails with over or underflow. A few minor nits here. We want to say "from" both here and in the title and "conversion" (and in the title, "mismatch"), and put a space after the period in a sentence. I think you meant "MSVC" instead of "msvw", but if not, please do explain what that is, since I'm not familiar with it and I'm curious. The commit message is a good place to explain lots in detail. > Signed-off-by: Sören Krecker <soekkle@xxxxxxxxxx> > > Uses strtouq I don't see that we're using this function. > impove linux support > > Change Macro name We don't typically put comments about the revisions we've made to a patch in the commit message. We may put them below the --- so that they're visible to readers and reviewers, which is helpful, but we pretend that our patches were perfect to begin with in terms of the commit message, since the future reader of the history only cares about the actual end result and not what changes we made along the way. > --- > add-patch.c | 53 +++++++++++++++++++++++++++-------------------- > gettext.h | 2 +- > git-compat-util.h | 6 ++++++ > 3 files changed, 38 insertions(+), 23 deletions(-) > > diff --git a/add-patch.c b/add-patch.c > index 7b598e14df..67a7f68d23 100644 > --- a/add-patch.c > +++ b/add-patch.c > @@ -242,7 +242,7 @@ static struct patch_mode patch_mode_worktree_nothead = { > }; > > struct hunk_header { > - unsigned long old_offset, old_count, new_offset, new_count; > + size_t old_offset, old_count, new_offset, new_count; > /* > * Start/end offsets to the extra text after the second `@@` in the > * hunk header, e.g. the function signature. This is expected to > @@ -322,11 +322,12 @@ static void setup_child_process(struct add_p_state *s, > } > > static int parse_range(const char **p, > - unsigned long *offset, unsigned long *count) > + size_t *offset, size_t *count) > { > char *pend; > - > - *offset = strtoul(*p, &pend, 10); > + *offset = strtos(*p, &pend, 10); I see you've defined this below. > + if (errno == ERANGE) > + return error("Number dose not fit datatype"); I think the word you wanted was "does". However, perhaps we should provide a better, more meaningful error message so the user knows what data they provided that was invalid. Maybe "absurdly large value in diff header range"? It would be quite bizarre to get a value even as large as the maximum value of a 32-bit integer, and I don't think our diff code can even handle values larger than INT_MAX. In that context, it might not even be necessary to handle values larger than unsigned long, since we can't generate them. However, in the interests of compatibility with other implementations which might not have that limitation, size_t seems reasonable as a choice to handle more generally. Assuming we keep this, we probably also want to mark this for translation by wrapping it in `_(` and `)`. I also don't think this order is correct. In general, errno is not reset implicitly, so unless we know that an error occurred, errno is meaningless, since another function could have set it to ERANGE. We'd probably need to save errno, set it to 0, and restore to verify that we got the right value, since we can't distinguish here between a truncated value for range reasons and for other reasons. > if (pend == *p) > return -1; > > if (*pend != ',') { > @@ -334,7 +335,9 @@ static int parse_range(const char **p, > *p = pend; > return 0; > } > - *count = strtoul(pend + 1, (char **)p, 10); > + *count = strtos(pend + 1, (char **)p, 10); > + if (errno == ERANGE) > + return error("Number dose not fit datatype"); Same comment here. > return *p == pend + 1 ? -1 : 0; > } > > @@ -673,8 +676,8 @@ static void render_hunk(struct add_p_state *s, struct hunk *hunk, > */ > const char *p; > size_t len; > - unsigned long old_offset = header->old_offset; > - unsigned long new_offset = header->new_offset; > + size_t old_offset = header->old_offset; > + size_t new_offset = header->new_offset; > > if (!colored) { > p = s->plain.buf + header->extra_start; > @@ -700,12 +703,14 @@ static void render_hunk(struct add_p_state *s, struct hunk *hunk, > else > new_offset += delta; > > - strbuf_addf(out, "@@ -%lu", old_offset); > + strbuf_addf(out, "@@ -%" PRIuMAX, (uintmax_t)old_offset); > if (header->old_count != 1) > - strbuf_addf(out, ",%lu", header->old_count); > - strbuf_addf(out, " +%lu", new_offset); > + strbuf_addf(out, ",%" PRIuMAX, > + (uintmax_t)header->old_count); > + strbuf_addf(out, " +%" PRIuMAX, (uintmax_t)new_offset); If we're using size_t, we can use %zu. That's specified in C99 as the appropriate formatting type for size_t, and we require C99 or C11 for all systems. We don't need to cast to uintmax_t. > diff --git a/gettext.h b/gettext.h > index 484cafa562..d36f5a7ade 100644 > --- a/gettext.h > +++ b/gettext.h > @@ -53,7 +53,7 @@ static inline FORMAT_PRESERVING(1) const char *_(const char *msgid) > } > > static inline FORMAT_PRESERVING(1) FORMAT_PRESERVING(2) > -const char *Q_(const char *msgid, const char *plu, unsigned long n) > +const char *Q_(const char *msgid, const char *plu, size_t n) > { > if (!git_gettext_enabled) > return n == 1 ? msgid : plu; > diff --git a/git-compat-util.h b/git-compat-util.h > index e283c46c6f..4c33990a05 100644 > --- a/git-compat-util.h > +++ b/git-compat-util.h > @@ -291,6 +291,12 @@ static inline int _have_unix_sockets(void) > #ifdef HAVE_BSD_SYSCTL > #include <sys/sysctl.h> > #endif > +#if defined _WIN64 > +# define strtos strtoull > +#else > +#define strtos strtoul > +#endif This is not a great name for the function. First of all, it resembles the standard functions a lot, so it's something that POSIX could standardize or an OS could add, and then we'll have some fun compilation errors when we redefine things. Second, it's a lot less future-proof. While I do agree that only Windows 64-bit systems are likely to fall into this case, since we already include <limits.h>, we probably should do this: #if SIZE_MAX == ULONG_MAX #define str_to_size_t strtoul #else #define str_to_size_t strtoull #endif (or whatever you want to call the function). That expresses what we care about—that the type is suitable for the value we want to parse—and doesn't use the OS as a proxy for that data. Otherwise, the Unix developer who doesn't use Windows may not understand _why_ Windows is special and the reason we've chosen this change. On that note, it would be helpful if you explained in the commit message why that is for people who don't know. Maybe something like this: On 64-bit systems, size_t is a 64-bit type. On most Unix systems, unsigned long is also 64 bits in size, so we can use functions for that type to parse values of size_t. However, on Windows, unsigned long is always 32 bits, and if we want a 64-bit type, we must use unsigned long long. To future-proof our changes against other platforms that might be added in the future, we first check if unsigned long is sufficient, and otherwise, use unsigned long long, which will work in both cases. Of course, please feel free to edit as you see fit. -- brian m. carlson (they/them or he/him) Toronto, Ontario, CA
Attachment:
signature.asc
Description: PGP signature