Re: [PATCH v3 4/4] commit.c: Fix type conversation warnings from msvc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Sören

On 26/01/2025 13:00, Sören Krecker wrote:

-	const int tree_entry_len = the_hash_algo->hexsz + 5;
-	const int parent_entry_len = the_hash_algo->hexsz + 7;
+	const size_t tree_entry_len = the_hash_algo->hexsz + 5;
+	const size_t parent_entry_len = the_hash_algo->hexsz + 7;

As Junio has previously pointed out it might make more sense to change the type of the_hash_algo->hexsz. What is the advantage of using size_t here?

  int add_header_signature(struct strbuf *buf, struct strbuf *sig, const struct git_hash_algo *algo)
  {
-	int inspos, copypos;
+	ssize_t inspos, copypos;

Note that POSIX allows "ssize_t" to be narrower than "int". We have at least one platform where that is the case [see c14e5a1a501 (transport-helper: use xread instead of read, 2019-01-03)] so I'm not sure this change is a good idea.

  	const char *eoh;
  	const char *gpg_sig_header = gpg_sig_headers[hash_algo_by_ptr(algo)];
-	int gpg_sig_header_len = strlen(gpg_sig_header);
+	size_t gpg_sig_header_len = strlen(gpg_sig_header);

It is really unfortunate that the compiler cannot deduce that there is not any truncation here as the longest string is "gpgsig-sha256". I wonder if we should add a helper for cases like this something like

	int strlen_int(const char *s) {
		return cast_size_t_to_int(strlen(s));
	}

rather than having to deal with the fallout from changing "int" to "size_t"

-static int find_invalid_utf8(const char *buf, int len)
+static int find_invalid_utf8(const char *buf, size_t len)

I think changing the type of "len" is an improvement (even if it hard to see anyone creating such a long commit message) as it matches the buf->len in verify_utf8() which is the only caller of this function. However the conversion is incomplete. If we are to accommodate buffers longer than INT_MAX we need to change the return type as well. As it stands bad_offset is changed to size_t but truncated to int when the function returns. Patrick has already pointed out that the type of "offset" needs to match "len" as the loop does

	len--;
	offset++;

and

	bad_offset = offset - 1;

verify_utf8() uses a variable of type long to track the position so that should be changed as well if we're really going to support size_t length buffers.

I think that a good approach to fixing these warnings would be to ask "does it make sense to use size_t here?". If the answer is "yes" then we should focus on converting the function to work correctly with size_t rather than on fixing the compiler warnings. That way we are more likely to avoid subtle bugs like this as our focus is on the conversion rather than the compiler warnings which will be fixed as a by-product of the conversion. If the answer to the question is "no" then we should look to change the types of the other variable in the assignment or use something like cast_size_t_to_int() or case_size_t_to_ulong() as appropriate. There is a danger that we adopt an approach of "change the type to size_t to silence the warnings" which leads to code that looks like it handles size_t correctly but in fact contains subtle bugs.

Best Wishes

Phillip


  {
  	int offset = 0;
  	static const unsigned int max_codepoint[] = {
@@ -1539,7 +1539,7 @@ static int find_invalid_utf8(const char *buf, int len)
while (len) {
  		unsigned char c = *buf++;
-		int bytes, bad_offset;
+		size_t bytes, bad_offset;
  		unsigned int codepoint;
  		unsigned int min_val, max_val;





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux