Re: [PATCH] Use FIX_UTF8_MAC to enable conversion from UTF8-MAC to UTF8

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Mark Junker <mjscod@xxxxxx> writes:

> Junio C Hamano schrieb:
>
>> I do not know how Macintosh libc implements "struc dirent", but
>> this approach does not work in general.
>
> IMHO there is no need that this approach works in general because this
> is a fix for MacOSX systems only. I also use d_namlen which might not
> be available on other systems. But on MacOSX this works as expected.
>
>> yet you can obtain a path component longer than 256 bytes.
>> Apparently the library allocates longer d_name[] field than what
>> is shown to the user.
>
> This is not a problem either because on MacOSX we get decomposed UTF8
> and we always convert to composed UTF8. This means that the string
> returned from reencode_string will always be smaller than the original
> filename that had to be reencoded.

It is not quite enough that this works Ok on MacOS, if you made
FIX_UTF8_MAC definable in the Makefile.  After all some friendly
and helpful Linux folks might want to enable it with their build
trying to help debugging, right?

In the short term, as long as it safely runs without overrunning
the buffer on MacOS, then that is fine, even though we will need
some protection to prevent this code from getting compiled and
used on Linux with glibc, which does have the issue.

I was specifically talking about this "static" thing.

+static struct dirent temp;


+struct dirent *gitreaddir(DIR *dirp)
+{
+	size_t utf8_len;
+	char *utf8;
+	struct dirent *result;
+	result = readdir(dirp);
+	if (result != NULL) {
+		memcpy(&temp, result, sizeof(struct dirent));
+		utf8 = reencode_string(temp.d_name, "UTF8", "UTF8-MAC");
+		if (utf8 != NULL) {
+			utf8_len = strlen(utf8);
+			temp.d_namlen = (u_int8_t) utf8_len;
+			memcpy(temp.d_name, utf8, utf8_len + 1);
+			free(utf8);
+			result = &temp;
+		}
+	}
+	return result;
+}

You memcpy() what the library gave you in *result to the
statically allocated "temp".  d_name[] in "temp" comes from the
structure definition in the user visible include file, which
could be much shorter than what the library gave you in *result.
The structure definition I showed in my message you are
responding to illustrates the issue.  If MacOS uses a similar
trick to define d_name[256] and sometimes returns much longer
name in *result, you are truncating the name by copying only the
first part of the structure and first 256 bytes of d_name[]. 

But you have a Mac, I don't, so as long as you have verified
that their header has enough room in statically allocated "temp"
to store longest possible name that can be returned from
readdir(), the code is Ok.  I was just being cautious, as I know
the above code has a problem on one platform.

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux