Mark Junker <mjscod@xxxxxx> writes:

> Junio C Hamano wrote:
>
>> I do not know how Macintosh libc implements "struct dirent", but
>> this approach does not work in general.
>
> IMHO there is no need that this approach works in general because this
> is a fix for MacOSX systems only. I also use d_namlen which might not
> be available on other systems. But on MacOSX this works as expected.
>
>> yet you can obtain a path component longer than 256 bytes.
>> Apparently the library allocates longer d_name[] field than what
>> is shown to the user.
>
> This is not a problem either because on MacOSX we get decomposed UTF8
> and we always convert to composed UTF8. This means that the string
> returned from reencode_string will always be smaller than the original
> filename that had to be reencoded.

It is not quite enough that this works Ok on MacOS, if you made
FIX_UTF8_MAC definable in the Makefile.  After all, some friendly
and helpful Linux folks might want to enable it in their build
while trying to help with debugging, right?

In the short term, as long as it safely runs without overrunning
the buffer on MacOS, that is fine, even though we will need
some protection to prevent this code from getting compiled and
used on Linux with glibc, which does have the issue.

I was specifically talking about this "static" thing.

+static struct dirent temp;
+struct dirent *gitreaddir(DIR *dirp)
+{
+	size_t utf8_len;
+	char *utf8;
+	struct dirent *result;
+	result = readdir(dirp);
+	if (result != NULL) {
+		memcpy(&temp, result, sizeof(struct dirent));
+		utf8 = reencode_string(temp.d_name, "UTF8", "UTF8-MAC");
+		if (utf8 != NULL) {
+			utf8_len = strlen(utf8);
+			temp.d_namlen = (u_int8_t) utf8_len;
+			memcpy(temp.d_name, utf8, utf8_len + 1);
+			free(utf8);
+			result = &temp;
+		}
+	}
+	return result;
+}

You memcpy() what the library gave you in *result to the
statically allocated "temp".
d_name[] in "temp" comes from the structure definition in the
user-visible include file, which could be much shorter than what
the library gave you in *result.  The structure definition I showed
in the message you are responding to illustrates the issue.  If
MacOS uses a similar trick to define d_name[256] and sometimes
returns a much longer name in *result, you are truncating the name
by copying only the first part of the structure and the first 256
bytes of d_name[].

But you have a Mac, I don't, so as long as you have verified that
their header has enough room in the statically allocated "temp" to
store the longest possible name that can be returned from readdir(),
the code is Ok.  I was just being cautious, as I know the above
code has a problem on one platform.
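
One way to sidestep the sizeof(struct dirent) question entirely
might be to keep only the name, in a buffer sized from strlen()
instead of from the declared d_name[] length.  What follows is a
minimal sketch, assuming the caller needs just the name and none of
the other dirent fields.  struct name_buf, name_buf_set() and
read_name() are hypothetical names, not part of the patch, and the
re-encoding step is left out.

```c
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for one entry name; not part of the patch. */
struct name_buf {
	char *name;	/* heap buffer, grown on demand */
	size_t alloc;	/* bytes currently allocated for name */
};

/*
 * Copy a NUL-terminated name into nb, sizing the buffer from
 * strlen(src) rather than from sizeof(struct dirent).
 * Returns 0 on success, -1 if the allocation fails.
 */
int name_buf_set(struct name_buf *nb, const char *src)
{
	size_t len;

	assert(nb && src);
	len = strlen(src);
	if (nb->alloc < len + 1) {
		char *p = realloc(nb->name, len + 1);
		if (!p)
			return -1;
		nb->name = p;
		nb->alloc = len + 1;
	}
	memcpy(nb->name, src, len + 1);	/* includes the trailing NUL */
	return 0;
}

/*
 * Read the next entry and keep only its name.  Returns NULL at
 * end of directory, on a readdir() error, or on allocation failure.
 */
const char *read_name(DIR *dirp, struct name_buf *nb)
{
	struct dirent *de = readdir(dirp);

	if (!de || name_buf_set(nb, de->d_name) < 0)
		return NULL;
	return nb->name;
}
```

The caller starts with a zeroed struct name_buf and calls
free(nb.name) when done.  Because nothing is ever copied into a
fixed-size struct dirent, the declared length of d_name[] no longer
matters, whatever the platform's header says.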