On Tue, Aug 20, 2019 at 4:37 PM Joe Perches <joe@xxxxxxxxxxx> wrote: > > > So I'm putting my foot down on yet another broken string copy > > interface from people who do not understand this fundamental issue. > > I think you are mistaken about the stracpy limits as > the only limit is not the source size but the dest. > > Why should the source be size limited? You just proved my point. You don't understand that sources can also be limited, and the limit on a source can be *smaller* than the limit of a destination. Did we learn *NOTHING* from the complete and utter disaster that was strlcpy()? Do you not understand why strlcpy() was unacceptably bad, and why the people who converted strncpy() to it introduced real bugs? The fact is, it's not just the destination that has a size limit. The source often has one too. And no, the source is not always guaranteed to be NUL-terminated, nor is the source buffer guaranteed to be larger than the destination buffer. Now, if you *know* that the source is smaller than the destination size, you can do: len = strnlen(src, srclen); memcpy(dst, len); dst[len] = 0; and that's not wrong, but that works only when (a) you actually do the above (b) you have no data races on src (or you at least only require that 'dst' is NUL-terminated, not that 'len' is necessarily the correct length of the result (c) you actually know as the programmer that yes, the source is definitely smaller than the destination. and honestly, people don't get _any_ of that right. For one thing, the buffer sizes of the source and destination may be two different things and some #define. It's hard to tell that one is always smaller than the other (or that they are always the same size). So then to get it right in the *general* case, you may need to do something like if (srcsize < dstsize) { .. do the above .. } else { strlcpy(dst,src,dstsize); } do you see the reason? Do you see why strlcpy() is only safe to do when the allocation size of the source buffer is at least as big as the allocation size of the destination buffer? For example, I just grepped for some patterns, and I can find code like this in the kernel name_len = strnlen(fileName, PATH_MAX); name_len++; /* trailing null */ strncpy(pSMB->fileName, fileName, name_len); where pretty much everything is wrong. The comment is fundamentally wrong, and even spells "nul" wrong. Christ. Here's another one: /* will be less than a page size */ len = strnlen(link, ocfs2_fast_symlink_chars(inode->i_sb)); kaddr = kmap_atomic(page); memcpy(kaddr, link, len + 1); and notice how this time at least the comment is (hopefully) correct. But the code is wrong once again, because it doesn't actually do the correct pattern I showed above, it does a "memcpy(len+1)" instead. Bzzt. WRONG! What I think the code *wants* to do is kaddr = kmap_atomic(page); copy_string( // destination and destination size limit kaddr, PAGE_SIZE, // source and source size limit link, ocfs2_fast_symlink_chars(inode->i_sb) ); ie the destination has one size, and the source has another size, and the source may or may not be NUL-terminated. And then the programmer wouldn't have needed the comment, and wouldn't have needed to make sure that yes, ocfs2_fast_symlink_chars() is guaranteed to be less than PAGE_SIZE. Again, the code we actually _have_ in the kernel is not sensible. It doesn't actually nul-terminate the destination, despite clearly _trying_ to (note that "len+1" in the memcpy). Now, it's possible that it doesn't care about properly nul-terminating things. And it's possible; that the source is always nul-terminated to begin with unless the filesystem is corrupted. But the code clearly _tries_ to do something, and fails. Because copying a string is complicated, particularly when the allocations for source and destination may be entirely different. On a happier note, I actually found a correct code case too. Our "kstrndup()" function seems to actually be at a first glance entirely bug-free, and actually takes a limited source string size, and gives you back a nul-terminated destination string of a different size. Of course, that's a simple case, because the size of the destination is something that that function actually controls, so getting it right is actually somewhat simpler. Linus