On 12/11/2023 10:59, Alejandro Colomar wrote:
> Hi Jonny,
>
> On Sun, Nov 12, 2023 at 09:52:20AM +0000, Jonny Grant wrote:
> [... some micro-benchmarks...]
>
>>
>> Maybe we're gonna need a bigger benchmark.
>
> Not really.
>
>>
>> Probably there are existing studies. Or could patch something like SQLite
>> Benchmark to utilise each string function just for measurements.
>> Hopefully it moves around at least 2GB of strings to give some
>> meaningful comparison timings.
>
> I wasn't so interested in the small differences between functions.
> What this micro-benchmark showed clearly, without needing much more info
> to be conclusive, is the first order of growth of each of the functions:
>
> - strlcpy(3)'s first order growth corresponds to strlen(src). That's
>   due to returning strlen(src), which proves to be a poor API.
>
> - strncpy(3)'s first order growth corresponds to sizeof(dst). That's
>   of course due to the zeroing. If sizeof(dst) is kept very small, you
>   could live with it. When the size grows to more or less 4 KiB, this
>   drag becomes meaningful.
>
> - strnlen(3)+*cpy() first order growth corresponds to
>   strnlen(src, sizeof(dst)), which is the fastest order of growth
>   you can get from a truncating string-copying function (except if you
>   keep track of your slen manually and call directly memcpy(3)).

That's a really good point: keeping track of the length (and buffer size) and then just using memcpy. The copy time should then be closer to the number of bytes read and written.

>
> Of course, first order of growth ignores second order of growth and so
> on, which for small inputs can be important. That is, O(x^3) is bigger
> than O(x^2), but x^3 + x^2 can be smaller than 5*x^2 for small x.
>
>>
>> As Paul mentioned, strlcpy is a poor choice for processing strings.
>> Could rely on their guidance as they already measured.
>> https://www.gnu.org/software/libc/manual/html_node/Truncating-Strings.html
>
> Indeed.
> I've added important notices in BUGS about it, and recommended
> against

Saw glibc has (11) functions listed as a poor choice for string processing.

>
>>
>> Maybe the strlcpy API is easier, safer for programmers; but the
>> compiler can't figure out that the programmer already knew src string
>> length. So the strlcpy does a strlen() and wastes time reading over
>> memory. If the src length is known, can just memcpy.
>
> I've written strtcpy(3) as an alternative to strlcpy(3) that doesn't
> suffer its problems. It should be even safer and easier to use, and its
> first order of growth is better. I'll send a patch for review in a
> moment.

I did take a look at strtcpy, but it calls strnlen(), reading over memory.

>
>> When I've benchmarked things, reducing the memory accesses for read,
>> write boosted performance, also looked at the cycles taken, of course
>> cache and alignment all play a part too.
>
> If one wants to micro-optimize for their use case, it's none of my
> business. I provide a function that should be safe and relatively fast
> for all use cases, which libc doesn't.
>
>> Maybe could suggest in your man page programmers should keep track of
>> the src size ? - to save the cost of the strlen().
>
> No. Optimizations are not my business. Writing good APIs should make
> these optimizations low value so that they aren't done, except for the
> most performance-critical programs.
>
> The problem comes when libc doesn't provide anything usable, and the
> user has no guidance on where to start. Then, programmers start being
> clever, usually too clever. That's why I think the man-pages should go
> ahead and write wrapper functions such as strtcpy() and stpecpy()
> around libc functions; these wrappers should provide a fast and safe
> starting point for most programs.
>
> It's true that memcpy(3) is the fastest function one can use, but it
> requires the programmer to be rather careful with the lengths of the
> strings.
> I don't think keeping track of all those little details is
> what the common programmer should do.

That's true, high-performance users probably create their own bespoke solutions. strtcpy probably takes the src size?

>
>>
>> At least the strlen functions are optimized:
>> glibc/strnlen.c calls memchr() searching for '\0'
>> memchr searches 4 bytes at a time.
>> glibc/strlen.c searches 4 bytes at a time.
>>
>> glibc/strlcpy.c __strlcpy() is there a reason when truncating it overwrites the last byte, twice?
>>
>> memcpy (dest, src, size);
>> dest[size - 1] = '\0';
>
> -1's in the source code make up for off-by-one bugs. APIs should be
> written so that common use doesn't involve manually writing -1 if
> possible.

What way do you feel they should be doing it?

>
> I acknowledge the performance benefits of this construction, and have
> used it myself in NGINX code, but I also find it very dangerous, which
> is why I recommend using a wrapper over it:
>
> char *
> ustr2stp(char *restrict dst, const char *restrict src, size_t len)
> {
>     char *p;
>
>     p = mempcpy(dst, src, len);
>     *p = '\0';
>
>     return p;
> }
>
> Cheers,
> Alex
>
>>
>> Kind regards, Jonny
>

Kind regards, Jonny
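
For reference, here is a rough sketch of the two approaches discussed in this thread: a truncating copy built on strnlen(3) plus memcpy(3), which reads at most the destination size from src, and a copy where the caller already tracks the source length and calls memcpy(3) directly. The names copy_trunc() and copy_known(), and the convention of returning -1 on truncation, are made up for illustration; this is not strtcpy(3) or any libc function, just one way the idea could look.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for strnlen(3) and ssize_t */
#endif
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper (name and return convention are assumptions).
 * Source length is unknown: strnlen(3) reads at most dsize bytes of
 * src, so the cost is bounded by the destination size rather than by
 * strlen(src).  Returns the copied length, or -1 on truncation or
 * when dsize is 0.
 */
ssize_t
copy_trunc(char *restrict dst, size_t dsize, const char *restrict src)
{
    size_t  len;
    int     truncated;

    if (dsize == 0)
        return -1;

    len = strnlen(src, dsize);
    truncated = (len == dsize);
    if (truncated)
        len = dsize - 1;  /* leave room for the terminating NUL */

    memcpy(dst, src, len);
    dst[len] = '\0';

    return truncated ? -1 : (ssize_t) len;
}

/*
 * Hypothetical helper: the caller already tracks slen, so no scan of
 * src is needed at all.  Refuses to copy (returns -1, dst untouched)
 * if the string plus its NUL would not fit.
 */
ssize_t
copy_known(char *restrict dst, size_t dsize,
           const char *restrict src, size_t slen)
{
    if (slen >= dsize)
        return -1;

    memcpy(dst, src, slen);
    dst[slen] = '\0';

    return (ssize_t) slen;
}
```

With an 8-byte buffer, copy_trunc() on "hello, world" stores "hello, " and reports truncation, while copy_known() never reads src to find its end. The second form is faster, but it only stays correct as long as the caller's slen really matches the string, which is the carefulness concern raised above.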