On Fri, Apr 05, 2024 at 03:10:33PM +0200, René Scharfe wrote: > >>> memcpy(msg, prefix, prefix_len); > >>> p = msg + prefix_len; > >>> - if (vsnprintf(p, pend - p, err, params) < 0) > >>> + if (vsnprintf(p, pend - p, err, params) < 0) { > >>> + if (snprintf(p, pend - p, "could not format error: %s", err) < 0) > >>> *p = '\0'; /* vsnprintf() failed, clip at prefix */ > >>> + } > >>> > >>> for (; p != pend - 1 && *p; p++) { > >>> if (iscntrl(*p) && *p != '\t' && *p != '\n') > >>> > >>> Though most of the rest of the function is not that useful (it is mostly > >>> removing unreadable chars, and hopefully we do not have any of those in > >>> our format strings!). > > Hmm, this might be worth doing to avoid messing up the terminal if > 'err' points into the weeds for some reason. I think we have bigger problems if "err" is messed up, because we'll be interpreting garbage as a format string and almost certainly triggering undefined behavior. And in general if we are not using a constant string as the format it's a potential security vulnerability. So I think we can mostly rely on the compile-time -Wformat checks for this. > OK, right, a format error doesn't have to be fatal and we can keep > running and possibly provide more details. > > But mixing the format error with the actual payload message is not ideal. > At least we should give the format error its proper prefix, while still > reporting the prefix of the original message, e.g. like this: > > error: unable to format message: unable to open loose object %s > fatal: > > ... or this: > > error: unable to format message: fatal: unable to open loose object %s > > I tend to like the first one slightly better, even though the empty > message looks silly. It doesn't mix the two and answers the questions > that I would have: Why did the program stop? Due to a fatal error. > Why is the fatal message silent? The preceding error explains it. > > While the second one only reveals the fatality somewhere in the middle > of the text, weakening the meaning of prefixes. Yeah, I think your first one is good. It's _ugly_, but it's easy to figure out what happened, and this is not something people generally see (and the status quo is just "fatal:", so you're easily 50% less ugly). > I still don't know why the error code varies between runs, but it > clearly does not come from vsnprintf(3) -- if I set errno to some > arbitrary value before the call, it is kept. Which is enough to > convince me to ignore errno here. Agreed. Thanks for digging into it. -Peff