Re: [PATCH v1 1/2] vsprintf: introduce %dE for error constants

Uwe Kleine-König <uwe@xxxxxxxxxxxxxxxxx> · Sun, 25 Aug 2019 11:14:42 +0200

Hello Andrew,

On Sat, Aug 24, 2019 at 04:58:29PM -0700, Andrew Morton wrote:
> (cc printk maintainers).

Ah, I wasn't aware there is something like them. Thanks

> On Sun, 25 Aug 2019 01:37:23 +0200 Uwe Kleine-König <uwe@xxxxxxxxxxxxxxxxx> wrote:
> 
> > 	pr_info("probing failed (%dE)\n", ret);
> > 
> > expands to
> > 
> > 	probing failed (EIO)
> > 
> > if ret holds -EIO (or EIO). This introduces an array of error codes. If
> > the error code is missing, %dE falls back to %d and so prints the plain
> > number.
> 
> Huh.  I'm surprised we don't already have this.  Seems that this will
> be applicable in a lot of places?  Although we shouldn't go blindly
> converting everything in sight - that would risk breaking userspace
> which parses kernel strings.

Uah, even the kernel log is API? But I agree so far that this is merge
window material and shouldn't make it into v5.3 :-)

> Is it really necessary to handle the positive errnos?  Does much kernel
> code actually do that (apart from kernel code which is buggy)?

I didn't check; probably not. But the whole positive range seems so
unused and interpreting EIO (and not only -EIO) as "EIO" seems straight
forward. But I don't feel strong either way.

> > Signed-off-by: Uwe Kleine-König <uwe@xxxxxxxxxxxxxxxxx>
> > ---
> > Hello
> > 
> > there are many code sites that benefit from this. Just grep for
> > "(%d)" ...
> 
> Yup.  This observation shouldn't be below the "^---$" ;) An approximate
> grep|wc would be interesting.

I didn't check how many false positives there are using "(%d)", but I'd
use an a bit more elaborate expression for the commit log:

	$ git grep '(%d)' | wc -l
	7336
	$ git grep -E '^\s*(printk|(kv?as|sn?|vs(c?n)?)printf|(kvm|dev|pr)_(emerg|alert|crit|err|warn(ing)?|notice|info|cont|devel|debug|dbg))\(.*(\(%d\)|: %d)\\n' | wc -l
	9140

The latter matches several printk-variants emitting a string ending in
one of

	'(%d)\n' (1839 matches)
	': %d\n' (7301 matches)

. I would expect that many of the 7336 matches for '(%d)' that are not
matched by the longer expression are false negatives because the
function name is in the previous line. So I would estimate around 10k
strings that could benefit from %dE.

> > --- a/lib/vsprintf.c
> > +++ b/lib/vsprintf.c
> > @@ -533,6 +533,192 @@ char *number(char *buf, char *end, unsigned long long num,
> >  	return buf;
> >  }
> >  
> > +#define ERRORCODE(x) { .str = #x, .err = x }
> > +
> > +static const struct {
> > +	const char *str;
> > +	int err;
> > +} errorcodes[] = {
> 
> It's a bit of a hack, but an array of char*'s and a separate array of
> ushorts would save a bit of space.

Hmm, true. Currently we have 150 entries taking 150 * (sizeof(void *) *
2). Is it worth to think about the hacky solution to go down to 150 *
(sizeof(void *) + 2)?

For an ARM build bloat-o-meter reports (comparing linus/master to linus/master
+ my patch):

	add/remove: 2/0 grow/shrink: 4/2 up/down: 1488/-8 (1480)
	Function                                     old     new   delta
	errorcodes                                     -    1200   +1200
	errstr                                         -     200    +200
	vsnprintf                                    884     960     +76
	set_precision                                148     152      +4
	resource_string                             1380    1384      +4
	flags_string                                 400     404      +4
	num_to_str                                   288     284      -4
	format_decode                               1024    1020      -4
	Total: Before=21686, After=23166, chg +6.82%

But that doesn't seem to include the size increase for all the added
strings which seems to be around another 1300 bytes.

> > +	ERRORCODE(EPERM),
> > +	ERRORCODE(ENOENT),
> > +	ERRORCODE(ESRCH),
> >
> > ...
> >
> > +static noinline_for_stack
> 
> Why this?  I'm suspecting this will actually increase stack use?

I don't know what it does, just copied it from number() which is used
similarly.

> > +char *errstr(char *buf, char *end, unsigned long long num,
> > +	     struct printf_spec spec)
> > +{
> > +	char *errname = NULL;

I missed a warning during my tests, there is a const missing in this
line.

> > +	size_t errnamelen, copy;
> > +	int i;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(errorcodes); ++i) {
> > +		if (num == errorcodes[i].err || num == -errorcodes[i].err) {
> > +			errname = errorcodes[i].str;
> > +			break;
> > +		}
> > +	}
> > +
> > +	if (!errname) {
> > +		/* fall back to ordinary number */
> > +		return number(buf, end, num, spec);
> > +	}
> > +
> > +	copy = errnamelen = strlen(errname);
> > +	if (copy > end - buf)
> > +		copy = end - buf;
> > +	buf = memcpy(buf, errname, copy);
> > +
> > +	return buf + errnamelen;
> > +}
> 
> OK, I guess `errstr' is an OK name for a static function

IMHO the name is very generic (which is bad), but it is in good company,
as there is also pointer() and number().

> and we can use this to add a new strerror() should the need arise.

In userspace the purpose of strerror is different. It would yield "I/O
error" not "EIO". So strerror using this array would only be a "strerror
light".

In my first prototype I even used %m instead of %dE, but as %m (in
glibc) doesn't consume an argument and produces the more verbose
variant, I changed my mind and went for %dE. (Also my patch has the
undocumented side effect that you can use %ldE if the error is held by a
long int. I didn't test that though.) 

Best regards
Uwe
Attachment:
signature.asc

Description: PGP signature