Re: [PATCH 4/4] reiserfs: rework reiserfs_snprintf to not abuse va_list (as much)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2018-04-09 20:16, jeffm@xxxxxxxx wrote:
> From: Jeff Mahoney <jeffm@xxxxxxxx>
> 
> reiserfs_snprintf operates by handling the reiserfs-specific specifiers
> itself and passing everything else to the generic vsnprintf routine.
> That makes some assumptions about what the state of a va_list passed
> by value to another function will be after the function returns.  The
> spec states that the contents of the va_list are undefined in that
> case.  It's worked so far but really only by the grace of gcc's internal
> va_list implementation details.
> 

No, it has _not_ worked so far, at least not on all architectures,
specifically those that always pass all varargs on the stack (e.g. 32
bit x86) and has va_list being a simple pointer. If I'm wrong about

	reiserfs_warning(NULL, "reiserfs-16100",
			 "STATDATA, index %d, type 0x%x, %h",
			 vi->vi_index, vi->vi_type, vi->vi_ih);

ending up interpreting vi->vi_index as a struct item_head* (instead of
using vi->vi_ih as intended), and thus crashing once that gets
derefenced inside sprintf_item_head, please tell me why. As far as I can
tell, prepare_error_buf (or its new incarnation) starts by passing

  "STATDATA, index %d, type 0x%x, "

to vs(n)printf along with the current args, that correctly picks out an
int and an unsigned from the list and formats that, but when va_list is
not effectively passed by reference, our copy of args doesn't change, so
when we then do

  va_arg(args, struct item_head *)

to handle the %h, we'll be fetching the first argument (vi->vi_index)
again, and dereference a completely bogus value.


> This patch reworks reiserfs_snprintf to process one % directive at
> a time.  Since directives can consume more than one argument when field
> width and precision * operators are used, that means we need to interpret
> the format string more than we used to do.  (The kernel vsnprintf
> specifically doesn't handle %n). Otherwise, we can use an unsigned long
> to hold the argument and let the generic snprintf do the type handling.

hmmm...

> The only exception is long long arguments on platforms where long long
> is larger than long, which need special handling.
> 
> The result is a va_list that is consistent within reiserfs_snprintf
> and isn't passed by value anymore.
> 
> This isn't an ideal solution, but I didn't feel that teaching pointer()
> about a %pV variant that didn't use va_copy would get much traction for
> a number of reasons.

One of those reasons being that it wouldn't solve your problem anyway:
Sure, we can implement %pVn that avoids the copy. So we'd have this in
vsprintf.c:

	case 'V':
		if (fmt[1] == 'n') {
			buf += vsnprintf(buf, end > buf ? end - buf : 0,
				((struct va_format *)ptr)->fmt,
				*((struct va_format *)ptr)->va);
		} else {
			va_list va;

			va_copy(va, *((struct va_format *)ptr)->va);
			buf += vsnprintf(buf, end > buf ? end - buf : 0,
					 ((struct va_format *)ptr)->fmt, va);
			va_end(va);
		}
		return buf;

See how the va_list is not actually passed by reference to the recursive
vsnprintf? The fact is, with va_list being a simple pointer type, there
is absolutely no way of knowing how much a function taking a va_list as
parameter has consumed.


> Reported-by: Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Jeff Mahoney <jeffm@xxxxxxxx>
> ---
>  fs/reiserfs/prints.c | 158 +++++++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 128 insertions(+), 30 deletions(-)
> 
> diff --git a/fs/reiserfs/prints.c b/fs/reiserfs/prints.c
> index aead27d11e45..03ebdea9ddc7 100644
> --- a/fs/reiserfs/prints.c
> +++ b/fs/reiserfs/prints.c
> @@ -6,6 +6,7 @@
>  #include <linux/fs.h>
>  #include "reiserfs.h"
>  #include <linux/string.h>
> +#include <linux/ctype.h>
>  #include <linux/buffer_head.h>
>  
>  #include <stdarg.h>
> @@ -195,19 +196,94 @@ static size_t snprintf_disk_child(char *buf, size_t size, struct disk_child *dc)
>  			dc_block_number(dc), dc_size(dc));
>  }
>  
> -static char *is_there_reiserfs_struct(char *fmt, int *what)
> -{
> -	char *k = fmt;
> -
> -	while ((k = strchr(k, '%')) != NULL) {
> -		if (k[1] == 'k' || k[1] == 'K' || k[1] == 'h' || k[1] == 't' ||
> -		    k[1] == 'z' || k[1] == 'b' || k[1] == 'y' || k[1] == 'a') {
> -			*what = k[1];
> +/*
> + * This should all just be %pV but pointer() does a va_copy and uses that
> + * instead of directly using args.  That's the right thing to do pretty
> + * much every time, but it still forces us to identify how many arguments
> + * we need to pass for a single format spec.  We need to be careful of u64
> + * on 32-bit since it'll need special handling.  We can just use an
> + * unsigned long for everything else and vsnprintf will handling the
> + * typing itself.
> + */
> +static int handle_generic_snprintf(char *p, size_t left,
> +				   const char *fmt, va_list *args)
> +{
> +	int width, precision, nargs = 1;
> +	const char *spec = fmt;
> +	unsigned long val;
> +	int len = 0;
> +
> +	/* Skip flags */
> +	while (1) {
> +		bool found = true;
> +		++spec;
> +		switch (*spec) {
> +		case '-':
> +		case '+':
> +		case ' ':
> +		case '#':
> +		case '0':
>  			break;
> +		default:
> +			found = false;
>  		}
> -		k++;
> +		if (!found)
> +			break;
> +	}
> +
> +	/* Field width */
> +	if (isdigit(*spec)) {
> +		while (isdigit(*++spec));
> +	} else if (*spec == '*') {
> +		nargs++;
> +		spec++;
> +	}
> +
> +	/* Precision */
> +	if (*spec == '.') {
> +		spec++;
> +		if (isdigit(*spec)) {
> +			while (isdigit(*++spec));
> +		} else if (*spec == '*') {
> +			nargs++;
> +			spec++;
> +		}
> +	}
> +
> +	if (*spec == 'L' || !strncmp(spec, "ll", 2)) {
> +		unsigned long long ullval;
> +		if (nargs == 3) {
> +			width = va_arg(*args, int);
> +			precision = va_arg(*args, int);
> +			ullval = va_arg(*args, unsigned long long);
> +			len = snprintf(p, left, fmt, width, precision,
> +				       ullval);
> +		} else if (nargs == 2) {
> +			width = va_arg(*args, int);
> +			ullval = va_arg(*args, unsigned long long);
> +			len = snprintf(p, left, fmt, width, ullval);
> +		} else {
> +			ullval = va_arg(*args, unsigned long long);
> +			len = snprintf(p, left, fmt, ullval);
> +		}
> +		return len;
> +	}
> +
> +	if (nargs == 3) {
> +		width = va_arg(*args, int);
> +		precision = va_arg(*args, int);
> +		val = va_arg(*args, unsigned long);
> +		len = snprintf(p, left, fmt, width, precision, val);
> +	} else if (nargs == 2) {
> +		width = va_arg(*args, int);
> +		val = va_arg(*args, unsigned long);
> +		len = snprintf(p, left, fmt, width, val);
> +	} else {
> +		val = va_arg(*args, unsigned long);
> +		len = snprintf(p, left, fmt, val);
>  	}


Urgh, how certain are you that this actually works on all architectures?
In particular, on a 64 bit architecture, we could easily fetch too much,
right? (and thus ruin all following fetches)? Not to mention that the
ABI might have different rules for passing pointers versus integers, or
sign-extension or 32-bit values into 64 bit slots, or endianness issues,
or... If all that works out fine, at least this deserves a comment. And
some documentation references...

Aside from that, duplicating the format string parsing is plain ugly and
error-prone.

===

Here's an idea that does require rewriting all the existing
reiserfs_warning etc. calls, but avoids all the fragile vararg handling
and adds proper compile-time format string checking:

Have reiserfs_warning (and friends) be a macro that expands to

  reiserfs_printf_start();
  __reiserfs_warning(the arguments);
  reiserfs_printf_stop();

where printf_start would take a spinlock and reset some bookkeeping
variables. Then all the sprintf_de_head etc. helpers would have a
prototype like

const char *sprintf_de_head(struct reiserfs_de_head *deh)

and they would print their output to the current error_buf, updating the
above-mentioned bookkeeping variable(s) and return a pointer to the
start of the area they printed to. So

  reiserfs_warning("bla %a", deh)

would get changed to

  reiserfs_warning("bla %s", sprintf_de_head(deh))

Basically, the current error_buf would only be used for holding the
result of the custom specifiers, everything else goes directly to
printk's output buffer, so no need to make error_buf larger (a single
printk is limited to 1024 bytes anyway...), and fmt_buf goes away
completely. That reduction in static footprint should make up for at
least some of the bloat caused by the extra code the new
reiserfs_warning would generate. Also, I don't think this should make us
hold the spinlock (much) longer than is currently the case, and
trivially solves the problem of not actually holding the spinlock when
passing the error_buf to printk.

Hm?

Rasmus
--
To unsubscribe from this list: send the line "unsubscribe reiserfs-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux File System Development]     [Linux BTRFS]     [Linux NFS]     [Linux Filesystems]     [Ext4 Filesystem]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Resources]

  Powered by Linux