Re: [PATCH 4/4] reiserfs: rework reiserfs_snprintf to not abuse va_list (as much)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/10/18 11:34 AM, Jan Kara wrote:
> On Tue 10-04-18 10:44:43, Jeff Mahoney wrote:
>> On 4/10/18 10:25 AM, Jan Kara wrote:
>>> On Mon 09-04-18 14:16:02, jeffm@xxxxxxxx wrote:
>>>> From: Jeff Mahoney <jeffm@xxxxxxxx>
>>>>
>>>> reiserfs_snprintf operates by handling the reiserfs-specific specifiers
>>>> itself and passing everything else to the generic vsnprintf routine.
>>>> That makes some assumptions about what the state of a va_list passed
>>>> by value to another function will be after the function returns.  The
>>>> spec states that the contents of the va_list are undefined in that
>>>> case.  It's worked so far but really only by the grace of gcc's internal
>>>> va_list implementation details.
>>>>
>>>> This patch reworks reiserfs_snprintf to process one % directive at
>>>> a time.  Since directives can consume more than one argument when field
>>>> width and precision * operators are used, that means we need to interpret
>>>> the format string more than we used to do.  (The kernel vsnprintf
>>>> specifically doesn't handle %n). Otherwise, we can use an unsigned long
>>>> to hold the argument and let the generic snprintf do the type handling.
>>>> The only exception is long long arguments on platforms where long long
>>>> is larger than long, which need special handling.
>>>
>>> So I wonder why is this correct for architectures where sizeof(int) !=
>>> sizeof(long). If you do va_arg(va, long) for an argument which is actually
>>> int, you'll consume more bytes from the stack than you should. Or am I
>>> missing some va_arg() subtlety?
>>
>> It's possible I'm incorrect here.  I'm no expert on how variadics are
>> constructed across architectures.  My underlying assumption is that even
>> if the arguments are passed on the stack, they're not the same as stack
>> variables.  There's no ABI between caller and callee, so I expect that
>> the stack slot would be the size of a register, which is long.  So the
>> int returned by va_arg would only take up e.g. 4 bytes on the stack but
>> the space va_arg uses would take e.g. 8 bytes.  Yes, it's fragile.
> 
> Well, another equally sensible implementation could be that the compiler
> just arranges for storing all arguments on stack using their native size by
> the caller (who knows all the types) and va_start + va_arg know how to
> consume these arguments from the stack. And with such implementation you'd
> need to make sure you pass a proper type to va_arg()...

I suppose.  I'd like to avoid doing that entirely.  I'll give the
non-va_copy pointer() implementation a try.

>>>> The result is a va_list that is consistent within reiserfs_snprintf
>>>> and isn't passed by value anymore.
>>>>
>>>> This isn't an ideal solution, but I didn't feel that teaching pointer()
>>>> about a %pV variant that didn't use va_copy would get much traction for
>>>> a number of reasons.
>>>
>>> Frankly, I don't like the format string parsing duplication in your patch
>>> and I'm afraid it will be a maintenance issue. Can't we have a variant
>>> of vsnprintf() that would take a pointer to va_list (like you use in
>>> your patch for some functions) so that its contents is well defined after
>>> the function returns?
>>
>> That would certainly solve the problem, but I didn't go that route
>> because the interface would end up being pretty fragile.  For any case
>> other than consuming a single (potentially compound) format unit, it'd
>> be pretty easy to get out of sync.
> 
> How come? You'd pass the format string containing only non-reiserfs
> specifiers and va_list * to new_vsnprintf(). It would consume as many args
> as it should. Then you process next reiserfs specifier consuming some arg,
> etc. I don't see what can get out of sync there...

I meant more as a general API.  I have code for reiserfs ready to go to
do this, though.

-Jeff

>> What I'd really like to see is an extended pointer() that allows the
>> caller to provide a callback or similar.  ReiserFS isn't alone in
>> wanting to consistently print its internal structures.  That would also
>> eliminate the racy format and error buffer in reiserfs as well.  I've
>> been operating under the impression that this route would be a big
>> uphill battle.
> 
> Well, yes, that would be a nicer solution but I don't expect that to go
> easily either (if nothing else due to bikeshedding).
> 
> 								Honza
> 
>>>> Reported-by: Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx>
>>>> Signed-off-by: Jeff Mahoney <jeffm@xxxxxxxx>
>>>> ---
>>>>  fs/reiserfs/prints.c | 158 +++++++++++++++++++++++++++++++++++++++++----------
>>>>  1 file changed, 128 insertions(+), 30 deletions(-)
>>>>
>>>> diff --git a/fs/reiserfs/prints.c b/fs/reiserfs/prints.c
>>>> index aead27d11e45..03ebdea9ddc7 100644
>>>> --- a/fs/reiserfs/prints.c
>>>> +++ b/fs/reiserfs/prints.c
>>>> @@ -6,6 +6,7 @@
>>>>  #include <linux/fs.h>
>>>>  #include "reiserfs.h"
>>>>  #include <linux/string.h>
>>>> +#include <linux/ctype.h>
>>>>  #include <linux/buffer_head.h>
>>>>  
>>>>  #include <stdarg.h>
>>>> @@ -195,19 +196,94 @@ static size_t snprintf_disk_child(char *buf, size_t size, struct disk_child *dc)
>>>>  			dc_block_number(dc), dc_size(dc));
>>>>  }
>>>>  
>>>> -static char *is_there_reiserfs_struct(char *fmt, int *what)
>>>> -{
>>>> -	char *k = fmt;
>>>> -
>>>> -	while ((k = strchr(k, '%')) != NULL) {
>>>> -		if (k[1] == 'k' || k[1] == 'K' || k[1] == 'h' || k[1] == 't' ||
>>>> -		    k[1] == 'z' || k[1] == 'b' || k[1] == 'y' || k[1] == 'a') {
>>>> -			*what = k[1];
>>>> +/*
>>>> + * This should all just be %pV but pointer() does a va_copy and uses that
>>>> + * instead of directly using args.  That's the right thing to do pretty
>>>> + * much every time, but it still forces us to identify how many arguments
>>>> + * we need to pass for a single format spec.  We need to be careful of u64
>>>> + * on 32-bit since it'll need special handling.  We can just use an
>>>> + * unsigned long for everything else and vsnprintf will handling the
>>>> + * typing itself.
>>>> + */
>>>> +static int handle_generic_snprintf(char *p, size_t left,
>>>> +				   const char *fmt, va_list *args)
>>>> +{
>>>> +	int width, precision, nargs = 1;
>>>> +	const char *spec = fmt;
>>>> +	unsigned long val;
>>>> +	int len = 0;
>>>> +
>>>> +	/* Skip flags */
>>>> +	while (1) {
>>>> +		bool found = true;
>>>> +		++spec;
>>>> +		switch (*spec) {
>>>> +		case '-':
>>>> +		case '+':
>>>> +		case ' ':
>>>> +		case '#':
>>>> +		case '0':
>>>>  			break;
>>>> +		default:
>>>> +			found = false;
>>>>  		}
>>>> -		k++;
>>>> +		if (!found)
>>>> +			break;
>>>> +	}
>>>> +
>>>> +	/* Field width */
>>>> +	if (isdigit(*spec)) {
>>>> +		while (isdigit(*++spec));
>>>> +	} else if (*spec == '*') {
>>>> +		nargs++;
>>>> +		spec++;
>>>> +	}
>>>> +
>>>> +	/* Precision */
>>>> +	if (*spec == '.') {
>>>> +		spec++;
>>>> +		if (isdigit(*spec)) {
>>>> +			while (isdigit(*++spec));
>>>> +		} else if (*spec == '*') {
>>>> +			nargs++;
>>>> +			spec++;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	if (*spec == 'L' || !strncmp(spec, "ll", 2)) {
>>>> +		unsigned long long ullval;
>>>> +		if (nargs == 3) {
>>>> +			width = va_arg(*args, int);
>>>> +			precision = va_arg(*args, int);
>>>> +			ullval = va_arg(*args, unsigned long long);
>>>> +			len = snprintf(p, left, fmt, width, precision,
>>>> +				       ullval);
>>>> +		} else if (nargs == 2) {
>>>> +			width = va_arg(*args, int);
>>>> +			ullval = va_arg(*args, unsigned long long);
>>>> +			len = snprintf(p, left, fmt, width, ullval);
>>>> +		} else {
>>>> +			ullval = va_arg(*args, unsigned long long);
>>>> +			len = snprintf(p, left, fmt, ullval);
>>>> +		}
>>>> +		return len;
>>>> +	}
>>>> +
>>>> +	if (nargs == 3) {
>>>> +		width = va_arg(*args, int);
>>>> +		precision = va_arg(*args, int);
>>>> +		val = va_arg(*args, unsigned long);
>>>> +		len = snprintf(p, left, fmt, width, precision, val);
>>>> +	} else if (nargs == 2) {
>>>> +		width = va_arg(*args, int);
>>>> +		val = va_arg(*args, unsigned long);
>>>> +		len = snprintf(p, left, fmt, width, val);
>>>> +	} else {
>>>> +		val = va_arg(*args, unsigned long);
>>>> +		len = snprintf(p, left, fmt, val);
>>>>  	}
>>>> -	return k;
>>>> +
>>>> +	return len;
>>>>  }
>>>>  
>>>>  /*
>>>> @@ -237,23 +313,34 @@ static char fmt_buf[1024];
>>>>  static int reiserfs_snprintf(char *p, size_t left, const char *fmt,
>>>>  			     va_list *args)
>>>>  {
>>>> +	size_t fmtlen = strlen(fmt);
>>>> +	const char *fmtend = fmt + fmtlen;
>>>> +	const char *end;
>>>>  	char *start = p;
>>>> -	char *fmt1 = fmt_buf;
>>>> -	char *k;
>>>> -	int what;
>>>> -	size_t len;
>>>>  
>>>> -	spin_lock(&fmt_lock);
>>>> -	strncpy(fmt1, fmt, sizeof(fmt_buf));
>>>> +	while (left && fmt != NULL && *fmt)  {
>>>> +		bool reiserfs_spec = true;
>>>> +		int fmt_size;
>>>> +		size_t len;
>>>>  
>>>> -	while ((k = is_there_reiserfs_struct(fmt1, &what)) != NULL) {
>>>> -		*k = 0;
>>>> +		end = strchr(fmt + 1, '%');
>>>>  
>>>> -		len = vsnprintf(p, left, fmt1, *args);
>>>> -		p += len;
>>>> -		left -= len;
>>>> +		/* Skip over %% */
>>>> +		while (end && !strncmp(end, "%%", 2))
>>>> +			end = strchr(end + 2, '%');
>>>> +
>>>> +		if (!end)
>>>> +			end = fmtend;
>>>>  
>>>> -		switch (what) {
>>>> +		fmt_size = end - fmt;
>>>> +
>>>> +		/* Any text before the first % - could be entire string */
>>>> +		if (fmt[0] != '%') {
>>>> +			len = snprintf(p, left, "%.*s", fmt_size, fmt);
>>>> +			goto next;
>>>> +		}
>>>> +
>>>> +		switch (fmt[1]) {
>>>>  		case 'k':
>>>>  			len = snprintf_le_key(p, left,
>>>>  				     va_arg(*args, struct reiserfs_key *));
>>>> @@ -286,19 +373,30 @@ static int reiserfs_snprintf(char *p, size_t left, const char *fmt,
>>>>  			len = snprintf_de_head(p, left,
>>>>  				    va_arg(*args, struct reiserfs_de_head *));
>>>>  			break;
>>>> -		}
>>>> +		default:
>>>> +			spin_lock(&fmt_lock);
>>>> +			strncpy(fmt_buf, fmt, fmt_size);
>>>> +			fmt_buf[fmt_size] = 0;
>>>>  
>>>> +			len = handle_generic_snprintf(p, left, fmt_buf, args);
>>>> +			spin_unlock(&fmt_lock);
>>>> +
>>>> +			reiserfs_spec = false;
>>>> +			break;
>>>> +		};
>>>> +
>>>> +		if (reiserfs_spec) {
>>>> +			p += len;
>>>> +			left -= len;
>>>> +
>>>> +			len = snprintf(p, left, "%.*s", fmt_size - 2, fmt + 2);
>>>> +		}
>>>> +next:
>>>>  		p += len;
>>>>  		left -= len;
>>>> -		fmt1 = k + 2;
>>>> +		fmt = end;
>>>>  	}
>>>>  
>>>> -	len = vsnprintf(p, left, fmt1, *args);
>>>> -	p += len;
>>>> -	left -= len;
>>>> -
>>>> -	spin_unlock(&fmt_lock);
>>>> -
>>>>  	return p - start;
>>>>  }
>>>>  
>>>> -- 
>>>> 2.12.3
>>>>
>>
>>
>> -- 
>> Jeff Mahoney
>> SUSE Labs


-- 
Jeff Mahoney
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe reiserfs-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux File System Development]     [Linux BTRFS]     [Linux NFS]     [Linux Filesystems]     [Ext4 Filesystem]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Resources]

  Powered by Linux