Re: [PATCH v3, 15/16] xfsprogs: metadump: use printable characters for obfuscated names

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 3 Mar 2011 16:06:14 +1100

On Fri, Feb 25, 2011 at 12:13:56PM -0600, Alex Elder wrote:
> On Thu, 2011-02-24 at 19:45 +1100, Dave Chinner wrote:
> > On Fri, Feb 18, 2011 at 03:21:02PM -0600, Alex Elder wrote:
> > > There is probably not much need for an extreme amount of randomness
> > > in the obfuscated names produced in metadumps.  Limit the character
> > > set used for (most of) these names to printable characters rather
> > > than every permittable byte.  The result makes metadumps a bit more
> > > natural to work with.
> > > 
> > > I chose the set of all upper- and lower-case letters, digits, and
> > > the dash and underscore for the alphabet.  It could easily be
> > > expanded to include others (or reduced for that matter).
> > > 
> > > This change also avoids ever having to retry after picking an
> > > unusable character.
> > > 
> > > Signed-off-by: Alex Elder <aelder@xxxxxxx>
> > > 
> > > No significant changes in this version from the last version posted.
> > > 
> > > ---
> > >  db/metadump.c |    9 ++++-----
> > >  1 file changed, 4 insertions(+), 5 deletions(-)
> > > 
> > > Index: b/db/metadump.c
> > > ===================================================================
> > > --- a/db/metadump.c
> > > +++ b/db/metadump.c
> > > @@ -412,12 +412,11 @@ nametable_add(xfs_dahash_t hash, int nam
> > >  static inline uchar_t
> > >  random_filename_char(void)
> > >  {
> > > -	uchar_t			c;
> > > +	static uchar_t filename_alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
> > > +						"abcdefghijklmnopqrstuvwxyz"
> > > +						"0123456789-_";
> > >  
> > > -	do {
> > > -		c = random() % 127 + 1;
> > > -	} while (c == '/');
> > > -	return c;
> > > +	return filename_alphabet[random() % (sizeof filename_alphabet - 1)];
> > >  }
> > 
> > Why not just:
> > 
> > 	do {
> > 		c = random() % 127 + 1;
> > 	} while (!isalnum(c));
> > 
> > 	return c;
> > 
> 
> Mainly because I wasn't sure what people would want as an acceptable
> alphabet to select from.  We could just use [a-z], for example, and
> this way that could easily be changed without changing how the
> function worked.  It's also locale-independent (which may or may not
> be good I suppose).

isalnum() allows locale-specific characters, so it allows a larger
number of potential characters than just the static table you
defined. That was the primary reason I suggested it - more random
characters to choose from means less probability of duplicates
occurring....

> Plus as an added bonus, it will never need to compute any
> unnecessary random numbers, thereby saving about 12 CPU
> cycles. :)

I doubt that is likely to be a problem. :)

> I don't really care much, but would lean toward leaving
> it the way I have it.  Do you feel strongly that I should
> change it?  Do you think [a-z] (islower()) would be even
> better?

No, the more random characters there are to choose from the better. I
guess that the table you've defined is plenty to choose from, so in
the absence of any hard numbers, I think your table-based approach
will be fine.
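
For anyone who does want rough numbers, here is a minimal sketch (not
part of the patch) that counts what each approach can produce. The
helper names are hypothetical, and it assumes the retry loop keeps the
"random() % 127 + 1" draw from the quoted code, so only values 1..127
are ever candidates.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* The alphabet from the patch: 26 + 26 + 10 + 2 characters. */
static const unsigned char filename_alphabet[] =
	"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	"abcdefghijklmnopqrstuvwxyz"
	"0123456789-_";

/* Hypothetical helper: number of characters the table can produce. */
static size_t
table_alphabet_size(void)
{
	return sizeof(filename_alphabet) - 1;	/* drop the trailing NUL */
}

/*
 * Hypothetical helper: number of values in 1..127 (the only values
 * "random() % 127 + 1" can yield) that isalnum() accepts in the
 * current locale.
 */
static size_t
isalnum_alphabet_size(void)
{
	size_t	n = 0;
	int	c;

	for (c = 1; c <= 127; c++)
		if (isalnum(c))
			n++;
	return n;
}

/* Hypothetical helper: mean random() calls per character for the retry loop. */
static double
retry_loop_expected_draws(void)
{
	size_t	accepted = isalnum_alphabet_size();

	assert(accepted > 0);
	return 127.0 / (double)accepted;
}
```

In the default "C" locale this gives 64 characters for the table and
62 for the isalnum() loop, which works out to roughly two random()
calls per character for the loop. Values above 127 are never drawn by
that loop, so any locale-specific characters up there would not
enlarge its set unless the modulus changed as well.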

Swings and round-abouts, deck chairs on the Titanic...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
