Re: [PATCH v3, 04/16] xfsprogs: metadump: adjust rather than start over when invalid byte found

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 24 Feb 2011 12:37:15 +1100



On Fri, Feb 18, 2011 at 03:21:01PM -0600, Alex Elder wrote:
> The last 5 bytes of a name generated by generate_obfuscated_name()
> can be selected such that they (along with all of the preceding
> characters in the name) produce any desired value when hashed.
> 
> They are selected based on how their value affects the outcome of
> the hash calculation for the obfuscated name.  Each byte is XOR'd
> into the hash at a certain position.  The portion of the hash
> affected by each of these last five bytes can be seen visually below
> (where "last-3" means the byte 3 positions before the last byte in
> the name):
> 
> 	+-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-+
> hash:	| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
> 	+-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-+
>        last-4 ->|           |<-- last-2 --->|           |<--- last ---->|
>               |<-- last-3 --->|           |<-- last-1 --->|     |<- last-4
> 
> (Note that byte (last-4) wraps around.  The previous patch in this
> series eliminated the effect of the upper 4 bits of that byte by
> forcing them to be all be 0 bits.)
> 
> Using the (XOR) difference between the hash computed for the
> beginning of the obfuscated name and the hash from the original
> name, we directly determine the required final five bytes to make
> the hashes for the two complete names match.  The lower byte (bits
> 0-7) of that difference is used for the last character in the
> obfuscated name, bits 7-14 for the second-to-last, and so on.
> 
> So we start with the difference between the hash from the complete
> original name and the hash (so far) for a random string constituting
> the first part of the obfuscated name.  We extract five sets of 8
> bits from the result at the positions indicated above, and those
> 8-bit values will become the final five bytes of the obfuscated
> name.  By assuming (or forcing) the top bit of each of these
> extracted values to be 0 (by masking off the top bit), we can ignore
> the overlapping portions when determining the bytes to use.
> 
> 
> It's possible for this process to produce characters ('\0' and '/')
> that are not allowed in valid names.  If this occurs, the existing
> code abandons the current obfuscated name and starts again from the
> beginning.  But there exist cases where this can lead to a
> never-ending loop.
> 
> Arkadiusz MiÅkiewicz encountered just such a name, "R\323\257NE".
> That name produces hash value 0x3a4be740, which requires that the
> obfuscated name uses '/' at position last-2.  The current algorithm
> starts over, but since there are no random characters in this
> length-5 obfuscated name, no other possibility will be found, and
> the process repeats forever.
> 
> 
> This change modifies the algorithm used so that if a unallowed
> character arises, we flip a bit in that character, along with
> another "matching" bit in another (overlapping) character such that
> the resulting hash is unchanged.  The two unallowed characters in a
> name are '\0' (0x00) and '/' (0x2f), and flipping any one bit in
> either of those characters results in an allowed character.
> 
> 
> So, starting with the first of these last 5 bytes (last-4), if its
> "normal" value is one of the unallowed characters, we flip its low
> bit and arrange to flip the high bit of its successor byte.  The
> remaining bytes are treated similarly.
> 
> The very last byte has a little different treatment.  We can flip
> its low bit, but it has no successor byte per se.  Its effect on
> the hash does, however overlap the upper four bits from byte
> (last-4).  We can therefore flip the corresponding bit in that (at
> position 0x10).
> 
> 
> There is one more case to consider.  It's possible in that last
> case that by flipping a bit in byte (last-4), we have converted
> that byte to one that's not allowed.  It turns out this won't ever
> happen, because we know that byte was initially assigned a value
> with its upper four bits clear.  Flipping the bit at 0x10 cannot
> therefore produce either 0x00 or 0x2f, so we don't need to treat
> this case.
> 
> 
> With these changes to the name generation algorithm, we avoid
> any of the cases in which no alternate name can be found without
> using an illegal character.  We also avoid all looping due to bad
> characters.
> 
> Reported-by: Arkadiusz MiÅkiewicz <arekm@xxxxxxxx>
> Signed-off-by: Alex Elder <aelder@xxxxxxx>

Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>

-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs