The last 5 bytes of a filename generated by generate_obfuscated_name() can be selected such that they (along with all of the preceding characters in the filename) produce any desired value when hashed. The effect of these bytes on the hash can be seen visually below: +-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-+ hash: | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | +-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-|-+-+-+-+-+-+-+-+ last-4 ->| |<-- last-2 --->| |<--- last ---->| |<-- last-3 --->| |<-- last-1 --->| |<- last-4 Note byte (last - 4) wraps around. A previous patch in this series eliminated the effect of the upper 4 bits of that byte, effectively forcing them to be all be 0 bits. The final bytes of the generated name are taken directly from portions of a computed hash value. It's possible for those portions to result in characters ('\0' and '/') that are not allowed in valid file names. If this occurs, the existing code abandons the current file name and starts again from the beginning. But there exist cases where this can lead to a never-ending loop. This change modifies the algorithm used so that if a non-allowed character arises, we flip a bit in that character (which results in an allowed character), along with another "matching" bit in another byte such that the resulting hash is unchanged. The two un-allowed characters in a pathname component are '\0' and '/'. Flipping any one bit in either one of those characters converts it to an allowed character. (Flipping two distinct bits also results in an allowed character.) As seen in the diagram above, the effect of each character on the hash overlaps two other characters. We can choose to flip one of the overlapping bits in a "bad" character, then flip the overlapping bit in the next byte, and the result will produce the same hash. So, starting with the first of these last 5 bytes (last-4), if its "normal" value is one of the invalid ones, we flip its low bit and arrange to flip the high bit of its successor byte. The very last byte has a little different treatment. If we flip its low bit, it has no successor, but we can flip the corresponding bit in the first of the 5-byte series (at position 0x10). There is one more case to consider. It's possible in that last case that by flipping a bit in byte (last-4), we have converted that byte to one that's not allowed. Fortunately, we have four bits of overlap here, so we can choose to flip a second bit in both the last byte and (last-4). Since flipping two bits also results in a good character, this resolves the issue. With these changes to the filename generation algorithm, we avoid any of the cases in which no alternate name can be found without using an illegal character. We also avoid all looping, resulting in a fixed/deterministic time to generate the name. Signed-off-by: Alex Elder <aelder@xxxxxxx> --- db/metadump.c | 69 ++++++++++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 53 insertions(+), 16 deletions(-) Index: b/db/metadump.c =================================================================== --- a/db/metadump.c +++ b/db/metadump.c @@ -445,6 +445,8 @@ generate_obfuscated_name( do { dup = 0; for (;;) { + uchar_t high_bit; + /* * The beginning of the obfuscated name can * be pretty much anything, so fill it in @@ -460,25 +462,60 @@ generate_obfuscated_name( * Compute which five bytes need to be used * at the end of the name so the hash of the * obfuscated is the same as the hash of the - * original. + * original. If any result in an invalid + * character, flip a bit and arrange for a + * corresponding bit in a neighboring byte + * to be flipped as well. For the last + * byte, the "neighbor" to change is the + * first byte we're computing here. */ newhash = rol32(newhash, 3) ^ hash; - newp[namelen - 5] = (newhash >> 28) & 0x7f; - if (is_invalid_char(newp[namelen - 5])) - continue; - newp[namelen - 4] = (newhash >> 21) & 0x7f; - if (is_invalid_char(newp[namelen - 4])) - continue; - newp[namelen - 3] = (newhash >> 14) & 0x7f; - if (is_invalid_char(newp[namelen - 3])) - continue; - newp[namelen - 2] = (newhash >> 7) & 0x7f; - if (is_invalid_char(newp[namelen - 2])) - continue; - newp[namelen - 1] = (newhash >> 0) & 0x7f; - if (is_invalid_char(newp[namelen - 1])) - continue; + high_bit = 0; + + newp[namelen - 5] = (newhash >> 28) & 0x7f ^ high_bit; + if (is_invalid_char(newp[namelen - 5])) { + newp[namelen - 5] ^= 1; + high_bit = 0x80; + } else + high_bit = 0; + + newp[namelen - 4] = (newhash >> 21) & 0x7f ^ high_bit; + if (is_invalid_char(newp[namelen - 4])) { + newp[namelen - 4] ^= 1; + high_bit = 0x80; + } else + high_bit = 0; + + newp[namelen - 3] = (newhash >> 14) & 0x7f ^ high_bit; + if (is_invalid_char(newp[namelen - 3])) { + newp[namelen - 3] ^= 1; + high_bit = 0x80; + } else + high_bit = 0; + + newp[namelen - 2] = (newhash >> 7) & 0x7f ^ high_bit; + if (is_invalid_char(newp[namelen - 2])) { + newp[namelen - 2] ^= 1; + high_bit = 0x80; + } else + high_bit = 0; + + newp[namelen - 1] = (newhash >> 0) & 0x7f ^ high_bit; + if (is_invalid_char(newp[namelen - 1])) { + newp[namelen - 1] ^= 1; + high_bit = 0x80; + } else + high_bit = 0; + + if (high_bit) { + newp[namelen - 5] ^= 0x10; + if (is_invalid_char(newp[namelen - 5])) { + newp[namelen - 1] ^= 2; + newp[namelen - 5] ^= 0x20; + ASSERT(!is_invalid_char(newp[namelen - 1])); + } + } break; } _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs