Hi all, Last week, I was fiddling around with the metadump name obfuscation code while writing a debugger command to generate directories full of names that all have the same hash name. I had a few questions about how well all that worked with ascii-ci mode, and discovered a nasty discrepancy between the kernel and glibc's implementations of the tolower() function. I discovered that I could create a directory that is large enough to require separate leaf index blocks. The hashes stored in the dabtree use the ascii-ci specific hash function, which uses a library function to convert the name to lowercase before hashing. If the kernel and C library's versions of tolower do not behave exactly identically, xfs_ascii_ci_hashname will not produce the same results for the same inputs. xfs_repair will deem the leaf information corrupt and rebuild the directory. After that, lookups in the kernel will fail because the hash index doesn't work. The kernel's tolower function will convert extended ascii uppercase letters (e.g. A-with-umlaut) to extended ascii lowercase letters (e.g. a-with-umlaut), whereas glibc's will only do that if you force LANG to ascii. Tiny embedded libc implementations just plain won't do it at all, and the result is a mess. Stabilize the behavior of the hash function by encoding the kernel's tolower function in libxfs, add it to the selftest, and fix xfs_scrub not handling this correctly. If you're going to start using this mess, you probably ought to just pull from my git trees, which are linked below. This is an extraordinary way to destroy everything. Enjoy! Comments and questions are, as always, welcome. --D kernel git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fix-asciici-tolower-6.3 --- fs/xfs/libxfs/xfs_dir2.c | 4 - fs/xfs/libxfs/xfs_dir2.h | 20 ++++ fs/xfs/scrub/dir.c | 7 +- fs/xfs/xfs_dahash_test.c | 211 ++++++++++++++++++++++++---------------------- 4 files changed, 139 insertions(+), 103 deletions(-)