Hi all,

Ted told me about some bugs that the ext4 Unicode casefolding code has
suffered over the past year -- they tried stripping out zero width
joiner (ZWJ) codepoints to eliminate casefolded lookup comparison
issues, but doing so corrupts compound emoji handling in filenames.

XFS of course persists names with byte accuracy (aka it doesn't do
casefolding or normalization) so it's not affected by those problems.
However, xfs_scrub has the ability to warn about confusing names and
other UTF-8 shenanigans, so I decided to expand fstests.

I wired up Ted's confusing names into generic/453 in fstests, and
xfs_scrub promptly crashed when trying to warn about filenames that
consist entirely of compound emoji (e.g. heart + zwj + bandaid render
as a heart with a bandaid over it).

So there's a patch to fix that buffer overflow. There's a second patch
to avoid complaining about ZWJ unless it results in confusing names in
the same namespace. The third patch fixes a minor reporting problem
when parent pointers are enabled.

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

This has been running on the djcloud for months with no problems.
Enjoy! Comments and questions are, as always, welcome.

--D

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-emoji-fixes
---
Commits in this patchset:
 * xfs_scrub: fix buffer overflow in string_escape
 * xfs_scrub: don't warn about zero width joiner control characters
 * xfs_scrub: use the display mountpoint for reporting file corruptions
---
 scrub/common.c   |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 scrub/unicrash.c |   10 ++++++++--
 2 files changed, 58 insertions(+), 4 deletions(-)
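
For readers unfamiliar with the class of bug behind the first patch, here
is a minimal sketch of worst-case buffer sizing in an escaping routine. It
is NOT the xfs_scrub code and does not claim to show the actual cause of
the crash: escape_name() is a hypothetical helper, and the \xNN output
format is an assumption made for illustration. The point is only that a
name made up entirely of non-ASCII bytes, such as a compound emoji, hits
the maximum expansion for every input byte.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not the xfs_scrub implementation: escape every
 * byte outside printable ASCII (and the backslash itself) as \xNN.
 * Each input byte can expand to four output bytes, so the buffer is
 * sized for that worst case plus the NUL terminator.
 */
static char *escape_name(const char *in)
{
	size_t len = strlen(in);
	char *out = malloc(len * 4 + 1);
	char *p = out;

	if (!out)
		return NULL;

	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)in[i];

		if (c >= 0x20 && c < 0x7f && c != '\\')
			*p++ = (char)c;		/* copied through unchanged */
		else
			p += sprintf(p, "\\x%02x", c);	/* 4 bytes + NUL */
	}
	*p = '\0';
	return out;	/* caller frees */
}
```

Called on the UTF-8 bytes of heart + ZWJ + bandaid (U+2764 U+200D
U+1FA79, ten bytes in total), this returns a 40-character string, which
would overrun any buffer sized on the assumption that only a few bytes
in a name need escaping.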