[PATCH 2/3] xfs_scrub: don't warn about zero width joiner control characters

From: Darrick J. Wong <djwong@xxxxxxxxxx>

The Unicode code point for "zero width joiner" (ZWJ, U+200D) is used to
hint to renderers that a sequence of simple code points should be
combined into a more complex rendering.  This is how compound emoji such
as "wounded heart" are composed out of "heart" and "bandaid", and how
complex glyphs are rendered in Malayalam.

Emoji in filenames are a supported use case, so stop warning about the
mere existence of ZWJ.  We already warn about ZWJs that are used to
produce confusingly rendered names in a single namespace, so we're not
losing any robustness here.

Cc: <linux-xfs@xxxxxxxxxxxxxxx> # v6.10.0
Fixes: d43362c78e3e37 ("xfs_scrub: store bad flags with the name entry")
Signed-off-by: "Darrick J. Wong" <djwong@xxxxxxxxxx>
---
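
Illustration only, not part of the patch: a minimal standalone sketch of
the flagging rule described above, so the behaviour can be tried without
ICU or xfsprogs.  Everything in it is an assumption made for the example:
sketch_iscntrl() and sketch_nonrendering() are simplified stand-ins for
ICU's u_iscntrl() and unicrash.c's is_nonrendering() that cover only a
few code points, the FLAG_* values are hypothetical stand-ins for
UNICRASH_INVISIBLE and UNICRASH_CONTROL_CHAR, and the decoder assumes a
valid, NUL-terminated UTF-8 name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ZWJ			0x200D
#define FLAG_INVISIBLE		(1U << 0)	/* hypothetical flag value */
#define FLAG_CONTROL_CHAR	(1U << 1)	/* hypothetical flag value */

/* Simplified stand-in for u_iscntrl(): Cc plus a few Cf/Zl/Zp points. */
static bool sketch_iscntrl(uint32_t c)
{
	if (c <= 0x1F || (c >= 0x7F && c <= 0x9F))
		return true;
	if (c >= 0x200B && c <= 0x200F)
		return true;
	return c == 0x2028 || c == 0x2029;
}

/* Simplified stand-in for is_nonrendering(): zero width characters only. */
static bool sketch_nonrendering(uint32_t c)
{
	return c >= 0x200B && c <= 0x200D;
}

/* Decode one code point from valid UTF-8; returns the bytes consumed. */
static size_t decode_utf8(const unsigned char *s, uint32_t *out)
{
	if (s[0] < 0x80) {
		*out = s[0];
		return 1;
	}
	if ((s[0] & 0xE0) == 0xC0) {
		*out = ((uint32_t)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
		return 2;
	}
	if ((s[0] & 0xF0) == 0xE0) {
		*out = ((uint32_t)(s[0] & 0x0F) << 12) |
		       ((uint32_t)(s[1] & 0x3F) << 6) | (s[2] & 0x3F);
		return 3;
	}
	*out = ((uint32_t)(s[0] & 0x07) << 18) |
	       ((uint32_t)(s[1] & 0x3F) << 12) |
	       ((uint32_t)(s[2] & 0x3F) << 6) | (s[3] & 0x3F);
	return 4;
}

/* Walk a name and collect flags, exempting ZWJ from the control check. */
unsigned int examine_name(const char *name)
{
	const unsigned char *p = (const unsigned char *)name;
	unsigned int ret = 0;
	uint32_t uchr;

	assert(name != NULL);
	while (*p) {
		p += decode_utf8(p, &uchr);
		if (sketch_nonrendering(uchr))
			ret |= FLAG_INVISIBLE;
		if (uchr != ZWJ && sketch_iscntrl(uchr))
			ret |= FLAG_CONTROL_CHAR;
	}
	return ret;
}
```

In this sketch a name made of U+2764, U+200D and U+1FA79 is flagged as
containing an invisible character but not a control character, while a
name containing a tab still gets the control character flag.
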
 scrub/unicrash.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)


diff --git a/scrub/unicrash.c b/scrub/unicrash.c
index 143060b569f27c..b83bef644b6dce 100644
--- a/scrub/unicrash.c
+++ b/scrub/unicrash.c
@@ -508,8 +508,14 @@ name_entry_examine(
 		if (is_nonrendering(uchr))
 			ret |= UNICRASH_INVISIBLE;
 
-		/* control characters */
-		if (u_iscntrl(uchr))
+		/*
+		 * Warn about control characters in filenames except for zero
+		 * width joiners because those are used to construct compound
+		 * emoji and glyphs in various languages.  ZWJ is already
+		 * covered by UNICRASH_INVISIBLE, so we can detect its use in
+		 * confusing names.
+		 */
+		if (uchr != 0x200D && u_iscntrl(uchr))
 			ret |= UNICRASH_CONTROL_CHAR;
 
 		switch (u_charDirection(uchr)) {




