Re: [PATCH v2 1/2] commit: reject invalid UTF-8 codepoints

brian m. carlson:

+		/* U+FFFE and U+FFFF are guaranteed non-characters. */
+		if ((codepoint & 0x1ffffe) == 0xfffe)
+			return bad_offset;

I missed this the first time around: all Unicode code points whose lower 16 bits are FFFE or FFFF are non-characters, in every plane and not just the first one, so you can rewrite that as:

  /* U+xxFFFE and U+xxFFFF are guaranteed non-characters. */
  if ((codepoint & 0xfffe) == 0xfffe)
    return bad_offset;
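
To illustrate the difference between the two masks, here is a small standalone sketch. The helper names `matches_old_mask` and `matches_new_mask` are made up for this example and are not part of the patch; only the two mask expressions come from the code above.

```c
#include <stdint.h>

/* Check as written in the patch: matches only U+FFFE and U+FFFF,
 * because the plane bits (16..20) are kept by the 0x1ffffe mask. */
static int matches_old_mask(uint32_t codepoint)
{
	return (codepoint & 0x1ffffe) == 0xfffe;
}

/* Suggested check: the plane bits are masked away, so the last two
 * code points of every plane match (U+FFFE, U+1FFFE, ..., U+10FFFF). */
static int matches_new_mask(uint32_t codepoint)
{
	return (codepoint & 0xfffe) == 0xfffe;
}
```

With these, U+1FFFE slips past the old mask but is caught by the new one, while an ordinary code point such as U+FFFD is rejected by neither.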

Also, the code points in the range U+FDD0--U+FDEF are non-characters too, if you wish to be really pedantic.

$ grep '^[0-9A-F].*<not a' NamesList.txt
FDD0	<not a character>
FDD1	<not a character>
FDD2	<not a character>
FDD3	<not a character>
FDD4	<not a character>
FDD5	<not a character>
FDD6	<not a character>
FDD7	<not a character>
FDD8	<not a character>
FDD9	<not a character>
FDDA	<not a character>
FDDB	<not a character>
FDDC	<not a character>
FDDD	<not a character>
FDDE	<not a character>
FDDF	<not a character>
FDE0	<not a character>
FDE1	<not a character>
FDE2	<not a character>
FDE3	<not a character>
FDE4	<not a character>
FDE5	<not a character>
FDE6	<not a character>
FDE7	<not a character>
FDE8	<not a character>
FDE9	<not a character>
FDEA	<not a character>
FDEB	<not a character>
FDEC	<not a character>
FDED	<not a character>
FDEE	<not a character>
FDEF	<not a character>
FFFE	<not a character>
FFFF	<not a character>
1FFFE	<not a character>
1FFFF	<not a character>
2FFFE	<not a character>
2FFFF	<not a character>
3FFFE	<not a character>
3FFFF	<not a character>
4FFFE	<not a character>
4FFFF	<not a character>
5FFFE	<not a character>
5FFFF	<not a character>
6FFFE	<not a character>
6FFFF	<not a character>
7FFFE	<not a character>
7FFFF	<not a character>
8FFFE	<not a character>
8FFFF	<not a character>
9FFFE	<not a character>
9FFFF	<not a character>
AFFFE	<not a character>
AFFFF	<not a character>
BFFFE	<not a character>
BFFFF	<not a character>
CFFFE	<not a character>
CFFFF	<not a character>
DFFFE	<not a character>
DFFFF	<not a character>
EFFFE	<not a character>
EFFFF	<not a character>
FFFFE	<not a character>
FFFFF	<not a character>
10FFFE	<not a character>
10FFFF	<not a character>
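
The list above boils down to two tests, which could be combined roughly as follows. This is only a sketch: `is_noncharacter` and `count_noncharacters` are hypothetical names, not functions in git, and the upper range check against U+10FFFF is assumed to be handled wherever the code point is decoded.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if codepoint is one of the Unicode
 * non-characters listed above, 0 otherwise. Values above U+10FFFF are
 * not code points at all and are reported as 0 here. */
static int is_noncharacter(uint32_t codepoint)
{
	if (codepoint > 0x10ffff)
		return 0;
	/* U+FDD0..U+FDEF */
	if (codepoint >= 0xfdd0 && codepoint <= 0xfdef)
		return 1;
	/* U+xxFFFE and U+xxFFFF, the last two code points of each plane */
	return (codepoint & 0xfffe) == 0xfffe;
}

/* Walks the whole code space and counts the matches; this should agree
 * with the number of lines printed by the grep above. */
static int count_noncharacters(void)
{
	uint32_t cp;
	int count = 0;

	for (cp = 0; cp <= 0x10ffff; cp++)
		count += is_noncharacter(cp);
	return count;
}
```

The count comes to 32 code points in the U+FDD0 block plus 2 for each of the 17 planes, 66 in total.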

--
\\// Peter - http://www.softwolves.pp.se/



