SZEDER Gábor <szeder.dev@xxxxxxxxx> writes:

> On Mon, Apr 01, 2019 at 12:01:04AM +0200, Andrei Rybak wrote:
>> diff --git a/mailinfo.c b/mailinfo.c
>> index b395adbdf2..4ef6cdee85 100644
>> --- a/mailinfo.c
>> +++ b/mailinfo.c
>> @@ -701,6 +701,13 @@ static int is_scissors_line(const char *line)
>>  			c++;
>>  			continue;
>>  		}
>> +		if (!memcmp(c, "✂", 3)) {
>
> This character is tiny.  Please add a comment that it's supposed to be
> a Unicode scissors character.
>
> Should we worry about this memcmp() potentially reading past the end
> of the string when 'c' points to the last character?

Quite honestly, I'd rather document what a "scissors" line looks like
exactly and make sure no readers would mistakenly think that we'd
accept any Unicode character whose name has the substring "scissors"
in it.

Ah, wait, we already do.  It is very clear that scissors are either
">8" (for right handers) or "8<" (for lefties) and nothing else.

Unless you are sure that you are (and more importantly, can remain)
exhaustive, adding allowed representations for a thing will force
users to learn more non-essential things ("we allow only 8< and >8"
vs "we allow only these 7, even though we are aware that there are
at least 14 more that we do not allow"---the end-user needs to
remember which 7 are allowed) and does not help users.

Taking only "black scissors" U+2702 but not all of U+2700 - U+2704
will be a cause for unnecessary end-user complaints "why do you take
this but not that one?"  Then the next noise would be "why is '-' the
only perforation and not U+2014 Em Dash or U+2013 En Dash?"

Let's try not to be cute in non-essential things like how a pair of
scissors ought to be spelled.  If "8<" has worked well for us for the
past 10 years, we should just stick to it.
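On the memcmp() question in the quoted review, a minimal sketch may help show why the concern is reasonable and how a bounded comparison avoids it. The helper names has_prefix() and is_scissors_mark() below are hypothetical and are not taken from mailinfo.c; the second one is a simplification that checks only the two ASCII marks named above and ignores the perforation handling that the real is_scissors_line() does.

```c
#include <string.h>

/*
 * Hypothetical helper (not from mailinfo.c): returns 1 if the
 * NUL-terminated string 's' begins with 'prefix', 0 otherwise.
 *
 * strncmp() does not compare anything after a NUL byte, so when 's'
 * is shorter than 'prefix' the comparison ends at the terminator of
 * 's'.  memcmp(s, prefix, 3) carries no such guarantee and may read
 * all 3 bytes even if 's' has only one character left.
 */
int has_prefix(const char *s, const char *prefix)
{
	return !strncmp(s, prefix, strlen(prefix));
}

/*
 * Simplified illustration only: recognizes the two ASCII scissors
 * marks, ">8" and "8<", at the position 'c' points to.
 */
int is_scissors_mark(const char *c)
{
	return has_prefix(c, ">8") || has_prefix(c, "8<");
}
```

With these definitions, is_scissors_mark("8") returns 0 without touching any byte beyond the terminating NUL, which is the situation the review asks about when 'c' points to the last character of the line.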