Hi, On Tue, Jul 28, 2020 at 4:21 AM Junio C Hamano <gitster@xxxxxxxxx> wrote: > > "Hariom Verma via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > > > From: Hariom Verma <hariom18599@xxxxxxxxx> > > > > Currently, ref-filter only supports printing email with arrow brackets. > > This is the first time I heard the term "arrow bracket". Aren't > they more commonly called angle brackets? Yeah. Sorry about that. > > Let's add support for two more email options. > > - trim : print email without arrow brackets. > > Why would this be useful? It might be useful for using the ref-filter's logic in pretty.c (especially for `--pretty` formats like `%an` and `%cn`) > > - localpart : prints the part before the @ sign > > Meaning I'd get "<gitster" for me? No, you'll get "gitster". > Building small pieces of new feature in each patch is good, and > adding tests to each step is also good. Why not do the same for > docs? Yeah, I agree with you. I should have focused on documentation too. > > +static struct email_option{ > > Missing SP. I'll fix it. > > + enum { EO_INVALID, EO_RAW, EO_TRIM, EO_LOCALPART } option; > > +} email_option; > > + > > @@ -1040,10 +1044,26 @@ static const char *copy_email(const char *buf) > > const char *eoemail; > > if (!email) > > return xstrdup(""); > > - eoemail = strchr(email, '>'); > > The original code prepares to see NULL from this strchr(); that is > why it checks eoemail for NULL and returns an empty string---the > data is broken (i.e. not an address in angle brackets), which this > code cannot do anything about---in the later part of the code. I think this commit still takes care of NULL. After the switch-case statements, code consists of: ``` if (!eoemail) return xstrdup(""); ``` Which checks eoemail for NULL. And will return empty string if address is not in angle brackets. Same applies for local-part too. If '@' is not present in email address, it will still return empty string. > > + switch (email_option.option) { > > + case EO_RAW: > > + eoemail = strchr(email, '>') + 1; > > And this breaks the carefully laid out error handling by the > original code. Adding 1 to NULL is quite undefined. Yeah. I'll take care of it in the next version. > > + break; > > + case EO_TRIM: > > + email++; > > + eoemail = strchr(email, '>'); > > + break; > > + case EO_LOCALPART: > > + email++; > > + eoemail = strchr(email, '@'); > > The undocumented design here is that you want to return "hariom" for > "<hariom@xxxxxxxxx>", i.e. out of the "trimmed" e-mail, the part > before the at-sign is returned. > > If the data were "<hariom>", you'd still want to return "hariom" no? > But because you do not check for NULL, you end up returning an empty > string. I never heard of email address without '@' symbol. Thats why I returned empty string. Will fix that too. > I think you want to cut at the first '@' or '>', whichever comes > first. If email data can be without '@' symbol, then I guess "yes". > > + break; > > + case EO_INVALID: > > + default: > > Invalid and unhandled ones are silently ignored and not treated as > an error? I would have thought that at least the "default" one > would be a BUG(), as you covered all the possible values for the > enum with case arms. I wouldn't be surprised if seeing EO_INVALID > is also a BUG(), i.e. the control flow that led to the caller to > call this function with EO_INVALID in email_option.option is likely > to be broken. It's not like you return "" to protect yourself when > fed a bad data from objects---a bad value in .option can only come > here if the parser you wrote for "--format=<string>" produced a > wrong result. Christian <chriscool@xxxxxxxxxxxxx> also suggested me the same. Will fix this too. BTW, on master "{author,committer,tagger}email:<xyz>" does not print any error. > > + return xstrdup(""); > > + } > > + > > if (!eoemail) > > return xstrdup(""); > > - return xmemdupz(email, eoemail + 1 - email); > > + return xmemdupz(email, eoemail - email); > > } > > > > static char *copy_subject(const char *buf, unsigned long len) > > @@ -1113,7 +1133,7 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void > > continue; > > if (name[wholen] != 0 && > > strcmp(name + wholen, "name") && > > - strcmp(name + wholen, "email") && > > + !starts_with(name + wholen, "email") && > > !starts_with(name + wholen, "date")) > > continue; > > if (!wholine) > > @@ -1124,8 +1144,16 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void > > v->s = copy_line(wholine); > > else if (!strcmp(name + wholen, "name")) > > v->s = copy_name(wholine); > > - else if (!strcmp(name + wholen, "email")) > > + else if (starts_with(name + wholen, "email")) { > > + email_option.option = EO_INVALID; > > + if (!strcmp(name + wholen, "email")) > > + email_option.option = EO_RAW; > > + if (!strcmp(name + wholen, "email:trim")) > > + email_option.option = EO_TRIM; > > + if (!strcmp(name + wholen, "email:localpart")) > > + email_option.option = EO_LOCALPART; > > The ref-filter formatting language already knows many "colon plus > modifier" suffix like "refname:short" and "contents:body", but I do > not think we have ugly repetition like the above code to parse them. > Perhaps the addition for "email:<whatever>" can benefit from > studying and mimicking existing practices a bit more? > For "email:<whatever>", even If I parse that <whatever>. I still make comparison something like: ``` if (!modifier) email_option.option = EO_RAW; else if (!strcmp(modifier, "trim")) email_option.option = EO_TRIM; else if (!strcmp(arg, "localpart")) email_option.option = EO_LOCALPART; ``` Thanks, Hariom