On Thu, May 16, 2019 at 02:44:44PM -0700, Emily Shaffer wrote: > grep_buffer creates a struct grep_source gs and calls grep_source() > with it. However, gs.name is null, which eventually produces a > segmentation fault in > grep_source()->grep_source_1()->show_line() when grep_opt.status_only is > not set. Funny wrapping here. > This seems to be unreachable from existing commands but is reachable in > the event that someone rolls their own revision walk and neglects to set > rev_info->grep_filter->status_only. Conceivably, someone might want to > print all changed lines of all commits which match some regex. > > To futureproof, give developers a heads-up if grep_source() is called > and a valid name field is expected but not given. This path should be > unreachable now, but in the future may become reachable if developers > want to add the ability to use grep_buffer() in prepared in-core data > and provide hints to the user about where the data came from. So I guess this is probably my fault, as I was the one who factored out the grep_source bits long ago (though I would also not be surprised if it could be triggered before, too). I think adding a BUG() is the right thing to do. > I've added the BUG() line to grep_source(). I considered adding it to > grep_source_1() but didn't see much difference. Looks like > grep_source_1() exists purely as a helper to grep_source() and is never > called by anybody else... but it's the function where the crash would > happen. So I'm not sure. I agree it probably doesn't matter. I'd probably stick at at the top of grep_source_1(), just with the rationale that it could possibly catch more cases if somebody ever refactors grep_source() to have two different callers (e.g., imagine we later add a variant that takes more options, but leave the old one to avoid disrupting existing callers). > I also modified the wording for the BUG("") statement a little from > jrn's suggestion to hopefully make it more explicit, but I welcome > wordsmithing. > [...] > + BUG("grep call which could print a name requires " > + "grep_source.name be non-NULL"); It looks fine to me. The point is that nobody should ever see this. And if they do, we should already get a file/line number telling us where the problem is (and a coredump to poke at). So as long as it's enough to convince a regular user that they should make a report to the mailing list, it will have done its job. > diff --git a/grep.c b/grep.c > index 0d50598acd..6ceb880a8c 100644 > --- a/grep.c > +++ b/grep.c > @@ -2021,6 +2021,9 @@ static int chk_hit_marker(struct grep_expr *x) > > int grep_source(struct grep_opt *opt, struct grep_source *gs) > { > + if (!opt->status_only && gs->name == NULL) > + BUG("grep call which could print a name requires " > + "grep_source.name be non-NULL"); > /* > * we do not have to do the two-pass grep when we do not check > * buffer-wide "all-match". Minor nit, but can we put a blank line between the new block and the comment? > @@ -2045,7 +2048,11 @@ int grep_buffer(struct grep_opt *opt, char *buf, unsigned long size) > struct grep_source gs; > int r; > > - grep_source_init(&gs, GREP_SOURCE_BUF, NULL, NULL, NULL); > + /* TODO: In the future it may become desirable to pass in the name as > + * an argument to grep_buffer(). At that time, "(in core)" should be > + * replaced. > + */ > + grep_source_init(&gs, GREP_SOURCE_BUF, _("(in core)"), NULL, NULL); Hmm. I don't see much point in this one, as it would just avoid triggering our BUG(). If somebody is adding new grep_buffer() callers that don't use status_only, wouldn't we want them to see the BUG() to know that they need to refactor grep_buffer() to provide a name? -Peff