On Wed, May 12, 2021 at 5:26 AM Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> wrote: > > On 5/11/21 8:44 PM, Elijah Newren wrote: > > On Tue, May 11, 2021 at 4:12 PM Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> wrote: > >> > >> On 5/11/21 4:12 PM, Elijah Newren wrote: > >>> On Tue, May 11, 2021 at 12:06 PM Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> wrote: > >>>> > >>>> On 5/11/21 2:34 PM, Elijah Newren via GitGitGadget wrote: > >>>>> From: Elijah Newren <newren@xxxxxxxxx> > >>>>> > >>>>> Signed-off-by: Elijah Newren <newren@xxxxxxxxx> > >>>>> --- > >>>>> dir.c | 43 +++++-- > >>>>> t/t7063-status-untracked-cache.sh | 205 ++++++++++++++++++------------ > >>>>> t/t7519-status-fsmonitor.sh | 8 +- > >>>>> 3 files changed, 155 insertions(+), 101 deletions(-) > >>>>> > >>>>> diff --git a/dir.c b/dir.c > >>>>> index 3474e67e8f3c..122fcbffdf89 100644 > >>>>> --- a/dir.c > >>>>> +++ b/dir.c > >>>>> @@ -2760,15 +2760,34 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d > >>>>> return root; > >>>>> } > >>>>> > >>>>> +static void trace2_read_directory_statistics(struct dir_struct *dir, > >>>>> + struct repository *repo, > >>>>> + const char *path) > >>>>> +{ > >>>>> + if (!dir->untracked) > >>>>> + return; > >>>>> + trace2_data_string("read_directory", repo, "path", path); > >>>> > >>>> I'm probably just nit-picking here, but should this look more like: > >>> > >>> nit-picking and questions are totally fine. :-) Thanks for reviewing. > >>> > >>>> > >>>> if (path && *path) > >>>> trace2_data_string(...) > >>> > >>> path is always non-NULL (it'd be an error to call read_directory() > >>> with a NULL path). So the first part of the check isn't meaningful > >>> for this particular code. The second half is interesting. Do we want > >>> to omit the path when it happens to be the toplevel directory (the > >>> case where !*path)? The original trace_performance_leave() calls > >>> certainly didn't, and I was just trying to provide the same info they > >>> do, as you suggested. I guess people could determine the path by > >>> knowing that the code doesn't print it when it's empty, but do we want > >>> trace2 users to need to read the code to figure out statistics and > >>> info? > >> > >> that's fine. it might be easier to just always print it (even if > >> blank) so that post-processors know that rather than have to assume > >> it. > >> > >>> > >>>> if (!dir->untracked) > >>>> return; > >>>> > >>>> Then when you add the visitied fields in the next commit, > >>>> you'll have the path with them (when present). > >>> > >>> There is always a path with them, it's just that the empty string > >>> denotes the toplevel directory. > >>> > >>>> (and it would let you optionally avoid the tmp strbuf in > >>>> the caller.) > >>> > >>> The path in read_directory() is not necessarily NUL-delimited, so > >>> attempting to use it as-is, or even with your checks, would cause us > >>> to possibly print garbage and do out-of-bounds reads. We need the tmp > >>> strbuf. > >>> > >> > >> I just meant, "if (!len) pass NULL, else build and pass tmp.buf". > > > > Ah, gotcha, that's why you were checking non-NULL. > > > > However, what about the other case when len is nonzero. Let's say > > that len = 8 and path points at > > "filename*%&#)aWholeBunchOfTotalGarbageAfterTheRealFilenameThatShouldNotBeReadOrIncluded\0\0\0\0\0\0\0\0\0\0" > > ? > > > > How do you make it print "filename" and only "filename" without the > > other stuff without using the tmp strbuf? > > > > I was still saying to use the "strbuf tmp" in the non-zero len case, > but just pass NULL (or "") for the len==0 case. Ah, now I see what you were saying. Sorry for not getting it earlier. > Alternatively, since `trace2_read_directory_statistics() a static > local function, we could move all of the path manipulation into it. > > static void emit_stats( > struct dir_struct *dir, > struct repository *repo, > const char* path_buf, > size_t path_len) > { > if (!path_len) > trace2_data_string("read_directory", repo, > "path", ""); > else { > struct strbuf tmp = STRBUF_INIT; > strbuf_add(&tmp, path_buf, path_len); > trace2_data_string("read_directory", repo, > "path", tmp.buf); > strbuf_release(&tmp); > } > ... the rest of intmax stats ... > } Makes sense. > BTW, could we also rename your stats function? I've been trying > to keep the "trace2_" prefix reserved for the Trace2 API. Sure, will do.