RE: [PATCH v2] tr2: log parent process name

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



<emilyshaffer@xxxxxxxxxx>
On May 21, 2021 4:05 PM, Emily Shaffer wrote:
>On Fri, May 21, 2021 at 03:15:16PM -0400, Jeff Hostetler wrote:
>> On 5/20/21 5:05 PM, Emily Shaffer wrote:
>> > - I took a look at Jeff H's advice on using a "data_json" event to log
>> >    this and decided it would be a little more flexible to add a new event
>> >    instead. If we want, it'd be feasible to then shoehorn the GfW parent
>> >    tree stuff into this new event too. Doing it this way is definitely
>> >    easier to parse for Google's trace analysis system (which for now
>> >    completely skips "data_json" as it's polymorphic), and also - I think
>> >    - means that we can add more fields later on if we need to (thread
>> >    info, different fields than just /proc/n/comm like exec path, argv,
>> >    whatever).
>>
>> I could argue both sides of this, so I guess it is fine either way.
>>
>> In GFW I log a array of argv[0] strings in a generic "data_json" event.
>> I could also log additional "data_json" events with more structured
>> data if needed.
>>
>> On the other hand, you're proposing a "cmd_ancestry" event with a
>> single array of strings.  You would have to expand the call signature
>> of the trace2_cmd_ancestry() API to add additional data and inside
>> tr2_tgt_event.c add additional fields to the JSON being composed.
>>
>> So both are about equal.
>>
>> (I'll avoid the temptation to make a snarky comment about fixing your
>> post processing. :-) :-) :-) )
>
>;P
>
>(I don't have much to add - this is an accurate summary of what I thought about, too. Thanks for writing it out.)
>
>>
>> It really doesn't matter one way or the other.
>>
>> > - Jonathan N also pointed out to me that /proc/n/comm exists, and logs
>> >    the "command name" - excluding argv, excluding path, etc. It
>> > seems
>>
>> So you're trying to log argv[0] of the process and not the full
>> command line.  That's what I'm doing.
>
>It's close to argv[0], yeah. POSIX docs indicate it might be truncated in a way that argv[0] hasn't been, but it also doesn't
include the
>leading path (as far as I've seen). For example, a long-running helper script I use with mutt, right now (gaffing on line length in
email to
>help with argv clarity, sorry):
>
>  $ ps aux | grep mutt
>  emilysh+ 4119883  0.0  0.0   6892  3600 pts/6    S+   12:44   0:00 /bin/bash
/usr/local/google/home/emilyshaffer/dotfiles/open-vim-in-
>new-split.sh /var/tmp/mutt-podkayne-413244-1263002-7433772284891386689
>  # comm is truncated to 15ch, except apparently in the cases of some
>  # kernel worker processes I saw with much longer names?
>  $ cat /proc/4119883/comm
>  open-vim-in-new
>  # exe is a link to the executable, which means bash as this is a
>  # script
>  $ ls -lha /proc/4119883/exe
>  lrwxrwxrwx 1 emilyshaffer primarygroup 0 May 21 12:44
>  /proc/4119883/exe -> /usr/bin/bash
>  # cmdline has the whole argv, separated on NUL so it runs together in
>  # editor
>  $ cat /proc/4119883/cmdline
>  /bin/bash/usr/local/google/home/emilyshaffer/dotfiles/open-vim-in-new-split.sh/var/tmp/mutt-podkayne-413244-1263002-
>7433772284891386689
>
>Jonathan N pointed out that the process name (the thing in 'comm') can also be manually manipulated by the process itself, and 'man
>procfs'
>also talks about 'PR_SET_NAME' and 'PR_GET_NAME' operations in 'prctl()', so that tracks. (It doesn't look like we can use prctl()
to find
>out the names of processes besides the current process, though, so the procfs stuff is still needed. Dang.)
>
>>
>> >    like this is a little more safe about excluding personal information
>> >    from the traces which take the form of "myscript.sh
>> >    --password=hunter2", but would still be worrisome for something like
>> >    "mysupersecretproject.sh". I'm not sure whether that means we still
>> >    want to guard it with a config flag, though.
>>
>> You might check whether you get the name of the script or just get a
>> lot of entries with just "/usr/bin/bash".
>
>See above :)
>
>> There's lots of PII in the data stream to worry about.
>> The name of the command is just one aspect, but I digress.
>
>Yes, that's what we've noticed too, so a process name isn't worrying us that much more.
>
>>
>> > - I also added a lot to the commit message; hopefully it's not too
>> >    rambly, but I hoped to explain why just setting GIT_TRACE2_PARENT_SID
>> >    wasn't going to cut it.
>> > - As for testing, I followed the lead of GfW's parentage info - "this
>> >    isn't portable so writing tests for it will suck, just scrub it from
>> >    the tests". Maybe it makes sense to do some more
>> >    platform-specific-ness in the test suite instead? I wasn't sure.
>>
>> yeah, that's probably best.  Unless you can tokenize it properly so
>> that you can predict the results in a HEREDOC in the test source.
>>
>> For example, you might try to test tracing a command (where a
>> top-level "git foo" (SPACE form) spawns a "git-foo" (DASHED form) and
>> check the output for the child.
>
>Yeah, I had trouble with even deciding when to attempt such a check or not.
>> > +	if (reason == TRACE2_PROCESS_INFO_STARTUP)
>> > +	{
>> > +		/*
>> > +		 * NEEDSWORK: we could do the entire ptree in an array instead,
>> > +		 * see compat/win32/trace2_win32_process_info.c.
>> > +		 */
>> > +		char *names[2];
>> > +		names[0] = get_process_name(getppid());
>> > +		names[1] = NULL;
>>
>> You're only logging 1 parent.  That's fine to get started.
>>
>> I'm logging IIRC 10 parents on GFW.  That might seem overkill, but
>> there are lots of intermediate parents that hide what is happening.
>> For example, a "git push" might spawn "git remote-https"
>> which spawns "git-remote-https" which spawn "git send-pack" which
>> spawns "git pack-objects".
>>
>> And that doesn't include who called push.
>>
>> And it's not uncommon to see 2 or 3 "bash" entries in the array
>> because of the bash scripts being run.
>
>Agree. But it's expensive - I didn't find a handy library call to find "parent ID of given process ID", so I think we'd have to
manipulate
>procfs; and so far I only see parent ID in summary infos like /proc/n/status or /proc/n/stat, which contain lots of other info too
and would
>need parsing.
>
>We could reduce the cost a little bit by grabbing the process name from the status or stat as well, and therefore still only
opening one file
>per process, but I'd want to check whether the formats are expected to be stable for those things.
>
>> > +static void fn_command_ancestry_fl(const char *file, int line,
>> > +const char **parent_names) {
>> > +	const char *event_name = "cmd_ancestry";
>> > +	const char *parent_name = NULL;
>> > +	struct json_writer jw = JSON_WRITER_INIT;
>> > +
>> > +	jw_object_begin(&jw, 0);
>> > +	event_fmt_prepare(event_name, file, line, NULL, &jw);
>> > +	jw_object_inline_begin_array(&jw, "ancestry");
>> > +
>> > +	while ((parent_name = *parent_names++))
>> > +		jw_array_string(&jw, parent_name);
>>
>> You're building the array with the immediate parent in a[0] and the
>> grandparent in a[1], and etc.  This is the same as I did in GFW.
>>
>> Perhaps state this in the docs somewhere.
>
>Sure, makes sense. I think I neglected any doc work whatsoever in this patch anyways, whoops :)
>
>> > +	/* cmd_ancestry parent <- grandparent <- great-grandparent */
>> > +	strbuf_addstr(&buf_payload, "cmd_ancestry ");
>> > +	while ((parent_name = *parent_names++)) {
>> > +		strbuf_addstr(&buf_payload, parent_name);
>>
>> Did you want to quote each parent's name?
>
>I'd rather not - since they're going into an array anyway, I'd expect the array delimiters to be enough. Am I being naive? 'normal'
looks to
>me like it's supposed to be mostly human readable anyways, rather than parseable?
>
>> > +	strbuf_addstr(&buf_payload, "ancestry:[");
>> > +	/* It's not an argv but the rules are basically the same. */
>> > +	sq_append_quote_argv_pretty(&buf_payload, parent_names);
>>
>> This will have whitespace delimiters between the quoted strings rather
>> than commas.  Just checking if that's what you wanted.
>>
>> I'm not sure it matters, since this stream is intended for human
>> parsing.
>
>Yeah, it seems fine to me as is.
>
>>
>> We should update Documentation/technical/api-trace2.txt too.
>
>Yep, thanks.
>
>I appreciate the review, Jeff.

I checked the performance of my NonStop parent lookup implementation. It's fast enough that no one would notice (8 microseconds on
the oldest slowest machine I could find).

Just an FYI.
-Randall




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux