Re: [PATCH 0/5] run-command API: get rid of "argv"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 22, 2021 at 05:04:02PM +0100, Ævar Arnfjörð Bjarmason wrote:

> This series is an alternate but more thorough way to solve the pager
> segfault reported by Enzo Matsumiya[1], and more generally avoids
> similar issues in the future.
> 
> That the run-command API exposed two subtly different ways of doing
> the same thing wouldn't only lead to the sort of bug reported in [1],
> but also made memory management around it rather painful. As noted by
> Jeff King in[2]:
> 
>     I'd like to eventually get rid of the argv interface entirely
>     because it has memory-ownership semantics that are easy to get
>     wrong.

Yeah, unsurprisingly I'm in favor of this direction (and in fact started
looking at myself before seeing your responses). It's big and complex
enough that I do worry about prepending it in front of the segfault bug
fix being discussed.

> As noted in 5/5 we've still got a similar issue with "env" and
> "env_array". I've got a follow-up series that similarly removes "env"
> which we can do at some point (it's much smaller than this one), but
> for now let's focus on "argv".

I think we should probably do both, though I am OK with doing it
separately. There are fewer callers for "env", but I found more
ancillary cleanup necessary (e.g., "const char **" versus "const char
*const *" headaches).

> Ævar Arnfjörð Bjarmason (5):
>   archive-tar: use our own cmd.buf in error message
>   upload-archive: use regular "struct child_process" pattern
>   run-command API users: use strvec_pushv(), not argv assignment
>   run-command API users: use strvec_pushl(), not argv construction
>   run-command API: remove "argv" member, always use "args"

I left a few comments on individual patches. I had done a rough cut at
this, too. One big difference is that I used the opportunity to clean up
some ugly and error-prone uses of argv that are now unnecessary. For
instance:

diff --git a/builtin/notes.c b/builtin/notes.c
index 2b2bac43f3..85d1abad88 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -134,14 +134,13 @@ static void copy_obj_to_fd(int fd, const struct object_id *oid)
 
 static void write_commented_object(int fd, const struct object_id *object)
 {
-	const char *show_args[5] =
-		{"show", "--stat", "--no-notes", oid_to_hex(object), NULL};
 	struct child_process show = CHILD_PROCESS_INIT;
 	struct strbuf buf = STRBUF_INIT;
 	struct strbuf cbuf = STRBUF_INIT;
 
 	/* Invoke "git show --stat --no-notes $object" */
-	strvec_pushv(&show.args, show_args);
+	strvec_pushl(&show.args, "show", "--stat", "--no-notes",
+		     oid_to_hex(object), NULL);
 	show.no_stdin = 1;
 	show.out = -1;
 	show.err = 0;

The show_args variable is error-prone in two ways:

  - the magic number "5" must be in sync with the rest of the array. In
    this case it's superfluous and could just be removed, but I'll give
    a related example below.

  - we have to remember to include the trailing NULL. We have to for
    pushl(), too, but in that case the compiler will warn us when we
    omit it.

Here's another one:

@@ -943,23 +941,22 @@ static int run_receive_hook(struct command *commands,
 
 static int run_update_hook(struct command *cmd)
 {
-	const char *argv[5];
+	const char *hook_cmd;
 	struct child_process proc = CHILD_PROCESS_INIT;
 	int code;
 
-	argv[0] = find_hook("update");
-	if (!argv[0])
+	hook_cmd = find_hook("update");
+	if (!hook_cmd)
 		return 0;
 
-	argv[1] = cmd->ref_name;
-	argv[2] = oid_to_hex(&cmd->old_oid);
-	argv[3] = oid_to_hex(&cmd->new_oid);
-	argv[4] = NULL;
+	strvec_push(&proc.args, hook_cmd);
+	strvec_push(&proc.args, cmd->ref_name);
+	strvec_push(&proc.args, oid_to_hex(&cmd->old_oid));
+	strvec_push(&proc.args, oid_to_hex(&cmd->new_oid));
 
 	proc.no_stdin = 1;
 	proc.stdout_to_stderr = 1;
 	proc.err = use_sideband ? -1 : 0;
-	strvec_pushv(&proc.args, argv);
 	proc.trace2_hook_name = "update";

In this case the magic "5" really is important, and we get rid of it
(and again don't need to worry about the terminating NULL).

I'm on the fence on how important it is to do these cleanups. IMHO they
are half of what really sells the change in the first place (since the
other bug can pretty easily be fixed without it).

But maybe it is piling too much onto what is already a pretty big
change. The cleanups could be done individually later.

diff --git a/daemon.c b/daemon.c
index cc278077d2..4a000ee4af 100644
--- a/daemon.c
+++ b/daemon.c
@@ -329,10 +329,15 @@ static int run_access_hook(struct daemon_service *service, const char *dir,
 	char *eol;
 	int seen_errors = 0;
 
+	strvec_push(&child.args, access_hook);
+	strvec_push(&child.args, service->name);
+	strvec_push(&child.args, path);
+	strvec_push(&child.args, hi->hostname.buf);
+	strvec_push(&child.args, get_canon_hostname(hi));
+	strvec_push(&child.args, get_ip_address(hi));
+	strvec_push(&child.args, hi->tcp_port.buf);
+
 	child.use_shell = 1;
-	strvec_pushl(&child.args, access_hook, service->name, path,
-		     hi->hostname.buf, get_canon_hostname(hi),
-		     get_ip_address(hi), hi->tcp_port.buf, NULL);
 	child.no_stdin = 1;
 	child.no_stderr = 1;
 	child.out = -1;

I had other changes from yours like this. This is purely cosmetic, and I
could see arguments either way. I find the one-per-line version a bit
easier to read. Even though it repeats child.args over and over, it's
easy to look past since it's all aligned.

I'm OK calling that bike-shedding, but I offer it mostly in case you
didn't try it the other way and actually like my color. ;)

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux