Re: [PATCH v2 2/2] repack: repack promisor objects if -a or -A is set

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes:

> +/*
> + * Write oid to the given struct child_process's stdin, starting it first if
> + * necessary.
> + */
> +static int write_oid(const struct object_id *oid, struct packed_git *pack,
> +		     uint32_t pos, void *data)
> +{
> +	struct child_process *cmd = data;
> +
> +	if (cmd->in == -1) {
> +		if (start_command(cmd))
> +			die("Could not start pack-objects to repack promisor objects");
> +	}
> +
> +	xwrite(cmd->in, oid_to_hex(oid), GIT_SHA1_HEXSZ);
> +	xwrite(cmd->in, "\n", 1);
> +	return 0;
> +}
> +
> +static void repack_promisor_objects(const struct pack_objects_args *args,
> +				    struct string_list *names)
> +{
> +	struct child_process cmd = CHILD_PROCESS_INIT;
> +	FILE *out;
> +	struct strbuf line = STRBUF_INIT;
> +
> +	prepare_pack_objects(&cmd, args);
> +	cmd.in = -1;
> +
> +	/*
> +	 * NEEDSWORK: Giving pack-objects only the OIDs without any ordering
> +	 * hints may result in suboptimal deltas in the resulting pack. See if
> +	 * the OIDs can be sent with fake paths such that pack-objects can use a
> +	 * {type -> existing pack order} ordering when computing deltas instead
> +	 * of a {type -> size} ordering, which may produce better deltas.
> +	 */
> +	for_each_packed_object(write_oid, &cmd,
> +			       FOR_EACH_OBJECT_PROMISOR_ONLY);
> +
> +	if (cmd.in == -1)
> +		/* No packed objects; cmd was never started */
> +		return;
> +	close(cmd.in);
> +
> +	out = xfdopen(cmd.out, "r");

Hmm, it is clever to auto-start the pack-objects (and notice there
wasn't anything needing to pack).  Two things that are worth noting
are:

 - The code takes advantage of the fact that cmd.in being left as -1
   is a sign that start_command() was never called (well, it could
   be that it was called but failed to open pipe---but then we would
   have died inside write_oid(), so we can ignore taht case).  This
   is not relying on its implementation detail---cmd.in would be
   replaced by a real fd to the pipe if start_command() was called.

 - We know that pack-objects reads all its standard input through to
   the EOF before emitting a single byte to its standard output, and
   that is why we can use bidi pipe and not worry about deadlocking
   caused by not reading from cmd.out at all (before closing cmd.in,
   that is).

So I kind of like the cleverness of the design of this code.

> +	while (strbuf_getline_lf(&line, out) != EOF) {
> +		char *promisor_name;
> +		int fd;
> +		if (line.len != 40)
> +			die("repack: Expecting 40 character sha1 lines only from pack-objects.");
> +		string_list_append(names, line.buf);
> +
> +		/*
> +		 * pack-objects creates the .pack and .idx files, but not the
> +		 * .promisor file. Create the .promisor file, which is empty.
> +		 */
> +		promisor_name = mkpathdup("%s-%s.promisor", packtmp,
> +					  line.buf);
> +		fd = open(promisor_name, O_CREAT|O_EXCL|O_WRONLY, 0600);
> +		if (fd < 0)
> +			die_errno("unable to create '%s'", promisor_name);
> +		close(fd);
> +		free(promisor_name);

For now, as we do not know what is the appropriate thing to leave in
the file, empty is fine, but perhaps in the future we may want to
carry forward the info from the existing .promisor file?  What does
it initially contain?  Transport invoked with from_promisor bit
seems to call index-pack with "--promisor" option with no argument,
so this is probably in line with that.  Do we in the future need to
teach "--promisor" option to pack-objects we invoke here, which will
be passed to the index-pack it internally performs, so that we do
not have to do this by hand here?

In any case, don't we need to update the doc for the "--promisor"
option of "index-pack"?  The same comment for the new options added
in the same topic, e.g. "--from-promisor" and "--no-dependents"
options added to "fetch-pack".  It seems that the topic that ended
at 0c16cd499d was done rather hastily and under-documented?

> @@ -440,6 +513,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  
>  	/* End of pack replacement. */
>  
> +	reprepare_packed_git(the_repository);
> +

I do not think this call hurts, but why does this become suddenly
necessary with _this_ patch?  Is it an unrelated bugfix?

>  	if (delete_redundant) {
>  		int opts = 0;
>  		string_list_sort(&names);

Thanks.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux