Re: [PATCH 2/3] fsck_walk(): optionally name objects on the go

Junio C Hamano <gitster@xxxxxxxxx> · Thu, 14 Jul 2016 10:03:29 -0700

Johannes Schindelin <johannes.schindelin@xxxxxx> writes:

> Note that this patch opts for decorating the objects with plain strings
> instead of full-blown structs (à la `struct rev_name` in the code of
> the `git name-rev` command), for several reasons:
>
> - the code is much simpler than if it had to work with structs that
>   describe arbitrarily long names such as "master~14^2~5:builtin/am.c",
>
> - the string processing is actually quite light-weight compared to the
>   rest of fsck's operation,
>
> - the caller of fsck_walk() is expected to provide names for the
>   starting points, and using plain and simple strings is just the
>   easiest way to do that.

Simpler is good; we can always optimize something well-isolated like
this later if it proves necessary.

> +static char *get_object_name(struct fsck_options *options, struct object *obj)
> +{
> +	return options->object_names ?
> +		lookup_decoration(options->object_names, obj) : NULL;
> +}
> +
> +static void put_object_name(struct fsck_options *options, struct object *obj,
> +	const char *fmt, ...)
> +{
> +	va_list ap;
> +	char *existing = lookup_decoration(options->object_names, obj);
> +	struct strbuf buf = STRBUF_INIT;

When reading a few early calling sites, it wasn't quite obvious how
the code avoids the "naming" when .object_names decoration is not
initialized (which is tied to the --name-objects option to decide if
the feature needs to be triggered).  The current "if get_object_name
for the containing object gives us NULL, then we refrain from
calling put_object_name()" may be good enough, but having an early
return similar to get_object_name() would make it easier to grok,
i.e.

	get_object_name(...)
        {
        	if (!options->object_names)
                	return NULL;
		return lookup_decoration(...);
	}

	put_object_name(...)
        {
		... decls ...

        	if (!options->object_names)
                	return NULL;
		existing = lookup_decoration(...);
                if (existing)
                	return existing;
		...
	}

It is a minor point, as the caller needs to check "name" that is the
value returned from get_object_name() for the containing object to
avoid wasting cycles to compute the parameters to pass to
put_object_name() anyway.

>  	while (tree_entry(&desc, &entry)) {
>  		int result;
>  
> +		if (name) {
> +			struct object *obj = parse_object(entry.oid->hash);

This worries me somewhat.  IIRC, "git fsck" uses object->parsed to
tell between objects that are unreachable or not and act differently
so I would fear that parsing the object here would screw up that
logic, when the call comes from fsck_dir() -> fsck_sha1_list() ->
fsck_sha1() -> fsck_obj() -> fsck_walk() -> fsck_walk_tree()
codepath.  Is it no longer the case, I wonder?

I see in the same loop there is a call to lookup_tree()->object, which
probably is how the existing code avoids that issue?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html