Re: [PATCH 1/1] rebase -r: let `label` generate safer labels

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Junio,

On Mon, 2 Sep 2019, Junio C Hamano wrote:

> Phillip Wood <phillip.wood123@xxxxxxxxx> writes:
>
> >>   		for (p1 = label.buf; *p1; p1++)
> >> -			if (isspace(*p1))
> >> +			if (!(*p1 & 0x80) && !isalnum(*p1))
> >>   				*(char *)p1 = '-';
> >
> > I'm sightly concerned that this opens the possibility for unexpected
> > effects if two different labels get sanitized to the same string. I
> > suspect it's unlikely to happen in practice but doing something like
> > percent encoding non-alphanumeric characters would avoid the problem
> > entirely.
>
> I'd rather see 'x' used instead of '-' (double-or-more dashes and
> leading dash in refnames may currently be allowed but double-or-more
> exes and leading ex would be much more likely to stay valid) if we
> just want to redact invalid characters.

Hmm. Let's take a concrete example from the VFS for Git fork:

	Merge pull request #160: trace2:gvfs:experiment Add experimental regions and data events to help diagnose checkout and reset perf problems

(Yes, we have some quite verbose merge commits, with very, very long
onelines. Not a good practice, we stopped doing it, but it was well
within what Git allows.)

And now use dashes to encode all white-space:

	Merge-pull-request-#160:-trace2:gvfs:experiment-Add-experimental-regions-and-data-events-to-help-diagnose-checkout-and-reset-perf-problems

Pretty long, but looks okay. Of course, it does not work, because colons. So here is the label with Matt's patch:

	Merge-pull-request--160--trace2-gvfs-experiment-Add-experimental-regions-and-data-events-to-help-diagnose-checkout-and-reset-perf-problems

And here is the label with your proposed xs.

	Mergexpullxrequestxx160xxtrace2xgvfsxexperimentxAddxexperimentalxregionsxandxdataxeventsxtoxhelpxdiagnosexcheckoutxandxresetxperfxproblems

I cannot speak for you, of course, but I can speak for myself: this is
not only way too reminiscent of xoxoxothxbye, but it is also really,
totally unreadable.

If you care deeply about double dashes and leading dashes, how about
this instead?

		char *from, *to;

   		for (from = to = label.buf; *from; from++)
			if ((*from & 0x80) || isalnum(*from))
				*(to++) = *from;
			/* avoid leading dash and double-dashes */
			else if (to != label.buf && to[-1] != '-')
   				*(to++) = '-';
		strbuf_setlen(&label, to - label.buf);

That would result in

	Merge-pull-request-160-trace2-gvfs-experiment-Add-experimental-regions-and-data-events-to-help-diagnose-checkout-and-reset-perf-problems

> I see there are "lets make sure it is unique by suffixing "-%d" in
> other codepaths; would that help if this piece of code yields a
> label that is not unique?

I'm way ahead of you. The sequencer already goes out of its way to
guarantee the uniqueness of the labels (it's part of the design, as
applied in 1644c73c6d4 (rebase-helper --make-script: introduce a flag to
rebase merges, 2018-04-25)).

The patch you are looking at in this thread is only concerned about the
initial phase, `label_oid()` does a lot more. Not only does it make the
label unique (case-insensitively!), it also prevents it from looking
like a full 40-hex digit SHA-1, so that we can guarantee that unique
abbreviations of commit hashes will work as labels, too.

Ciao,
Dscho




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux