Re: [PATCH] blame: allow -L n,m to have an m bigger than the file's line count

Jay Soffian <jaysoffian@xxxxxxxxx> · Wed, 10 Feb 2010 14:39:45 -0500

On Wed, Feb 10, 2010 at 1:58 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> While we are talking about touching the vicinity, I think we should
> tighten the -L s,e parsing rules a bit further.
>
> There is an undocumented code that swaps start and end if the given end is
> smaller than the start.  This triggers even when "-L280,300" is mis-typed
> as "-L280,30".  I was bitten by this more than once---when the input does
> not make sense, we should actively error out, instead of doing a wrong
> thing.  I suspect that I coded it that way _only_ to support this pattern:
>
>        -L'/^#endif \/\* !WINDOWS \/\*/,-30'
>
> i.e. "blame 30 lines before the '#endif' line".  But the code also
> internally turns "-L50,-20" into "-L 50,30" and then swaps them to may
> make it "-L30,50"; this was merely an unintended side effect.

I was curious what sed does. At least on my system, sed -n -e '10,1p'
prints just line 1. Seems a bit odd. sed -n -e '1,10000p' just prints
to the end, and doesn't error out if there are less than 10k lines.

> I do want to see -L'/regexp/,-offset' keep working, I do not mind if we
> keep taking "-L50,-20" as an unintuitive way to spell "-L30,50", or reject
> "-L50,-20" as a nonsense.  But I do want to see us reject "-L280,30" as a
> typo.
>
> As to use of more than one -L option, especially when the start (or end
> for that matter) is specified with an regexp, I am of two minds.

Actually, I was not planning on supporting multiple -L options, but rather...

> When annotating the body of two functions, frotz and nitfol, I might
> expect this to work:
>
>    -L'/^int frotz(/,/^}/'  -L'/^int nitfol(/,/^}/'
>
> regardless of the order these functions appear in the blob (i.e. nitfol
> may be defined first).  This requires that parsing of "regexp" needs to
> reset to the beginning of blob for each -L option (iow, multiple -L are
> parsed independently from each other).
>
> But at the same time, if I am actually looking at the blob contents in one
> terminal while spelling the blame command line in another, it would be
> nicer if the multiple -L looked for patterns incrementally.  I may
> appreciate if I can write the above command line as:
>
>    -L'/^int frotz(/,/^}/'  -L'/nitfol/,/^}/'
>
> when I can see in my "less" of the blob contents in the other terminal
> that the first line that has string "nitfol" after the end of the
> definition of "frotz" is the beginning of function "nitfol".
>
> Another thing we _might_ want to consider doing is something like:
>
>    -L'*/^#ifdef WINDOWS/,/^#endif \/\* WINDOWS \/\*/'
>
> to tell it "I don't care to count how many WINDOWS ifdef blocks there are;
> grab all of them".

That was my aim, but the syntax I'd settled on was to use
-L/pattern/..END where END is either a numerical argument or another
pattern. IOW, ".." instead of ",".

> Regardless of how parsing of multiple -L goes, you need to be careful to
> sort the resulting line ranges and possibly coalesce them when there are
> overlaps (e.g. "-L12,+7 -L10,+5" should become "-L10,17").  And be careful
> about refcounting of origin.  You'll be making multiple blame_ent and
> queuing them to the scoreboard when starting, all pointing at the blame
> target blob; the origin blob needs to start with the right number of
> references to keep origin_decref() discarding it.

Understood.

j.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html