Re: [PATCH/RFC] diff: Make numstat machine friendly also for renames

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Junio C Hamano <junkio@xxxxxxx> wrote:
> Jakub Narebski <jnareb@xxxxxxxxx> writes:
> 
>>>> The numstat format for rename is now
>>>>
>>>>   added deleted TAB path for "src" TAB path for "dst" LF
>>>>
>>>> or if -z option is used
>>>>
>>>>   added deleted TAB path for "src" NUL NUL path for "dst" NUL
>>> 
>>> Why two NULs?
>>
>> That was the only way I could think of to separate pre-image name
>> from posi-image name for renames. Note that file name might look like
>> (part of) diffstat line, and there is no 'status' field in the
>> numstat to mark rename (as there is in "git diff-tree --raw" output).
> 
> The --stat format is for human consumption, and --numstat (be it
> with -z or without) is for machines, so I am not opposed to a
> format change that gives information that is already computed
> but currently is hard to parse.  If the format change breaks
> existing scripts, we might want to do --numstat-extended,
> though...
> 
> For example, I do not see a reason not to add "R98" in there.
> I.e.
> 
> 	added deleted status TAB "src" (TAB "dst"){0,1} LF
> 	added deleted status NUL "src" (NUL "dst"){0,1} NUL
> 
> where the dst path is present only when status says it is a
> rename/copy, just like the --raw format.

That is a good idea, but wouldn't it break existing scripts? Well,
break more than a bit hacky idea of using NUL NUL as separator between
pre-image name and post-image name.

This would change output in every case, while my proposal doesn't change
output for the case without renames. '-M' should also work correctly,
perhaps scripts using it getting wrong filename. It is '-z -M' that changes
most.

>> Did you mean --stat here?
> 
> No, I did mean --summary.  But that was foolish of me.  I forgot
> that it had the same { namepart => namepart } issue.

Ah. I somehow didn't get then that you meant for --numstat to have only
post-image names, and get pre-image names from rename information in
--summary. But as you have noticed it wouldn't help: rename information
is in the same for-humans format.

>>>> @@ -949,11 +955,19 @@ static void show_numstat(struct diffstat_t* data,
>>>>  			printf("-\t-\t");
>>>>  		else
>>>>  			printf("%d\t%d\t", file->added, file->deleted);
>>>> -		if (options->line_termination && !file->is_renamed &&
>>>> +		if (options->line_termination &&
>>>>  		    quote_c_style(file->name, NULL, NULL, 0))
>>>>  			quote_c_style(file->name, NULL, stdout, 0);
>>>>  		else
>>>>  			fputs(file->name, stdout);
>>>> +		if (file->is_renamed) {
>>>> +			printf("%s", options->line_termination ? "\t" : "\0\0");
>
> What I was hoping you to notice was that printf("%s", "\0\0")
> thing.  %s would not even notice that the const char[] literal
> is 2 bytes long.

Ooops. Shame on me. That is the result of trying to be too smart... 
and changing separator between pathnames for rename from NUL to NUL NUL.

-- 
Jakub Narebski
Poland
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux