Re: [PATCH 2/3] stripspace: Implement --count-lines option

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2015-10-15 at 19:58:26 +0200, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Tobias Klauser <tklauser@xxxxxxxxxx> writes:
> 
> > diff --git a/Documentation/git-stripspace.txt b/Documentation/git-stripspace.txt
> > index 60328d5..411e17c 100644
> > --- a/Documentation/git-stripspace.txt
> > +++ b/Documentation/git-stripspace.txt
> > @@ -9,7 +9,7 @@ git-stripspace - Remove unnecessary whitespace
> >  SYNOPSIS
> >  --------
> >  [verse]
> > -'git stripspace' [-s | --strip-comments] < input
> > +'git stripspace' [-s | --strip-comments] [-C | --count-lines] < input
> >  'git stripspace' [-c | --comment-lines] < input
> >  
> >  DESCRIPTION
> > @@ -44,6 +44,11 @@ OPTIONS
> >  	be terminated with a newline. On empty lines, only the comment character
> >  	will be prepended.
> >  
> > +-C::
> > +--count-lines::
> > +	Output the number of resulting lines after stripping. This is equivalent
> > +	to calling 'git stripspace | wc -l'.
> 
> I agree with Matthieu that squatting on a short-and-sweet -C is
> unwanted here.

Will drop it in v2.

> > diff --git a/builtin/stripspace.c b/builtin/stripspace.c
> > index f677093..7882edd 100644
> > --- a/builtin/stripspace.c
> > +++ b/builtin/stripspace.c
> > @@ -1,5 +1,6 @@
> >  #include "builtin.h"
> >  #include "cache.h"
> > +#include "parse-options.h"
> >  #include "strbuf.h"
> >  
> >  static void comment_lines(struct strbuf *buf)
> > @@ -12,45 +13,51 @@ static void comment_lines(struct strbuf *buf)
> >  	free(msg);
> >  }
> >  
> > -static const char *usage_msg = "\n"
> > -"  git stripspace [-s | --strip-comments] < input\n"
> > -"  git stripspace [-c | --comment-lines] < input";
> > +static const char * const usage_msg[] = {
> > +	N_("git stripspace [-s | --strip-comments] [-C | --count-lines] < input"),
> > +	N_("git stripspace [-c | --comment-lines] < input"),
> > +	NULL
> > +};
> >  
> >  int cmd_stripspace(int argc, const char **argv, const char *prefix)
> >  {
> >  	struct strbuf buf = STRBUF_INIT;
> > -	int strip_comments = 0;
> > -	enum { INVAL = 0, STRIP_SPACE = 1, COMMENT_LINES = 2 } mode = STRIP_SPACE;
> > -
> > -	if (argc == 2) {
> > -		if (!strcmp(argv[1], "-s") ||
> > -		    !strcmp(argv[1], "--strip-comments")) {
> > -			strip_comments = 1;
> > -		} else if (!strcmp(argv[1], "-c") ||
> > -			   !strcmp(argv[1], "--comment-lines")) {
> > -			mode = COMMENT_LINES;
> > -		} else {
> > -			mode = INVAL;
> > -		}
> > -	} else if (argc > 1) {
> > -		mode = INVAL;
> > -	}
> > +	int comment_mode = 0, count_lines = 0, strip_comments = 0;
> > +	size_t lines = 0;
> > +
> > +	const struct option options[] = {
> > +		OPT_BOOL('s', "strip-comments", &strip_comments,
> > +			 N_("skip and remove all lines starting with comment character")),
> > +		OPT_BOOL('c', "comment-lines", &comment_mode,
> > +			 N_("prepend comment character and blank to each line")),
> > +		OPT_BOOL('C', "count-lines", &count_lines, N_("print line count")),
> > +		OPT_END()
> > +	};
> 
> I think -s and -c are incompatible, so OPT_CMDMODE() might be a
> better fit for them if you are going to switch the command line
> parser to use parse-options.

Ok, will change to OPT_CMDMODE()

> Which makes me strongly suspect that this should be done in at least
> two separate steps.  (1) Use parse-options to parse command line
> without adding the counting at all, followed by (2) Add counting.

Ok, will split the patch as you suggest for v2 (or more if it makes it
easier to review).

> > +	if (!count_lines)
> > +		write_or_die(1, buf.buf, buf.len);
> > +	else {
> > +		struct strbuf tmp = STRBUF_INIT;
> > +		strbuf_addf(&tmp, "%zu\n", lines);
> > +		write_or_die(1, tmp.buf, tmp.len);
> > +	}
> 
> So this is your output code, which gives only the number of lines
> without the cleaned up result.

This should better be a simple printf("%zu\n", lines) I guess?

> > @@ -797,15 +799,19 @@ void strbuf_stripspace(struct strbuf *sb, int skip_comments)
> >  
> >  		/* Not just an empty line? */
> >  		if (newlen) {
> > -			if (empties > 0 && j > 0)
> > +			if (empties > 0 && j > 0) {
> >  				sb->buf[j++] = '\n';
> > +				lines++;
> > +			}
> >  			empties = 0;
> >  			memmove(sb->buf + j, sb->buf + i, newlen);
> >  			sb->buf[newlen + j++] = '\n';
> > +			lines++;
> >  		} else {
> >  			empties++;
> >  		}
> >  	}
> 
> I cannot say that the above is one of the better possible
> implementations of this feature I would think of.
> 
>  (1) If this is performance sensitive, then you do not want to do
>      memmove() etc. to actually touch the contents of the *sb and
>      only increment lines++, because you are not going to show that
>      in the output anyway.
> 
>  (2) If this feature is not performance sensitive, then the best
>      implementation would be not to touch strbuf_stripspace() at all
>      to realize this change, and count the number of '\n' in the
>      cmd_stripspace() just before you use printf("%d\n", lines).
>      That is best from maintainability's point-of-view, because it
>      makes it infinitely less error-prone for future changes of
>      strbuf_stripspace() to forget incrementing lines++ when it adds
>      a newline to the output.
> 
>  (3) If you are going to still munge *sb, even if you are not going
>      to show it at the end, perhaps because that would make it
>      easier to keep track of where the code is scanning to find when
>      you need to increment lines++, then at least the patch should
>      devise a mechanism to make it less likely that the future
>      changes to strbuf_stripspace() would make 'lines' and the
>      number of '\n' in the *sb out-of-sync (hint: perhaps a macro
>      called APPEND_LF(sb, line) or something).  It is easy to append
>      '\n' to sb->buf without incrementing lines++ as the code stands
>      with this patch applied---the patch makes the code less
>      maintainable.
> 
> My gut feeling is that you should do (2), which is the cleanest from
> the maintainability's point-of-view.

Thank you for outlining and assessing these possible implementations. I
agree that my suggested implementation is probably the most naïve one :)
and think that (2) would less intrusive and better to maintain. Will
change the patch accordingly.
> 
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]