Re: [PATCH] git-remote-mediawiki: Fix a bug in a regexp

On Sat, Jun 08, 2013 at 08:38:56PM +0200, Matthieu Moy wrote:

> Célestin Matte <celestin.matte@xxxxxxxxxx> writes:
> 
> > In Perl, '\n' is not a newline, but instead a literal backslash followed by an
> > "n". As the output of "rev-list --first-parent" is line-oriented, what we want
> > here is a newline.
> 
> This is right, but the code actually worked the way it was. I'm not
> sure, but my understanding is that '\n' is the string "backslash
> followed by n", but interpreted as a regexp, it is a newline.

Yes, the relevant doc (from "perldoc -f split") is:

  The pattern "/PATTERN/" may be replaced with an expression to specify
  patterns that vary at runtime.  (To do runtime compilation only once,
  use "/$variable/o".)

So it is treating '\n' as an expression and compiling the regex each
time through (though I think modern perl may be smart enough to realize
it is a constant expression and compile the regex only once). You would
get the same behavior with this:

  split $arg, $data;

if $arg contained '\n'. Of course, you _also_ get the same thing if you
use a literal newline (either "\n" or if $arg contained a literal
newline), because they function the same in a regex. In other words, it
does not matter which you use because perl's interpolation of "\n" and
the regex expansion of "\n" are identical: they both mean a newline.
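
To make the equivalence concrete, here is a minimal sketch (the sample
data and variable names are made up for illustration, not taken from
the patch). It splits the same string with a single-quoted, a
double-quoted, and a slash-delimited pattern:

```perl
use strict;
use warnings;

# Hypothetical line-oriented input, similar in shape to rev-list output.
my $data = "a\nb\nc\n";

# Single quotes: a two-character string (backslash, n), compiled as a regex.
my @single = split '\n', $data;

# Double quotes: Perl interpolates to one newline character first,
# and that one-character string is then compiled as a regex.
my @double = split "\n", $data;

# Slash delimiters: an ordinary regex literal.
my @slash = split /\n/, $data;

# The strings differ, but the resulting splits do not.
print length('\n'), " ", length("\n"), "\n";   # prints "2 1"
print join(",", @single), "\n";                # prints "a,b,c"
print join(",", @double), "\n";                # prints "a,b,c"
print join(",", @slash),  "\n";                # prints "a,b,c"
```

All three forms yield the same list; only the slash form tells the
reader up front that the argument is a regex.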

A more subtle example that shows what is going on is this:

  split '.', $data;

If you feed that "foo.bar.baz", it does not split it into three words;
each character is a delimiter, because the dot is compiled to a regex.
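
A small sketch of the dot case, again with illustrative variable names.
It compares the single-quoted dot with two ways of asking for a literal
dot (an escaped regex, and the builtin quotemeta):

```perl
use strict;
use warnings;

my $data = "foo.bar.baz";

# '.' is compiled as the regex /./, so every character is a delimiter.
# Every resulting field is empty, and split drops trailing empty
# fields, so nothing is left.
my @as_regex = split '.', $data;

# Escaping the dot makes it match only a literal ".".
my @escaped = split /\./, $data;

# quotemeta('.') returns the string '\.', which compiles to the same regex.
my @quoted = split quotemeta('.'), $data;

print scalar(@as_regex), "\n";       # prints "0"
print join(",", @escaped), "\n";     # prints "foo,bar,baz"
print join(",", @quoted),  "\n";     # prints "foo,bar,baz"
```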

> The new code looks better than the old one, but the log message may be
> improved.

Agreed. I think the best explanation is something like:

  Perl's split function takes a regex pattern argument. You can also
  feed it an expression, which is then compiled into a regex at runtime.
  It therefore works to pass your pattern via single quotes, but it is
  much less obvious to a reader that the argument is meant to be a
  regex, not a static string. Using the traditional slash-delimiters
  makes this easier to read.

-Peff



