Re: multiline output variables.

Well, this took me longer to get to than I had hoped, but it came out
pretty well:

I've rewritten my solution from scratch in order to clean it up and
document it.  Find the diff from 2.59b attached.

Some things are worth noting:

I've made a new macro that is just the max number of sed commands that can
be safely used, and written things in terms of it.

I realized that not only needn't config.status do the job of breaking the
sed program up into fragments, it needn't do the job of escaping the
results either.  Now, at ./configure time, the exact sed program
fragments, fully escaped, are output into verbatim here documents (i.e.
ones whose terminator is quoted) in config.status.
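
To illustrate, the escaping step can be tried by hand.  This is only a
sketch: it reuses the five sed commands from the attached diff, but the
variable foo, its value, and the temporary file are made up for the
example, and mktemp is assumed to be available.

```shell
# Hypothetical output variable with a comma, an ampersand and a newline.
foo='first, line
second & line'
ac_delim='@!_!#_'

# Turn one delimited record into a fully escaped sed s command.
prog=$(printf '%s\n' "$ac_delim<foo>$ac_delim$foo$ac_delim" | sed '
s/[\\&,]/\\&/g
s/$/\\/
s/'"$ac_delim"'</s,@/
s/>'"$ac_delim"'/@,/
s/'"$ac_delim"'\\$/,; t t/
')
printf '%s\n' "$prog"

# Apply the generated command, preceded by the :t label it branches to.
subs=$(mktemp) || exit 1
{ echo ':t'; printf '%s\n' "$prog"; } >"$subs"
result=$(printf '%s\n' 'x=@foo@' | sed -f "$subs")
rm -f "$subs"
printf '%s\n' "$result"
```

The first printf shows the generated s command with its continuation
backslash; the second shows the two-line value substituted for @foo@.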

It was suggested that grep -c be used to make sure that no extra
delimiters were found in the sed program.  grep -c counts lines with
matches, not actual matches, so I wrote a wacky sed script to do the job.
Does somebody have a better portable solution to this?
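
The difference is easy to see on a small sample.  The sketch below is not
the sed script from the patch; it uses awk's gsub as a stand-in counter,
and awk -v may well fall short of the portability bar autoconf needs, so
treat it as an illustration of the problem rather than an answer to the
question.  The sample records are invented.

```shell
ac_delim='@!_!#_'
# Two made-up records, each holding three delimiters.
sample="$ac_delim<a>${ac_delim}one$ac_delim
$ac_delim<b>${ac_delim}two$ac_delim"

# grep -c reports the number of matching lines.
lines=$(printf '%s\n' "$sample" | grep -c "$ac_delim")

# gsub returns the number of replacements, i.e. the matches on each line.
matches=$(printf '%s\n' "$sample" |
  awk -v d="$ac_delim" '{ n += gsub(d, "") } END { print n + 0 }')

echo "lines with a match: $lines, total matches: $matches"
```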

Rather than counting the delimiters just to _notice_ when an output
variable containing the delimiter would foul up the escaping mechanism, I
use the count to change the delimiter and redo the whole process.  It's
now guaranteed to always work, regardless of the contents of the
variables.
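
The retry loop can be sketched as follows.  The variables foo and bar and
their values are hypothetical, and awk again stands in for the sed
counting script that the patch actually uses.

```shell
foo='plain value'
bar='tricky @!_!#_ value'   # deliberately contains the initial delimiter
ac_delim='@!_!#_'
expected=6                  # three delimiters per variable, two variables

while :; do
  # Rebuild every record with the current delimiter.
  body="$ac_delim<foo>$ac_delim$foo$ac_delim
$ac_delim<bar>$ac_delim$bar$ac_delim"
  found=$(printf '%s\n' "$body" |
    awk -v d="$ac_delim" '{ n += gsub(d, "") } END { print n + 0 }')
  # The right count means no value contains the delimiter.
  test "$found" -eq "$expected" && break
  # Otherwise lengthen the delimiter and start over.
  ac_delim=${ac_delim}_
done
echo "delimiter settled on: $ac_delim"
```

Here the first pass finds seven delimiters instead of six, so the loop
appends one underscore and the second pass succeeds.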

None of the escaping rigamarole is needed for _AC_SUBST_FILES, since
the values of such output variables don't end up inside of sed s///.  I
therefore don't escape them at all.  Note that this means that if an
AC_SUBST_FILE'd variable yielded a filename with a comma or backslash in
it, the sed script now does not have those characters escaped.  Is this a
problem?  Did the old behavior even yield valid sed code in those rare
cases where such a value resulted?

There is one sed program that is applied prior to those generated to deal
with output variables.  It deals with things like @srcdir@.  I have left
it entirely unchanged.  Since these things probably should never be able
to have multiline values, I figure this is no loss.



There are two more important issues with this code, which I haven't
addressed in this patch:

AC_SUBST_FILE:

If you AC_SUBST_FILE(foo) and AC_SUBST_FILE(bar), then an input file with
a line with "@bar@@foo@" can generate the contents of the two files in
either order, depending on the order in which @foo@ and @bar@ are
interpolated.  I think the current behavior is to first output the file
for the variable first AC_SUBST_FILE'ed, which may well be a different
order than that in which the output variables appear in the input file.
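
A small experiment shows the ordering effect.  This is a sketch under
assumptions: the file names and contents are invented, mktemp -d is
assumed to exist, and the sed commands only imitate the shape of what
config.status runs, with foo listed first.

```shell
tmp=$(mktemp -d) || exit 1
echo 'contents of foo' >"$tmp/foo.txt"
echo 'contents of bar' >"$tmp/bar.txt"

# The input names @bar@ first, but the sed program handles @foo@ first.
out=$(printf '%s\n' '@bar@@foo@' | sed \
  -e "/@foo@/r $tmp/foo.txt" -e 's,@foo@,,' \
  -e "/@bar@/r $tmp/bar.txt" -e 's,@bar@,,')
rm -rf "$tmp"

# The files come out in the order their r commands were executed.
first=$(printf '%s\n' "$out" | grep contents | sed 1q)
printf '%s\n' "$out"
```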

This seems like a bug to me.  I tried to figure a way to interpolate the
variables in the order that they appear, but I think this is impossible
with portable sed code unless you're willing to insert some spurious
newlines around the instances of the output variables (clearly not
acceptable).

On the other hand, does anyone actually use AC_SUBST_FILE'd variables in
any way except to put them on a line by themselves?  Note that at least
some seds (perhaps all?) actually insert the file entirely before the line
with the output variable.  So "fish\nbait@bar@shop" becomes
"fish\nfile\nbaitshop", not "fish\nbaitfile\nshop".  This seems
sufficiently wacky to me that I expect no one uses it this way.

If indeed everybody uses these on lines by themselves, could we require
that?  This would have the advantage (perhaps small) that the newline
following the output variable could be deleted.  This is the behavior I
would expect if I had an output variable set to /dev/null, and in any
other case, the file will provide its own terminal newline.


Recursive output variable:

If there are fewer than 48 output variables, they are all recursively
expanded.  That is, if shell variable foo is the string "@bar@", the
generated file ultimately holds the value of bar, not "@bar@".  This is
perhaps desirable and perhaps not.  It does of course make it possible to
form loops that cause generation of the output file to never complete.

However, if more output variables are defined, then more than one sed
program is needed to apply all the interpolations.  If the first program
contains the definition of @bar@ above, and the second one has @foo@, now
@foo@ is _not_ recursively interpolated.  Again, it's probably fine to not
recursively interpolate, but we now have two different behaviors,
depending not on choices about the variables, but on something far more
obscure, and not documented for the autoconf user (the 48-variable limit
is a detail of how _AC_OUTPUT_FILES is implemented).
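
The two behaviors can be reproduced with a pair of hand-written
programs.  This is a sketch: the values of foo and bar are hypothetical,
and the programs only follow the shape of the generated fragments (the :t
label, the guard, and one "s ...;t t" line per variable).

```shell
# Both variables in one program: "t t" restarts it after each
# substitution, so the @bar@ produced by @foo@ is expanded too.
one_program=$(printf '%s\n' 'x=@foo@' | sed '
:t
/@[a-zA-Z_][a-zA-Z_0-9]*@/!b
s,@foo@,@bar@,;t t
s,@bar@,value,;t t
')

# Same substitutions split over two programs, @bar@ in the first:
# by the time @foo@ yields @bar@, the first program is already done.
two_programs=$(printf '%s\n' 'x=@foo@' | sed '
:t
/@[a-zA-Z_][a-zA-Z_0-9]*@/!b
s,@bar@,value,;t t
' | sed '
:t
/@[a-zA-Z_][a-zA-Z_0-9]*@/!b
s,@foo@,@bar@,;t t
')

echo "one program:  $one_program"
echo "two programs: $two_programs"
```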

It is possible, but irritating, to always recursively interpolate: the file
is generated from its inputs by applying all the sed programs.  This
result then has all the sed programs applied again.  If the result of the
second application is the same as the first, interpolation is complete;
otherwise, the second result replaces the first and the programs are
applied again and again until the results change no more.  This scheme
can also allow the contents of files included by AC_SUBST_FILE to have
output variables interpolated.
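
A minimal sketch of that fixed-point scheme, assuming the same
hypothetical foo and bar as before; apply_all is an invented helper that
stands for the whole chain of generated sed programs.

```shell
# Stand-in for "apply all the sed programs" (two fragments here).
apply_all() {
  sed 's,@bar@,value,' | sed 's,@foo@,@bar@,'
}

prev='x=@foo@'
passes=0
while :; do
  next=$(printf '%s\n' "$prev" | apply_all)
  passes=$((passes + 1))
  # Stop once another pass changes nothing.
  test "$next" = "$prev" && break
  prev=$next
done
echo "$next (after $passes passes)"
```

Note that the loop has no iteration cap, so a cyclic definition such as
foo="@bar@" with bar="@foo@" would keep it running forever, which is the
hazard mentioned above.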

Precluding recursive interpolation seems more difficult, since this
requires either that sed be used to somehow only process the unprocessed
portion of each line (a moderate pain in the rear made vastly worse by my
just-added support of multiline output variables) or that the values of
output variables be escaped.

Quadrigraph processing might be an obvious means of the latter, but it is
ineffective.  Consider that changing "@foo@" to "@@&t@foo@&t@@" still
leaves "@foo@" as a substring, and "@f@&t@o@&t@o@" leaves "@f@" and "@o@"
as substrings.  It would seem that a new syntax would be needed, such as
"@foo@" -> "@@=f@@=o@@=o@@".  However, even with an effective escape
mechanism, those escapes would need to be applied to every @ character in
the output variable values.  Note also that such an escape could not start
or end with @.  Consider "@foo@nonvar@" with "@foo@" -> "no@var" and
"@varnonvar@" -> something else.

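The substring claims can be checked mechanically.  In this sketch,
contains_var is an invented helper that applies the same pattern the
generated sed programs use as their guard; the "@@=" form is the
hypothetical new syntax from the paragraph above, not an existing
autoconf feature.

```shell
# Succeeds if the argument still holds something shaped like @name@.
contains_var() {
  printf '%s\n' "$1" | grep '@[a-zA-Z_][a-zA-Z_0-9]*@' >/dev/null
}

for candidate in '@@&t@foo@&t@@' '@f@&t@o@&t@o@' '@@=f@@=o@@=o@@'; do
  if contains_var "$candidate"; then
    echo "$candidate: still matches"
  else
    echo "$candidate: no match"
  fi
done
```

Only the last form leaves nothing for the guard to match.
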
I don't have a solution to this, other than the current one of ignoring it
until somebody actually has a problem.  At the very least however, it
should be documented that the behavior can be very unpredictable.

-Dan
--- status.m4.old	2004-08-20 11:28:22.000000000 -0400
+++ status.m4	2005-01-24 15:57:04.296915200 -0500
@@ -850,7 +850,15 @@
 m4_define([AC_LIST_FILES])
 m4_define([AC_LIST_FILES_COMMANDS])
 
-
+# _AC_SED_CMD_LIMIT
+# -----------------
+# Evaluate to an m4 number equal to the maximum number of commands to put
+# in any single sed program.
+#
+# Some seds have small command number limits, like on Digital OSF/1 and HP-UX.
+m4_define([_AC_SED_CMD_LIMIT],
+dnl One cannot portably go further than 100 commands because of HP-UX.
+[100])
 
 # _AC_OUTPUT_FILES
 # ----------------
@@ -860,80 +868,140 @@
 # It has to send itself into $CONFIG_STATUS (eg, via here documents).
 # Upon exit, no here document shall be opened.
 m4_define([_AC_OUTPUT_FILES],
-[cat >>$CONFIG_STATUS <<_ACEOF
-
+[cat >>$CONFIG_STATUS <<\_ACEOF
 #
 # CONFIG_FILES section.
 #
 
 # No need to generate the scripts if there are no CONFIG_FILES.
 # This happens for instance when ./config.status config.h
-if test -n "\$CONFIG_FILES"; then
-  # Protect against being on the right side of a sed subst in config.status.
-dnl Please, pay attention that this sed code depends a lot on the shape
-dnl of the sed commands issued by AC_SUBST.  So if you change one, change
-dnl the other too.
-[  sed 's/,@/@@/; s/@,/@@/; s/,;t t\$/@;t t/; /@;t t\$/s/[\\\\&,]/\\\\&/g;
-   s/@@/,@/; s/@@/@,/; s/@;t t\$/,;t t/' >\$tmp/subs.sed <<\\CEOF]
-dnl These here document variables are unquoted when configure runs
-dnl but quoted when config.status runs, so variables are expanded once.
-dnl Insert the sed substitutions of variables.
+if test -n "$CONFIG_FILES"; then
+
+_ACEOF
+
+m4_pushdef([_AC_SED_CMDS], [])dnl
+m4_pushdef([_AC_SED_FRAG_NUM], 0)dnl Fragment number.
+m4_pushdef([_AC_SED_LINES], 0)dnl Number of lines in current fragment so far.
+m4_pushdef([_AC_SED_LINES_LIMIT], [])dnl Max lines to put in each fragment.
+dnl
 m4_ifdef([_AC_SUBST_VARS],
-	 [AC_FOREACH([AC_Var], m4_defn([_AC_SUBST_VARS]),
-[s,@AC_Var@,$AC_Var,;t t
+[# Create sed programs to substitute non-file output variables.
+
+m4_define([_AC_SED_LINES_LIMIT], m4_eval((_AC_SED_CMD_LIMIT-2)/2))dnl
+# Init the delimiter to something very unlikely.
+ac_delim='@!_!#_'
+
+AC_FOREACH([_AC_Var], m4_defn([_AC_SUBST_VARS])[ @END@],
+[m4_if(_AC_Var, [@END@],
+[dnl @END@ marker is here just to end last fragment.
+m4_if(_AC_SED_LINES, 0, [],dnl Last segment already ended.
+dnl Trigger fake end of frag, without losing number of lines in it.
+[m4_define([_AC_SED_LINES_LIMIT],_AC_SED_LINES)])],
+dnl Not at @END@; actually do something.
+[dnl Start new fragment if needed.
+m4_if(_AC_SED_LINES, 0,
+[dnl Increment fragment number.
+m4_define([_AC_SED_FRAG_NUM],m4_eval(_AC_SED_FRAG_NUM+1))dnl
+dnl Record that this fragment will need to be used.
+m4_define([_AC_SED_CMDS],
+m4_defn([_AC_SED_CMDS])[| sed -f $tmp/subs-]_AC_SED_FRAG_NUM[.sed ])dnl 
+dnl Begin constructing the fragment.
+[while :; do
+  # Store some of the output variables in a file where they can be turned into
+  # a sed program that config.status will use.
+  cat >conf$$subs.sed <<_ACEOF
+]])dnl New fragment is started.
+$ac_delim<_AC_Var>$ac_delim$_AC_Var$ac_delim
+m4_define([_AC_SED_LINES], m4_incr(_AC_SED_LINES))dnl Increment line.
+])dnl
+dnl End fragment if needed.
+m4_if(_AC_SED_LINES, _AC_SED_LINES_LIMIT,
+[_ACEOF
+  # Make certain that only the expected number of $ac_delim's have been output
+  # into the program.  If there is a different number, the delimiter has
+  # appeared in one of the output variables, and this is sure to confuse
+  # something, so change the delimiter and generate all the sed program
+  # fragments again.
+dnl Note that grep -c doesn't do the right thing because it counts lines
+dnl with matches, not total number of matches.
+  if test `sed -n '
+:d
+s/'"$ac_delim"'//; t i
+$!b
+dnl This can't be looking for more than (_AC_SED_CMD_LIMIT-2)/2*3, which is
+dnl plenty small enough to not trip any line length limits.
+x; s/x\{m4_eval(_AC_SED_LINES*3)\}/yes/; s/yesx+/no/; /^yes$/!s/.*/no/; p; q
+:i
+x; s/$/x/; x; t d
+' < conf$$subs.sed
+` != yes; then
+    ac_delim=$ac_delim'_'
+  else break; fi
+done
+# Have config.status create the needed sed program.
+cat >>$CONFIG_STATUS <<\_ACEOF
+  cat >$tmp/subs-_AC_SED_FRAG_NUM.sed <<\CEOF
+[:t
+/@[a-zA-Z_][a-zA-Z_0-9]*@/!b
+_ACEOF
+# Output the sed program verbatim to config.status, properly escaping its
+# contents as needed.  Note that this escaping is now safe because $ac_delim
+# contains none of [[\\&,]] and occurs only where it was inserted above.
+sed '
+s/[\\&,]/\\&/g
+s/$/\\/
+s/'"$ac_delim"'</s,@/
+s/>'"$ac_delim"'/@,/
+s/'"$ac_delim"'\\$/,; t t/
+' <conf$$subs.sed >>$CONFIG_STATUS
+cat >>$CONFIG_STATUS <<\_ACEOF
+]CEOF
+_ACEOF
+
+m4_define([_AC_SED_LINES], 0)dnl
 ])])dnl
+
+])
+
 m4_ifdef([_AC_SUBST_FILES],
-	 [AC_FOREACH([AC_Var], m4_defn([_AC_SUBST_FILES]),
-[/@AC_Var@/r $AC_Var
-s,@AC_Var@,,;t t
-])])dnl
-CEOF
+[# Create sed programs to substitute file output variables.
 
-_ACEOF
+m4_define([_AC_SED_LINES_LIMIT], m4_eval((_AC_SED_CMD_LIMIT-2)/2))dnl
+AC_FOREACH([_AC_Var], m4_defn([_AC_SUBST_FILES])[ @END@],
+[m4_if(_AC_Var, [@END@],
+[dnl @END@ marker is here just to end last fragment.
+m4_if(_AC_SED_LINES, 0, [],dnl Last segment already ended.
+      dnl Trigger fake end of frag, without losing number of lines in it.
+      [m4_define([_AC_SED_LINES_LIMIT],_AC_SED_LINES)])],
+dnl Not at @END@; actually do something.
+[dnl Start new fragment if needed.
+m4_if(_AC_SED_LINES, 0,
+[dnl Increment fragment number.
+m4_define([_AC_SED_FRAG_NUM],m4_eval(_AC_SED_FRAG_NUM+1))dnl
+dnl Record that this fragment will need to be used.
+m4_define([_AC_SED_CMDS],
+m4_defn([_AC_SED_CMDS])[| sed -f $tmp/subs-]_AC_SED_FRAG_NUM[.sed ])dnl 
+dnl Begin constructing the fragment.
+[  cat >>$CONFIG_STATUS <<_ACEOF
+/@[a-zA-Z_][a-zA-Z_0-9]*@/!b
+]])dnl New fragment is started.
+/@_AC_Var@/r $_AC_Var
+s,@_AC_Var@,,;t t
+m4_define([_AC_SED_LINES], m4_incr(_AC_SED_LINES))dnl Increment line.
+])dnl
+m4_if(_AC_SED_LINES, _AC_SED_LINES_LIMIT,
+[_ACEOF
 
-  cat >>$CONFIG_STATUS <<\_ACEOF
-  # Split the substitutions into bite-sized pieces for seds with
-  # small command number limits, like on Digital OSF/1 and HP-UX.
-dnl One cannot portably go further than 100 commands because of HP-UX.
-dnl Here, there are 2 cmd per line, and two cmd are added later.
-  ac_max_sed_lines=48
-  ac_sed_frag=1 # Number of current file.
-  ac_beg=1 # First line for current file.
-  ac_end=$ac_max_sed_lines # Line after last line for current file.
-  ac_more_lines=:
-  ac_sed_cmds=
-  while $ac_more_lines; do
-    if test $ac_beg -gt 1; then
-      sed "1,${ac_beg}d; ${ac_end}q" $tmp/subs.sed >$tmp/subs.frag
-    else
-      sed "${ac_end}q" $tmp/subs.sed >$tmp/subs.frag
-    fi
-    if test ! -s $tmp/subs.frag; then
-      ac_more_lines=false
-    else
-      # The purpose of the label and of the branching condition is to
-      # speed up the sed processing (if there are no `@' at all, there
-      # is no need to browse any of the substitutions).
-      # These are the two extra sed commands mentioned above.
-      (echo [':t
-  /@[a-zA-Z_][a-zA-Z_0-9]*@/!b'] && cat $tmp/subs.frag) >$tmp/subs-$ac_sed_frag.sed
-      if test -z "$ac_sed_cmds"; then
-	ac_sed_cmds="sed -f $tmp/subs-$ac_sed_frag.sed"
-      else
-	ac_sed_cmds="$ac_sed_cmds | sed -f $tmp/subs-$ac_sed_frag.sed"
-      fi
-      ac_sed_frag=`expr $ac_sed_frag + 1`
-      ac_beg=$ac_end
-      ac_end=`expr $ac_end + $ac_max_sed_lines`
-    fi
-  done
-  if test -z "$ac_sed_cmds"; then
-    ac_sed_cmds=cat
-  fi
+m4_define([_AC_SED_LINES], 0)dnl
+])])])dnl
+dnl
+m4_popdef([_AC_SED_FRAG_NUM])dnl
+m4_popdef([_AC_SED_LINES])dnl
+m4_popdef([_AC_SED_LINES_LIMIT])dnl
+dnl
+cat >>$CONFIG_STATUS <<\_ACEOF
 fi # test -n "$CONFIG_FILES"
 
-_ACEOF
-cat >>$CONFIG_STATUS <<\_ACEOF
 for ac_file in : $CONFIG_FILES; do test "x$ac_file" = x: && continue
   # Support "outfile[:infile[:infile...]]", defaulting infile="outfile.in".
   case $ac_file in
@@ -1018,8 +1086,8 @@
 s,@abs_top_builddir@,$ac_abs_top_builddir,;t t
 AC_PROVIDE_IFELSE([AC_PROG_INSTALL], [s,@INSTALL@,$ac_INSTALL,;t t
 ])dnl
-dnl The parens around the eval prevent an "illegal io" in Ultrix sh.
-" $ac_file_inputs | (eval "$ac_sed_cmds") >$tmp/out
+" $ac_file_inputs m4_defn([_AC_SED_CMDS])>$tmp/out
+m4_popdef([_AC_SED_CMDS])dnl
   rm -f $tmp/stdin
 dnl This would break Makefile dependencies.
 dnl  if diff $ac_file $tmp/out >/dev/null 2>&1; then