Well, this took me longer to get to than I had hoped, but it came out pretty well: I've rewritten my solution from scratch in order to clean it up and document it. The diff against 2.59b is attached.

Some things are worth noting:

I've made a new macro that is just the maximum number of sed commands that can safely be used, and written things in terms of it.

I realized that not only needn't config.status do the job of breaking the sed program up into fragments, it needn't do the job of escaping the results either. Now, at ./configure time, the exact sed program fragments, fully escaped, are output into verbatim here documents (i.e. ones whose terminator is quoted) in config.status.

It was suggested that grep -c be used to make sure that no extra delimiters were found in the sed program. grep -c counts lines with matches, not actual matches, so I wrote a wacky sed script to do the job. Does somebody have a better portable solution to this?

Rather than counting the delimiters just to _notice_ when an output variable containing the delimiter would foul up the escaping mechanism, I use the count to modify the delimiter and redo the whole process. It's now guaranteed to always work, regardless of the contents of the variables.

None of the escaping rigmarole is needed for _AC_SUBST_FILES, since the values of such output variables don't end up inside a sed s///. I therefore don't escape them at all. Note that this means that if an AC_SUBST_FILE'd variable yields a filename with a comma or backslash in it, the sed script now does not have those characters escaped. Is this a problem? Did the old behavior even yield valid sed code in those rare cases where such a value resulted?

There is one sed program that is applied prior to those generated to deal with output variables. It deals with things like @srcdir@. I have left this entirely unchanged. Since these things probably should never be able to have multiline values, I figure this is no loss.
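On the question of counting delimiter occurrences rather than matching lines: the following is only a sketch of one alternative to the sed counter, assuming an awk that supports -v and index() (POSIX awk does; very old awks may not, so this may not meet Autoconf's portability bar). The helper name count_delims and the sample values are made up for illustration and are not part of the patch.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): count every occurrence of a
# literal delimiter string on standard input, including repeats on one line.
count_delims() {
  awk -v d="$1" '
    {
      line = $0
      # index() searches literally, so the delimiter is never read as a regex.
      while ((i = index(line, d)) > 0) {
        n++
        line = substr(line, i + length(d))
      }
    }
    END { print n + 0 }'
}

ac_delim='@!_!#_'
# Two made-up lines in the delim<name>delim value delim layout; the second
# value deliberately contains the delimiter.
sample="${ac_delim}<foo>${ac_delim}one${ac_delim}
${ac_delim}<bar>${ac_delim}has ${ac_delim} inside${ac_delim}"

printf '%s\n' "$sample" | grep -c "$ac_delim"       # number of matching lines
printf '%s\n' "$sample" | count_delims "$ac_delim"  # number of occurrences
```

With three delimiters expected per line, the two-line sample should contain 6. The occurrence count comes out higher than that, which signals a collision, while grep -c only reports how many lines matched and so cannot tell the difference.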
There are two more important issues with this code, which I haven't addressed in this patch:

AC_SUBST_FILE: If you AC_SUBST_FILE(foo) and AC_SUBST_FILE(bar), then an input file with a line containing "@bar@@foo@" can generate the contents of the two files in either order, depending on the order in which @foo@ and @bar@ are interpolated. I think the current behavior is to first output the file for the variable that was AC_SUBST_FILE'd first, which may well be a different order from the one in which the output variables appear in the input file. This seems like a bug to me. I tried to figure out a way to interpolate the variables in the order that they appear, but I think this is impossible with portable sed code unless you're willing to insert some spurious newlines around the instances of the output variables (clearly not acceptable).

On the other hand, does anyone actually use AC_SUBST_FILE'd variables in any way except to put them on a line by themselves? Note that at least some seds (perhaps all?) actually insert the file entirely before the line with the output variable. So "fish\nbait@bar@shop" becomes "fish\nfile\nbaitshop", not "fish\nbaitfile\nshop". This seems sufficiently wacky to me that I expect no one uses it this way. If indeed everybody uses these on lines by themselves, could we require that? This would have the advantage (perhaps small) that the newline following the output variable could be deleted. This is the behavior I would expect if I had an output variable set to /dev/null, and in any other case, the file will provide its own terminal newline.

Recursive output variables: If there are fewer than 48 output variables, they are all recursively expanded. That is, if shell variable foo is the string "@bar@", the generated file ultimately holds the value of bar, not "@bar@". This is perhaps desirable and perhaps not. It does of course make it possible to form loops that cause generation of the output file to never complete.
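The recursive expansion is a side effect of the ":t ... t t" loop at the top of each generated sed program: after any successful substitution the line is rescanned from the start. A small sketch with made-up variables (foo set to "@bar@", bar set to "baz"; neither comes from a real configure script, and expand_one_program is a hypothetical name) shows the effect when both commands live in one program:

```shell
#!/bin/sh
# Toy stand-in for a single generated subs-N.sed program.
# Assumed values: foo expands to "@bar@", bar expands to "baz".
expand_one_program() {
  sed ':t
/@[a-zA-Z_][a-zA-Z_0-9]*@/!b
s,@foo@,@bar@,;t t
s,@bar@,baz,;t t'
}

# After @foo@ is replaced, "t t" jumps back to the label, so the freshly
# inserted @bar@ is seen by the second command on the next trip through.
printf '%s\n' 'value = @foo@' | expand_one_program
```

The order of the two s commands does not matter here, because every successful substitution restarts the scan.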
However, if more output variables are defined, then more than one sed program is needed to apply all the interpolations. If the first program contains the definition of @bar@ above, and the second one has @foo@, now @foo@ is _not_ recursively interpolated. Again, it's probably fine to not recursively interpolate, but we now have two different behaviors, depending not on choices about the variables, but on something far more obscure and not documented for the autoconf user (the 48-variable limit is a detail of how _AC_OUTPUT_FILES is implemented).

It is possible, but irritating, to always recursively interpolate: the file is generated from its inputs by applying all the sed programs. This result then has all the sed programs applied again. If the result of the second application is the same as the first, interpolation is complete; otherwise, the second result replaces the first and the programs are applied again and again until the results change no more. This scheme can also allow the contents of files included by AC_SUBST_FILE to have output variables interpolated.

Precluding recursive interpolation seems more difficult, since this requires either that sed be used to somehow only process the unprocessed portion of each line (a moderate pain in the rear made vastly worse by my just-added support of multiline output variables) or that the values of output variables be escaped. Quadrigraph processing might be an obvious means of the latter, but it is ineffective. Consider that changing "@foo@" to "@@&t@foo@&t@@" still leaves "@foo@" as a substring, and "@f@&t@o@&t@o@" leaves "@f@" and "@o@" as substrings. It would seem that a new syntax would be needed, such as "@foo@" -> "@@=f@@=o@@=o@@". However, even with an effective escape mechanism, those escapes would need to be applied to every @ character in the output variable values. Note also that such an escape could not start or end with @.
Consider "@foo@nonvar@" with "@foo@" -> "no@var" and "@varnonvar@" -> something else: the substituted text becomes "no@varnonvar@", which now contains a variable reference that was never in the input. I don't have a solution to this, other than the current one of ignoring it until somebody actually has a problem. At the very least, however, it should be documented that the behavior can be very unpredictable.

-Dan
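The difference between one and several sed programs, and the repeat-until-stable scheme described in the message, can be sketched as follows. This is an illustration only: the file names, the toy variables (foo set to "@bar@", bar set to "baz"), the apply_all helper and the cap of 10 passes are all assumptions, not anything taken from the patch.

```shell
#!/bin/sh
# Hypothetical two-fragment setup: @bar@ is handled by the first program,
# @foo@ (whose value is "@bar@") by the second.
tmp=${TMPDIR-/tmp}/subst$$
mkdir "$tmp" || exit 1
trap 'rm -rf "$tmp"' 0

cat >"$tmp/subs-1.sed" <<\EOF
:t
/@[a-zA-Z_][a-zA-Z_0-9]*@/!b
s,@bar@,baz,;t t
EOF
cat >"$tmp/subs-2.sed" <<\EOF
:t
/@[a-zA-Z_][a-zA-Z_0-9]*@/!b
s,@foo@,@bar@,;t t
EOF

# Run every fragment once, in order, as a pipeline.
apply_all() { sed -f "$tmp/subs-1.sed" | sed -f "$tmp/subs-2.sed"; }

printf '%s\n' 'value = @foo@' >"$tmp/in"

# Single pass: @foo@ turns into @bar@, but subs-1.sed has already run,
# so the @bar@ is left in the output.
apply_all <"$tmp/in" >"$tmp/out"
cat "$tmp/out"

# Repeat until a pass changes nothing. The pass cap is an arbitrary guard
# against values that expand to each other forever.
n=0
while [ "$n" -lt 10 ]; do
  apply_all <"$tmp/out" >"$tmp/next"
  if cmp -s "$tmp/out" "$tmp/next"; then break; fi
  mv "$tmp/next" "$tmp/out"
  n=$((n + 1))
done
cat "$tmp/out"
```

The cost of this approach is at least one extra pass over every generated file, since the only way to learn that the result is stable is to run the programs once more and compare.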
--- status.m4.old 2004-08-20 11:28:22.000000000 -0400
+++ status.m4 2005-01-24 15:57:04.296915200 -0500
@@ -850,7 +850,15 @@
 m4_define([AC_LIST_FILES])
 m4_define([AC_LIST_FILES_COMMANDS])
 
-
+# _AC_SED_CMD_LIMIT
+# -----------------
+# Evaluate to an m4 number equal to the maximum number of commands to put
+# in any single sed program.
+#
+# Some seds have small command number limits, like on Digital OSF/1 and HP-UX.
+m4_define([_AC_SED_CMD_LIMIT],
+dnl One cannot portably go further than 100 commands because of HP-UX.
+[100])
 
 # _AC_OUTPUT_FILES
 # ----------------
@@ -860,80 +868,140 @@
 # It has to send itself into $CONFIG_STATUS (eg, via here documents).
 # Upon exit, no here document shall be opened.
 m4_define([_AC_OUTPUT_FILES],
-[cat >>$CONFIG_STATUS <<_ACEOF
-
+[cat >>$CONFIG_STATUS <<\_ACEOF
 #
 # CONFIG_FILES section.
 #
 
 # No need to generate the scripts if there are no CONFIG_FILES.
 # This happens for instance when ./config.status config.h
-if test -n "\$CONFIG_FILES"; then
- # Protect against being on the right side of a sed subst in config.status.
-dnl Please, pay attention that this sed code depends a lot on the shape
-dnl of the sed commands issued by AC_SUBST. So if you change one, change
-dnl the other too.
-[ sed 's/,@/@@/; s/@,/@@/; s/,;t t\$/@;t t/; /@;t t\$/s/[\\\\&,]/\\\\&/g;
- s/@@/,@/; s/@@/@,/; s/@;t t\$/,;t t/' >\$tmp/subs.sed <<\\CEOF]
-dnl These here document variables are unquoted when configure runs
-dnl but quoted when config.status runs, so variables are expanded once.
-dnl Insert the sed substitutions of variables.
+if test -n "$CONFIG_FILES"; then
+
+_ACEOF
+
+m4_pushdef([_AC_SED_CMDS], [])dnl
+m4_pushdef([_AC_SED_FRAG_NUM], 0)dnl Fragment number.
+m4_pushdef([_AC_SED_LINES], 0)dnl Number of lines in current fragment so far.
+m4_pushdef([_AC_SED_LINES_LIMIT], [])dnl Max lines to put in each fragment.
+dnl
 m4_ifdef([_AC_SUBST_VARS],
- [AC_FOREACH([AC_Var], m4_defn([_AC_SUBST_VARS]),
-[s,@AC_Var@,$AC_Var,;t t
+[# Create sed programs to substitute non-file output variables.
+
+m4_define([_AC_SED_LINES_LIMIT], m4_eval((_AC_SED_CMD_LIMIT-2)/2))dnl
+# Init the delimiter to something very unlikely.
+ac_delim='@!_!#_'
+
+AC_FOREACH([_AC_Var], m4_defn([_AC_SUBST_VARS])[ @END@],
+[m4_if(_AC_Var, [@END@],
+[dnl @END@ marker is here just to end last fragment.
+m4_if(_AC_SED_LINES, 0, [],dnl Last segment already ended.
+dnl Trigger fake end of frag, without losing number of lines in it.
+[m4_define([_AC_SED_LINES_LIMIT],_AC_SED_LINES)])],
+dnl Not at @END@; actually do something.
+[dnl Start new fragment if needed.
+m4_if(_AC_SED_LINES, 0,
+[dnl Increment fragment number.
+m4_define([_AC_SED_FRAG_NUM],m4_eval(_AC_SED_FRAG_NUM+1))dnl
+dnl Record that this fragment will need to be used.
+m4_define([_AC_SED_CMDS],
+m4_defn([_AC_SED_CMDS])[| sed -f $tmp/subs-]_AC_SED_FRAG_NUM[.sed ])dnl
+dnl Begin constructing the fragment.
+[while :; do
+ # Store some of the output variables in a file where they can be turned into
+ # a sed program that config.status will use.
+ cat >conf$$subs.sed <<_ACEOF
+]])dnl New fragment is started.
+$ac_delim<_AC_Var>$ac_delim$_AC_Var$ac_delim
+m4_define([_AC_SED_LINES], m4_incr(_AC_SED_LINES))dnl Increment line.
+])dnl
+dnl End fragment if needed.
+m4_if(_AC_SED_LINES, _AC_SED_LINES_LIMIT,
+[_ACEOF
+ # Make certain that only the expected number of $ac_delim's have been output
+ # into the program. If there is a different number, the delimiter has
+ # appeared in one of the output variables, and this is sure to confuse
+ # something, so change the delimiter and generate all the sed program
+ # fragments again.
+dnl Note that grep -c doesn't do the right thing because it counts lines
+dnl with matches, not total number of matches.
+ if test `sed -n '
+:d
+s/'"$ac_delim"'//; t i
+$!b
+dnl This can't be looking for more than (_AC_SED_CMD_LIMIT-2)/2*3, which is
+dnl plenty small enough to not trip any line length limits.
+x; s/x\{m4_eval(_AC_SED_LINES*3)\}/yes/; s/yesx+/no/; /^yes$/!s/.*/no/; p; q
+:i
+x; s/$/x/; x; t d
+' < conf$$subs.sed
+` != yes; then
+ ac_delim=$ac_delim'_'
+ else break; fi
+done
+# Have config.status create the needed sed program.
+cat >>$CONFIG_STATUS <<\_ACEOF
+ cat >$tmp/subs-_AC_SED_FRAG_NUM.sed <<\CEOF
+[:t
+/@[a-zA-Z_][a-zA-Z_0-9]*@/!b
+_ACEOF
+# Output the sed program verbatim to config.status, properly escaping its
+# contents as needed. Note that this escaping is now safe because $ac_delim
+# contains none of [[\\&,]] and occurs only where it was inserted above.
+sed '
+s/[\\&,]/\\&/g
+s/$/\\/
+s/'"$ac_delim"'</s,@/
+s/>'"$ac_delim"'/@,/
+s/'"$ac_delim"'\\$/,; t t/
+' <conf$$subs.sed >>$CONFIG_STATUS
+cat >>$CONFIG_STATUS <<\_ACEOF
+]CEOF
+_ACEOF
+
+m4_define([_AC_SED_LINES], 0)dnl
 ])])dnl
+
+])
+
 m4_ifdef([_AC_SUBST_FILES],
- [AC_FOREACH([AC_Var], m4_defn([_AC_SUBST_FILES]),
-[/@AC_Var@/r $AC_Var
-s,@AC_Var@,,;t t
-])])dnl
-CEOF
+[# Create sed programs to substitute file output variables.
 
-_ACEOF
+m4_define([_AC_SED_LINES_LIMIT], m4_eval((_AC_SED_CMD_LIMIT-2)/2))dnl
+AC_FOREACH([_AC_Var], m4_defn([_AC_SUBST_FILES])[ @END@],
+[m4_if(_AC_Var, [@END@],
+[dnl @END@ marker is here just to end last fragment.
+m4_if(_AC_SED_LINES, 0, [],dnl Last segment already ended.
+ dnl Trigger fake end of frag, without losing number of lines in it.
+ [m4_define([_AC_SED_LINES_LIMIT],_AC_SED_LINES)])],
+dnl Not at @END@; actually do something.
+[dnl Start new fragment if needed.
+m4_if(_AC_SED_LINES, 0,
+[dnl Increment fragment number.
+m4_define([_AC_SED_FRAG_NUM],m4_eval(_AC_SED_FRAG_NUM+1))dnl
+dnl Record that this fragment will need to be used.
+m4_define([_AC_SED_CMDS],
+m4_defn([_AC_SED_CMDS])[| sed -f $tmp/subs-]_AC_SED_FRAG_NUM[.sed ])dnl
+dnl Begin constructing the fragment.
+[ cat >>$CONFIG_STATUS <<_ACEOF
+/@[a-zA-Z_][a-zA-Z_0-9]*@/!b
+]])dnl New fragment is started.
+/@_AC_Var@/r $_AC_Var
+s,@_AC_Var@,,;t t
+m4_define([_AC_SED_LINES], m4_incr(_AC_SED_LINES))dnl Increment line.
+])dnl
+m4_if(_AC_SED_LINES, _AC_SED_LINES_LIMIT,
+[_ACEOF
 
- cat >>$CONFIG_STATUS <<\_ACEOF
- # Split the substitutions into bite-sized pieces for seds with
- # small command number limits, like on Digital OSF/1 and HP-UX.
-dnl One cannot portably go further than 100 commands because of HP-UX.
-dnl Here, there are 2 cmd per line, and two cmd are added later.
- ac_max_sed_lines=48
- ac_sed_frag=1 # Number of current file.
- ac_beg=1 # First line for current file.
- ac_end=$ac_max_sed_lines # Line after last line for current file.
- ac_more_lines=:
- ac_sed_cmds=
- while $ac_more_lines; do
- if test $ac_beg -gt 1; then
- sed "1,${ac_beg}d; ${ac_end}q" $tmp/subs.sed >$tmp/subs.frag
- else
- sed "${ac_end}q" $tmp/subs.sed >$tmp/subs.frag
- fi
- if test ! -s $tmp/subs.frag; then
- ac_more_lines=false
- else
- # The purpose of the label and of the branching condition is to
- # speed up the sed processing (if there are no `@' at all, there
- # is no need to browse any of the substitutions).
- # These are the two extra sed commands mentioned above.
- (echo [':t
- /@[a-zA-Z_][a-zA-Z_0-9]*@/!b'] && cat $tmp/subs.frag) >$tmp/subs-$ac_sed_frag.sed
- if test -z "$ac_sed_cmds"; then
- ac_sed_cmds="sed -f $tmp/subs-$ac_sed_frag.sed"
- else
- ac_sed_cmds="$ac_sed_cmds | sed -f $tmp/subs-$ac_sed_frag.sed"
- fi
- ac_sed_frag=`expr $ac_sed_frag + 1`
- ac_beg=$ac_end
- ac_end=`expr $ac_end + $ac_max_sed_lines`
- fi
- done
- if test -z "$ac_sed_cmds"; then
- ac_sed_cmds=cat
- fi
+m4_define([_AC_SED_LINES], 0)dnl
+])])])dnl
+dnl
+m4_popdef([_AC_SED_FRAG_NUM])dnl
+m4_popdef([_AC_SED_LINES])dnl
+m4_popdef([_AC_SED_LINES_LIMIT])dnl
+dnl
+cat >>$CONFIG_STATUS <<\_ACEOF
 fi # test -n "$CONFIG_FILES"
 
-_ACEOF
-cat >>$CONFIG_STATUS <<\_ACEOF
 for ac_file in : $CONFIG_FILES; do test "x$ac_file" = x: && continue
 # Support "outfile[:infile[:infile...]]", defaulting infile="outfile.in".
 case $ac_file in
@@ -1018,8 +1086,8 @@
 s,@abs_top_builddir@,$ac_abs_top_builddir,;t t
 AC_PROVIDE_IFELSE([AC_PROG_INSTALL], [s,@INSTALL@,$ac_INSTALL,;t t
 ])dnl
-dnl The parens around the eval prevent an "illegal io" in Ultrix sh.
-" $ac_file_inputs | (eval "$ac_sed_cmds") >$tmp/out
+" $ac_file_inputs m4_defn([_AC_SED_CMDS])>$tmp/out
+m4_popdef([_AC_SED_CMDS])dnl
 rm -f $tmp/stdin
 dnl This would break Makefile dependencies.
 dnl if diff $ac_file $tmp/out >/dev/null 2>&1; then