Hi Eric,
On 2 Oct 2008, at 10:51, Eric Blake wrote:
Is there any portable way to process files that contain NUL bytes?
None that I'm aware of. Many GNU utilities are reasonably well
behaved with respect to '\0', and m4 is unusual to some extent in that
we don't handle them well ourselves.
I'm working on making m4 1.6 transparently handle NUL,
Excellent! I made an attempt to do that myself on the 2.0 branch some
years ago, but it didn't go well so I never committed...
and want to post-process the output to normalize error messages while still
verifying that NUL bytes appeared where expected on stderr. But on Solaris,
the native sed strips NUL bytes before processing the line (NUL bytes cannot
appear in text files, and POSIX does not define behavior on non-text files,
so this is not a bug, just a difference from GNU sed). As a result, the m4
testsuite either fails (if I only post-process the captured stderr and not
the expected error) or can have false positives (if both stderr and expected
error are normalized, then regressions involving added or missing NUL are not
detected). I don't want to require perl for just this one test; m4 seems
fundamental enough to keep the testsuite restricted to the GNU coding
standards set of tools.
I'd be inclined to do that in C. A few lines should be sufficient to
write a minimal filter that writes '\' '0' or '^' '@' to output
whenever a NUL byte arrives?
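Something along these lines is what I have in mind. It is only a rough
sketch: the name escape_nul, the STANDALONE guard and the choice of '^' '@'
over '\' '0' are illustrative assumptions, not anything that exists in m4
or its testsuite.

```c
/* Sketch only: escape_nul and the "^@" notation are illustrative.  */
#include <stdio.h>

/* Copy IN to OUT, writing the two characters '^' '@' in place of each
   NUL byte.  Return 0 on success, -1 on a read or write error.  */
static int
escape_nul (FILE *in, FILE *out)
{
  int c;

  while ((c = getc (in)) != EOF)
    {
      if (c == '\0')
        {
          if (fputs ("^@", out) == EOF)
            return -1;
        }
      else if (putc (c, out) == EOF)
        return -1;
    }
  /* getc returns EOF for both end of file and read errors.  */
  return ferror (in) ? -1 : 0;
}

#ifdef STANDALONE
/* Build with -DSTANDALONE to get a stdin-to-stdout filter.  */
int
main (void)
{
  if (escape_nul (stdin, stdout) != 0 || fflush (stdout) == EOF)
    return 1;
  return 0;
}
#endif
```

Note that the mapping is not reversible: a literal "^@" already present in
the input looks the same as an escaped NUL afterwards, so this only helps
if that sequence cannot otherwise appear in the expected output.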
The Solaris man pages mention that /usr/xpg4/bin/tr can handle NUL bytes, but
not /usr/bin/tr; maybe I could search for an adequate tr, and change all NUL
to some other byte that does not otherwise appear in my expected output (with
the added benefit that diff might not give up early with the complaint that
the files are binary), but I don't know if that is portable either.
It's probably a safe bet that whatever vendor tool you rely on to
postprocess will do the wrong thing on one machine or another :(
Any suggestions? Is this worth documenting in the autoconf manual?
Certainly, especially since many of the GNU tools *do* endeavour to
handle '\0' input gracefully.
Cheers,
Gary