On 2021-10-14 06:41:30 E. Liddell wrote:
> On Wed, 13 Oct 2021 16:02:14 -0500
> J Leslie Turriff <jlturriff@xxxxxxxx> wrote:
> > On 2021-10-13 13:07:13 E. Liddell wrote:
> > > On Wed, 13 Oct 2021 16:46:20 +0000
> > >
> > > That being said, test 9 is a raw grep being performed on an XML file.
> > > This means that it could easily be latching onto something in a
> > > comment, because following the full XML spec for determining whether a
> > > given line is inside a comment or not using a simple text-matching tool
> > > is . . . well, let's say it isn't something I'd want to try, and I deal
> > > in regexes a fair amount in my day job. It really needs to be run
> > > through a full parser that constructs a DOM tree.
> >
> > Filter to throw away comments first, then filter for what it should look
> > for.
>
> Correctly throwing away comments isn't as simple as tossing away everything
> between a start marker and an end marker, though, because if the comment
> marker is inside a CDATA section, it doesn't actually affect whether or not
> the text is a comment. I suspect a comment marker found between quotes in
> a text-format attribute value doesn't count either, but I'd have to check
> the spec to be sure. And there may be more quirks that I've forgotten.
> (Oh, and you could *easily* embed the value the grep expression is looking
> for in the file without triggering the grep by using CDATA, now that I
> think about it.)
>
> There's a reason that man perlfaq6 contains the following:
>
> How do I match XML, HTML, or other nasty, ugly things with a regex?
> Do not use regexes. Use a module and forget about the regular
> expressions.
>
> E. Liddell

Yeah. I know little about it, but IIRC, XML was supposed to make
everything so much easier...
:-D

Leslie
-- 
Operating System: Linux
Distribution: openSUSE Leap 15.3 x86_64
Desktop Environment: Trinity
Qt: 3.5.0
TDE: R14.0.10
tde-config: 1.0
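
The comment and CDATA pitfalls described in the quoted text can be shown with a small sketch. It is only an illustration: the XML snippet, the element and attribute names ("option", "autologin", "banner") and the helper functions are all made up for this example and have nothing to do with the actual file that test 9 inspects. It uses Python's standard xml.etree.ElementTree, which by default discards comments and folds CDATA sections into ordinary text.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical document; every name and value here is invented for illustration.
DOC = """<config>
  <!-- <option name="autologin">true</option> -->
  <option name="autologin">false</option>
  <note><![CDATA[<!-- not a comment, just text -->]]></note>
  <option name="banner"><![CDATA[en]]>abled</option>
</config>
"""


def grep_lines(pattern, text):
    """Line-oriented regex match, roughly what a raw grep does."""
    rx = re.compile(pattern)
    return [line.strip() for line in text.splitlines() if rx.search(line)]


def option_values(text):
    """Parse the document and return {name: text} for each real <option>."""
    root = ET.fromstring(text)
    # Commented-out elements never appear in the tree; CDATA becomes plain text.
    return {opt.get("name"): "".join(opt.itertext())
            for opt in root.iter("option")}


if __name__ == "__main__":
    # False positive: the only hit is the commented-out line.
    print(grep_lines('autologin">true', DOC))
    # False negative: the value is split by a CDATA section, so grep misses it.
    print(grep_lines("enabled", DOC))
    # The parser reports what the document actually says.
    print(option_values(DOC))
```

With this input the text match reports the commented-out autologin line and finds nothing for "enabled", while the parsed result has autologin as "false" and banner as "enabled". The comment marker inside the CDATA section in <note> is treated as plain text by the parser, which is the case a naive "strip everything between the markers" filter would get wrong.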