> I have an HTML document at home (the Firefox bookmarks
> output) that I was trying to parse this morning, with many
> links such as:
> <a href="http://www.aximsite.com/articles/link.php?id=22"
> add_date="1130275531" last_charset="windows-1252"
> id="rdf:#$FOQot">Knowledge Base: What Can I Do With My Axim?</a>
>
> I want to parse it to remove the attributes add_date,
> last_charset, id, and others that are in other entries. The
> text of the file it produces is 22 KB but the HTML is over 500 KB!
>
> I don't have the code with me that I was trying (I'm not at
> home now, but it has been nagging me all day), but I was
> running into problems and could NOT get it to just remove the
> attributes. One regex solution left me with <a></a> and
> others with <>, <a href="http://address"
> ="something" ="something else">blahblah</a>, etc...
>
> How can I get it to remove, say, attribute xxx and the ="something"
> that follows it? In all fairness I am not good at regexes and
> need to practice, but I would appreciate any help I can get.
> I'm really bogged down with studies and simply cannot devote
> a full day to this 'trivial' exercise.

I'd forget regexps and use a SAX-style parser. It is then trivial to
remove the unwanted attributes.

Jared

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
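
To illustrate the SAX-style suggestion above, here is a minimal sketch of an event-driven attribute stripper. It is written in Python with the standard library's html.parser purely for illustration; it is not code from the original thread. The names AttributeStripper, strip_attributes, and UNWANTED are hypothetical, and the attribute list is an assumption taken from the example link in the question. The parser reports each tag with its attributes already split out, so dropping an attribute is a simple filter rather than a regex.

```python
from html import escape
from html.parser import HTMLParser

# Attributes to drop. This set is an assumption based on the example link;
# extend it with whatever else appears in the bookmarks file.
UNWANTED = {"add_date", "last_charset", "id"}


class AttributeStripper(HTMLParser):
    """Re-emit the HTML it is fed, omitting the unwanted attributes."""

    def __init__(self, unwanted):
        # convert_charrefs=False routes entities such as &amp; to the
        # handle_entityref/handle_charref callbacks so they can be
        # written back unchanged.
        super().__init__(convert_charrefs=False)
        self.unwanted = {name.lower() for name in unwanted}
        self.out = []

    def _open_tag(self, tag, attrs, closer=">"):
        # Tag and attribute names arrive lowercased from the parser.
        parts = [tag]
        for name, value in attrs:
            if name in self.unwanted:
                continue  # skip the attribute together with its value
            if value is None:
                parts.append(name)  # valueless attribute, e.g. "checked"
            else:
                # Values arrive unescaped, so escape them again on output.
                parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closer)

    def handle_starttag(self, tag, attrs):
        self._open_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._open_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def strip_attributes(html, unwanted=UNWANTED):
    """Return html with every attribute named in unwanted removed."""
    parser = AttributeStripper(unwanted)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling strip_attributes() on the example link from the question should leave only the href attribute and the link text. Note that this sketch writes tag and attribute names back in lowercase, because that is how the parser reports them.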