I use this: http://simplehtmldom.sourceforge.net/

Check it out.

Thanks,
Vikash Kumar
--
http://vika.sh

On Sat, Apr 3, 2010 at 8:28 PM, Ashley Sheridan <ash@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Sat, 2010-04-03 at 10:29 -0400, tedd wrote:
>
> > Hi gang:
> >
> > Here's the problem.
> >
> > I have 184 HTML pages in a directory and each page contains a
> > question. The question is noted in the HTML DOM like so:
> >
> > <p class="question">
> > Who is Roger Rabbit?
> > </p>
> >
> > My question is -- how can I extract the string "Who is Roger Rabbit?"
> > from each page using PHP? You see, I want to store the questions in a
> > database without having to re-type, or cut/paste, each one.
> >
> > Now, I can extract each question by using javascript --
> >
> > document.getElementById("question").innerHTML;
> >
> > -- and stepping through each page, but I don't want to use javascript
> > for this.
> >
> > I have not found/created a working example of this using PHP. I tried
> > using PHP's getElementByID(), but that requires the target file to be
> > valid XML and the string to be contained within an ID and not a
> > class. These pages do not support either requirement.
> >
> > Additionally, I realize that I can load the files and parse out what
> > is between the <p> tags, but I was hoping for a "GetElementByClass"
> > way to do this.
> >
> > So, is there one?
> >
> > Thanks,
> >
> > tedd
> > --
> > -------
> > http://sperling.com http://ancientstones.com http://earthstones.com
>
> I don't think there is a getElementsByClass function. HTML5 is proposing
> one, but that will most likely be implemented in Javascript before PHP
> DOM. There is a way to tidy up the HTML to make it XHTML, but I'm not
> sure what it is. If you know roughly where in the document the HTML
> snippet is, you can use XPath to grab it.
>
> Failing that, what about a regex? It shouldn't be too hard to write a
> regex to match your example above.
>
> Thanks,
> Ash
> http://www.ashleysheridan.co.uk
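
For the XPath route Ash mentions, here is a minimal sketch using PHP's bundled DOM extension rather than the library linked above. It assumes the pages load with DOMDocument::loadHTML(), which tolerates markup that is not valid XML. The function name extractQuestions() and the sample markup are made up for illustration, so treat this as a starting point rather than a tested solution for the real 184 pages.

```php
<?php
// Sketch: return the text of every <p class="question"> in an HTML string.
// extractQuestions() is a hypothetical helper name, not part of PHP.
function extractQuestions(string $html): array
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them;
    // real-world HTML usually triggers a few.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Match "question" as a whole word in the class attribute, so that
    // class="question first" matches but class="questionable" does not.
    $query = '//p[contains(concat(" ", normalize-space(@class), " "), " question ")]';

    $questions = [];
    foreach ($xpath->query($query) as $node) {
        // textContent drops any nested tags; trim() removes the
        // surrounding newlines from the markup.
        $questions[] = trim($node->textContent);
    }
    return $questions;
}

// Sample markup modelled on the snippet in the original question.
$html = '<html><body><p class="question">
Who is Roger Rabbit?
</p></body></html>';

print_r(extractQuestions($html));
```

To cover the whole directory, one option would be to loop over the files with glob(), read each one with file_get_contents(), and pass the contents to the function before inserting the results into the database.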