On 03.04.2010 17:17, Ashley Sheridan wrote:
> On Sat, 2010-04-03 at 17:03 +0200, dispy wrote:
>
>> On 03.04.2010 16:29, tedd wrote:
>>
>>> Hi gang:
>>>
>>> Here's the problem.
>>>
>>> I have 184 HTML pages in a directory and each page contains a question.
>>> The question is noted in the HTML DOM like so:
>>>
>>> <p class="question">
>>> Who is Roger Rabbit?
>>> </p>
>>>
>>> My question is -- how can I extract the string "Who is Roger Rabbit?"
>>> from each page using php? You see, I want to store the questions in a
>>> database without having to re-type, or cut/paste, each one.
>>>
>>> Now, I can extract each question by using javascript --
>>>
>>> document.getElementById("question").innerHTML;
>>>
>>> -- and stepping through each page, but I don't want to use javascript
>>> for this.
>>>
>>> I have not found/created a working example of this using PHP. I tried
>>> using PHP's getElementByID(), but that requires the target file to be
>>> valid xml and the string to be contained within an ID and not a class.
>>> These pages do not support either requirement.
>>>
>>> Additionally, I realize that I can load the files and parse out what is
>>> between the <p> tags, but I was hoping for a "GetElementByClass" way to
>>> do this.
>>>
>>> So, is there one?
>>>
>>> Thanks,
>>>
>>> tedd
>>>
>> Why don't you just use REGEX? I don't know any possibility to easily
>> process contents which are not valid XML/XHTML just because there's no
>> library to load such stuff (but correct me if I'm wrong there).
>>
>> I'm not an expert of REGEX, but I think the following would do it:
>> /\<p\s*class\=\"question\"\s*\>(.*)\<\/p\>
>>
>>
>> (my first contribution here, I beg your pardon if something went wrong)
>>
>> Regards,
>>
>> Valentin Dreismann
>>
>>
>
> The . won't match new line characters, so you'll have to add those in
> too.
>
> Thanks,
> Ash
> http://www.ashleysheridan.co.uk
>

The . does match newlines if you add the s (PCRE_DOTALL) modifier to the
pattern.
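For illustration, here is a minimal sketch of how that could look, assuming
the markup is exactly as tedd posted it. The function name extractQuestion()
and the sample page are made up for this example. Note one change from
Valentin's pattern: once the s modifier is on, a greedy (.*) would run on to
the last </p> in the page, so the sketch uses the lazy (.*?) instead.

```php
<?php
// Hypothetical helper: returns the text of the first
// <p class="question"> element in $html, or null if there is none.
function extractQuestion($html)
{
    // s (PCRE_DOTALL): "." also matches newline characters.
    // i: match the tag and attribute names case-insensitively.
    // (.*?) is lazy, so the match stops at the first closing </p>.
    $pattern = '/<p\s+class="question"\s*>(.*?)<\/p>/si';

    if (preg_match($pattern, $html, $matches)) {
        // Drop the line breaks and spaces around the question text.
        return trim($matches[1]);
    }

    return null;
}

// Made-up sample page; real input would come from file_get_contents().
$html = <<<HTML
<html><body>
<p class="question">
Who is Roger Rabbit?
</p>
<p>Some other paragraph.</p>
</body></html>
HTML;

echo extractQuestion($html), "\n"; // prints: Who is Roger Rabbit?
```

For the 184 pages you could loop over glob('*.html') and pass the result of
file_get_contents() for each file to the function. The modifiers are
documented here: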
http://ch2.php.net/manual/en/reference.pcre.pattern.modifiers.php

Greetz
Piero

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php