Re: GetElementByClass?

dispy <dispyfree@xxxxxxxxxxxxxx> · Sat, 03 Apr 2010 17:03:51 +0200

On 03.04.2010 16:29, tedd wrote:
> Hi gang:
> 
> Here's the problem.
> 
> I have 184 HTML pages in a directory and each page contains a question.
> The question is noted in the HTML DOM like so:
> 
> <p class="question">
>   Who is Roger Rabbit?
> </p>
> 
> My question is -- how can I extract the string "Who is Roger Rabbit?"
> from each page using php? You see, I want to store the questions in a
> database without having to re-type, or cut/paste, each one.
> 
> Now, I can extract each question by using javascript --
> 
> document.getElementById("question").innerHTML;
> 
> -- and stepping through each page, but I don't want to use javascript
> for this.
> 
> I have not found/created a working example of this using PHP. I tried
> using PHP's getElementByID(), but that requires the target file to be
> valid xml and the string to be contained within an ID and not a class.
> These pages do not support either requirement.
> 
> Additionally, I realize that I can load the files and parse out what is
> between the <p> tags, but I was hoping for a "GetElementByClass" way to
> do this.
> 
> So, is there one?
> 
> Thanks,
> 
> tedd

Why don't you just use a regular expression? I don't know of an easy way
to process content that is not valid XML/XHTML, because as far as I know
there's no library that loads such markup (but correct me if I'm wrong).

I'm no regex expert, but I think the following would do it:
/<p\s+class="question"\s*>(.*?)<\/p>/s
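A minimal sketch of applying such a pattern with preg_match(), assuming the markup looks exactly like the sample in the question. The function name extractQuestion() is hypothetical and only for illustration. The pattern used here has a closing delimiter, the "s" modifier so that "." also matches the newlines around the question text, and a non-greedy group so the match stops at the first </p>.

```php
<?php
// Hypothetical helper (not from the original post): returns the trimmed
// question text from one page's HTML, or null if nothing matches.
function extractQuestion(string $html): ?string
{
    // 's' lets '.' match newlines; '(.*?)' is non-greedy and stops at the first </p>.
    $pattern = '/<p\s+class="question"\s*>(.*?)<\/p>/s';

    if (preg_match($pattern, $html, $matches) === 1) {
        // Strip the newline and indentation surrounding the text.
        return trim($matches[1]);
    }

    return null;
}

// Sample markup taken from the question above.
$sample = "<p class=\"question\">\n  Who is Roger Rabbit?\n</p>";
echo extractQuestion($sample), "\n"; // prints: Who is Roger Rabbit?
```

To process all 184 pages, the same function could be called on file_get_contents() for each path returned by glob('*.html'). The pattern is brittle: it assumes class="question" is the only attribute and is written with double quotes. If that proves too fragile, DOMDocument::loadHTML() accepts HTML that is not well-formed, so a DOM query on the class attribute may be another option.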

(This is my first contribution here, so I beg your pardon if something went wrong.)

Regards,

Valentin Dreismann

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php