RE: Using DOM textContent Property

Nathan,
 
Thanks for the suggestion, but it's still not working for me.  Here's my
code:

=========== 
$HTML = new DOMDocument();
@$HTML->loadHTML($text);
$Elements = $HTML->getElementsByTagName("*");

for ($X = 0; $X < $Elements->length; $X++) {
  $Element =  $Elements->item($X);

  if ($Element->tagName == "a") {
    # SNIP - Do something with A tags here
  } else if ($Element instanceof DOMText) {
    echo $Element->nodeValue; exit;
  }
}
=========== 

This loop never reaches the "instanceof DOMText" branch.  If I add:

  } else if ($Element instanceof DOMNode) {
    echo "foo!"; exit;
  }

Then it echoes "foo!" as expected.  It seems that none of the nodes in
the list are DOMText nodes.  In fact, get_class($Element) returns
"DOMElement" for every node in the list.

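A likely explanation, sketched under assumptions: getElementsByTagName()
matches elements by tag name, so "*" selects every element but never a text
node. Text nodes are children of those elements, and can be reached through
childNodes or with an XPath text() query. The sample markup and variable
names below are hypothetical, not taken from the code above.

```php
<?php
// Hypothetical sample markup, used only for illustration.
$html = '<html><body><p>Hello <a href="#">link</a> world</p></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

// getElementsByTagName('*') matches elements only, so every item
// in the list is a DOMElement.
$classes = [];
foreach ($doc->getElementsByTagName('*') as $node) {
    $classes[get_class($node)] = true;
}
echo implode(', ', array_keys($classes)), PHP_EOL;

// Text nodes hang off those elements as children; the XPath text()
// node test selects them directly, in document order.
$xpath = new DOMXPath($doc);
$texts = [];
foreach ($xpath->query('//text()') as $textNode) {
    $texts[] = $textNode->nodeValue;
}
print_r($texts);
```

Here the first loop should report only DOMElement, while the XPath query
collects the three text fragments "Hello ", "link" and " world".
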
Tim Gustafson
SOE Webmaster
UC Santa Cruz
tjg@xxxxxxxxxxxx
831-459-5354



 


________________________________

	From: Nathan Nobbe [mailto:quickshiftin@xxxxxxxxx] 
	Sent: Wednesday, September 03, 2008 11:55 AM
	To: Tim Gustafson
	Cc: php@xxxxxxx; php-general@xxxxxxxxxxxxx
	Subject: Re:  Using DOM textContent Property
	
	
	On Wed, Sep 3, 2008 at 10:03 AM, Tim Gustafson <tjg@xxxxxxxxxxxx> wrote:
	

		> I think you might be better off using regexp on the text
		> *before* sending it through the DOM parser. Send the
		> user's text through a function that searches for URLs
		> and email addresses, creating proper links as they're
		> found, then use the output from that to move on to your
		> DOM stuff. That way, you need not create new nodes in
		> your nodelist.
		
		
		I think that's the way I'm going to have to go, but I was
		really hoping not to.  Thanks for the suggestion!


	I think I have what you're looking for, Tim. Take a look at this
	script output:
	
	nathan@devel ~ $ php testDom.php 
	IN:
	<?xml version="1.0" standalone="yes"?>
	<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
	<html><body>Test<br/><h2>quickshiftin@xxxxxxxxx<a name="bar">stuff inside the link</a>Foo</h2><p>care</p><p>yoyser</p></body></html>
	
	OUT: 
	<?xml version="1.0" standalone="yes"?>
	<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
	<html><body>Test<br/><h2><a href="mailto:quickshiftin@xxxxxxxxx">quickshiftin@xxxxxxxxx</a><a name="bar">stuff inside the link</a>Foo</h2><p>care</p><p>yoyser</p></body></html>
	
	And here's the code, using the DOM extension. You may have to tweak
	it to suit your needs, but I think it currently does the trick ;)
	
	<?php
	$doc = new DOMDocument();
	$doc->loadHTML('<html><body>Test<br><h2>quickshiftin@xxxxxxxxx<a name="bar">stuff inside the link</a>Foo</h2><p>care</p><p>yoyser</p></body></html>');
	echo 'IN:' . PHP_EOL . $doc->saveXML() . PHP_EOL;
	findTextNodes($doc->getElementsByTagName('*'), 'convertToLinkIfNecc');
	echo 'OUT: ' .  PHP_EOL . $doc->saveXML() . PHP_EOL;
	
	/**
	 * Run through a DOMNodeList, looking for text nodes. Apply a callback
	 * to all such text nodes that are encountered.
	 */
	function findTextNodes(DOMNodeList $nodesToSearch, $callback) {
	    foreach ($nodesToSearch as $curNode) {
	        if (!$curNode->hasChildNodes()) {
	            continue;
	        }
	        // Iterate over a copy: the callback may replace children, and
	        // childNodes is a live list.
	        foreach (iterator_to_array($curNode->childNodes) as $curChild) {
	            if ($curChild instanceof DOMText) {
	                #echo "TEXT NODE FOUND: " . $curChild->nodeValue . PHP_EOL;
	                /// todo: allow use of hook here
	                call_user_func($callback, $curNode, $curChild);
	            }
	        }
	    }
	}
	
	/**
	 * Determine if a node should be modified, by checking to see if a
	 * child is a text node and the text looks like an email address.
	 * Call a subordinate function to convert the text node into a
	 * mailto anchor DOMElement.
	 */
	function convertToLinkIfNecc(DOMElement $textContainer, DOMText $textNode) {
	    if ((strtolower($textContainer->nodeName) != 'a') &&
	        (filter_var($textNode->nodeValue, FILTER_VALIDATE_EMAIL) !== false)) {
	        convertMailtoToAnchor($textContainer, $textNode);
	    }
	}
	
	/**
	 * Modify a DOMElement that has a DOMText node as a child; create a
	 * DOMElement that represents an a tag, and set the value and href
	 * attribute, so that it acts as a 'mailto' link.
	 */
	function convertMailtoToAnchor(DOMElement $textContainer, DOMText $textNode) {
	    $newNode = new DOMElement('a', $textNode->nodeValue);
	    // A DOMElement created with "new" is read-only until it is attached
	    // to a document, so the attribute is set after replaceChild().
	    $textContainer->replaceChild($newNode, $textNode);
	    $newNode->setAttribute('href', "mailto:{$textNode->nodeValue}");
	}
	
	
	-nathan 
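
A variation on the script above, as a minimal sketch rather than a drop-in
replacement: it gathers the text nodes with XPath into a plain array before
changing anything, and builds the anchor with createElement() so the new node
is writeable from the start. The function name linkifyEmailTextNodes, the
example.com addresses and the ancestor::a test (which skips text anywhere
inside a link, not only directly under one) are assumptions made for
illustration.

```php
<?php
/**
 * Hypothetical helper (not from the original script): wrap every text node
 * whose entire value is an email address in a mailto anchor.
 * Returns the number of text nodes converted.
 */
function linkifyEmailTextNodes(DOMDocument $doc): int
{
    $xpath = new DOMXPath($doc);

    // Copy the matches into a plain array first, so that replacing nodes
    // cannot disturb the iteration. Text already inside an <a> is skipped.
    $textNodes = iterator_to_array(
        $xpath->query('//text()[not(ancestor::a)]'),
        false
    );

    $converted = 0;
    foreach ($textNodes as $textNode) {
        $email = $textNode->nodeValue;
        if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
            continue;
        }
        // createElement() returns a writeable node owned by $doc.
        $anchor = $doc->createElement('a');
        $anchor->setAttribute('href', 'mailto:' . $email);
        $anchor->appendChild($doc->createTextNode($email));
        $textNode->parentNode->replaceChild($anchor, $textNode);
        $converted++;
    }
    return $converted;
}

// Hypothetical input, modelled on the test document above.
$doc = new DOMDocument();
$doc->loadHTML('<html><body><h2>user@example.com<a name="bar">other@example.com</a>Foo</h2><p>care</p></body></html>');
$count = linkifyEmailTextNodes($doc);
echo $count, PHP_EOL;
echo $doc->saveHTML(), PHP_EOL;
```

Like the original, this only converts a text node when its whole value is an
email address; addresses embedded in longer text would still need to be split
out first.
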
	




-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

