Re: DOMElement: td vs. th

Andy Theuninck <gohanman@xxxxxxxxx> · Thu, 11 Mar 2010 15:21:13 -0600

Gotcha, wasn't thinking straight. Turns out it doesn't really have to
be a legal-HTML attribute anyway, so I can just do:
$str = str_replace('<th', '<th fakeattr="blah" ', $str);
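
For anyone finding this thread later, here is a slightly more defensive sketch of the same marker idea. One catch with a plain str_replace() on '<th' is that it also rewrites '<thead', so the sketch uses preg_replace() with a word boundary instead. The function names and the "data-was-th" attribute are made up for illustration; any attribute name that does not already occur in the reports should do.

```php
<?php
// Hypothetical helpers; "data-was-th" is an arbitrary marker attribute.

function markHeaderCells(string $html): string
{
    // \b keeps the pattern from matching <thead>; /i also covers <TH>.
    // Fall back to the untouched input if preg_replace() reports an error.
    return preg_replace('/<th\b/i', '<th data-was-th="1"', $html) ?? $html;
}

function tableToRows(string $html): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // keep parser warnings out of the output
    $dom->loadHTML(markHeaderCells($html));
    libxml_clear_errors();

    $rows = [];
    $table = $dom->getElementsByTagName('table')->item(0);
    if ($table === null) {
        return $rows;                   // no table in the captured HTML
    }

    foreach ($table->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->childNodes as $node) {
            if (!($node instanceof DOMElement)) {
                continue;               // skip whitespace text nodes between cells
            }
            $cells[] = [
                // The marker survives parsing whatever the tag name ends up being.
                'header' => $node->hasAttribute('data-was-th'),
                'text'   => trim($node->textContent),
            ];
        }
        $rows[] = $cells;
    }
    return $rows;
}
```

Each row comes back as a list of cells with a 'header' flag, so the XLS writer can style header cells without relying on tagName. Using a dedicated attribute rather than class also sidesteps the case where a <th> already has its own class.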

On Thu, Mar 11, 2010 at 3:01 PM, Rene Veerman <rene7705@xxxxxxxxx> wrote:
> So in other words, it's the library that you fix with wrapper
> functions, not the reports (which are outside the scope of using the library).
>
> On Thu, Mar 11, 2010 at 9:59 PM, Rene Veerman <rene7705@xxxxxxxxx> wrote:
>> function readyForDOM_report($originalReportAsText) {
>>   // Tag every <th> before parsing so it can be recognised afterwards.
>>   return str_replace('<th', '<th class="transportTH"', $originalReportAsText);
>> }
>>
>> $dom = new DOMDocument();
>> $dom->loadHTML(readyForDOM_report($str));
>> $tables = $dom->getElementsByTagName("table");
>> $rows = $tables->item(0)->getElementsByTagName('tr');
>> foreach ($rows as $row) {
>>   foreach ($row->childNodes as $node) {
>>     if ($node instanceof DOMElement && $node->getAttribute('class') === 'transportTH') {
>>       // $node was a <th> in the original report
>>     }
>>   }
>> }
>>
>> The only problem I foresee is <th>s in your reports already having a
>> class="something" set, which could mess it up. You'd need to check
>> for that. But in that case you can always load the original $str into
>> a second DOM as well, and use the $k's from the nested
>> foreach ($arr as $k=>$v) loops to get to the corresponding node and
>> read its original class name.
>>
>> On Thu, Mar 11, 2010 at 9:52 PM, Andy Theuninck <gohanman@xxxxxxxxx> wrote:
>>> I could, but that would kind of defeat the point of the project.
>>> I'm trying to capture a bunch of existing HTML reports via output
>>> buffering and transform the tables into proper XLS. Tweaking every
>>> single report is exactly what I'm trying to avoid.
>>>
>>> On Thu, Mar 11, 2010 at 2:45 PM, Rene Veerman <rene7705@xxxxxxxxx> wrote:
>>>> Hmm, lame bug... but could you add a classname to the <th>s and check for that?
>>>>
>>>> On Thu, Mar 11, 2010 at 9:34 PM, Andy Theuninck <gohanman@xxxxxxxxx> wrote:
>>>>> I'm trying to parse a string containing an HTML table using the
>>>>> built-in DOM classes and I'm running into an odd problem.
>>>>>
>>>>> Here's what I'm doing:
>>>>> $dom = new DOMDocument();
>>>>> $dom->loadHTML($str);
>>>>> $tables = $dom->getElementsByTagName("table");
>>>>> $rows = $tables->item(0)->getElementsByTagName('tr');
>>>>> foreach ($rows as $row) {
>>>>>    foreach ($row->childNodes as $node) {
>>>>>         // stuff
>>>>>    }
>>>>> }
>>>>>
>>>>> This gives me the row elements in order and access to their contents.
>>>>> The weird part is that $node always appears to be a td tag, even when
>>>>> it's a th tag in the original string: DOMElement::tagName is always
>>>>> "td", and so are DOMNode::nodeName and DOMNode::localName. The th tags
>>>>> definitely aren't being omitted; I still get nodes with their
>>>>> contents, just with the wrong tag name.
>>>>>
>>>>> Is there any way to override this behavior so that I can distinguish
>>>>> between td tags and th tags?
>>>>>
>>>>> --
>>>>> PHP General Mailing List (http://www.php.net/)
>>>>> To unsubscribe, visit: http://www.php.net/unsub.php
