Here's my function -
private function filterAttributes($node) {
    // filters the attribute names and content
    $attributes = $node->attributes;
    foreach ($attributes as $attribute) {
        // allow colon as it is used in namespace attributes -
        // needs to be tested though, may require different handling??
        // I should get a MathML document and try it out.
        $pattern = '/[^a-z0-9:-]+/i';
        $clean = strtolower(preg_replace($pattern,'',$attribute->name));
        if (strcmp($clean,$attribute->name) != 0) {
            $this->policyReport("Invalid Attribute Name");
        }
        $saniAtt[] = $clean;
        if (strcmp($clean,"value") != 0) {
            if ($clean == "src") {
                $saniVal[] = $this->obfus($attribute->value,1);
            } elseif ($clean == "data") {
                $saniVal[] = $this->obfus($attribute->value,1);
            } elseif ($clean == "code") {
                $saniVal[] = $this->obfus($attribute->value,1);
            } else {
                $saniVal[] = $this->obfus($attribute->value);
            }
        } else {
            // do not alter value attributes
            $saniVal[] = $attribute->value;
        }
        $oldAtt[] = $attribute->name;
    }
    if (isset($oldAtt)) {
        for ($i=0; $i<sizeof($oldAtt);$i++) {
            $node->removeAttribute($oldAtt[$i]);
        }
    }
    if (isset($saniAtt)) {
        for ($i=0; $i<sizeof($saniAtt);$i++) {
            $check = " " . $saniAtt[$i] . " ";
            if (substr_count($this->blacklist, $check) == 0) {
                $node->setAttribute($saniAtt[$i],$saniVal[$i]);
            } else {
                $string = "Blacklisted Event Attribute: " . $saniAtt[$i];
                $this->policyReport($string);
            }
        }
    }
}
(entire class here - http://www.clfsrpm.net/xss/cspfilter_class.phps)
Here's the problem -
$attributes = $node->attributes;
creates a list that has both regular attributes and namespaced
attributes, but I don't know how to tell them apart programmatically.
More specifically, when the attribute involves a namespace, e.g. xml:lang -
$node->removeAttribute($oldAtt[$i]);
doesn't remove it.
$node->setAttribute($saniAtt[$i],$saniVal[$i]);
creates a new attribute WITHOUT the namespace.
So if we have
xml:lang="something"
after the function is run, the result is that there is an additional
attribute lang="filtered something"
but xml:lang remains with the unfiltered attribute content.
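Here's a stripped-down reproduction of that behaviour outside the class. It assumes the markup is parsed with DOMDocument::loadXML; the sample element and the "filtered" value are made up for illustration and stand in for what obfus() would return.

```php
<?php
// Minimal reproduction; the document and values are illustrative only.
$doc = new DOMDocument();
$doc->loadXML('<p xml:lang="something">A Paragraph</p>');
$node = $doc->documentElement;

// Collect the names first, as the function above does.
$names = array();
foreach ($node->attributes as $attribute) {
    // Local part only ("lang"); the xml: prefix is not included.
    $names[] = $attribute->localName;
}

foreach ($names as $name) {
    // Looks for a non-namespaced "lang", so xml:lang is left alone.
    $node->removeAttribute($name);
    // Adds a new, non-namespaced "lang" next to xml:lang.
    $node->setAttribute($name, 'filtered');
}

echo $doc->saveXML($node), "\n";
```

Running this should leave the element with both attributes, which is the same thing I see from the filter.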
If I knew a way to tell whether or not an attribute was namespaced, I
could deal with it by using $node->removeAttributeNS and
$node->setAttributeNS for those attributes, but I don't know how to tell
them apart programmatically.
It seems that when the attribute is foo:bar, $attribute->name just
returns bar, and I can't tell whether it was originally foo:bar, xml:bar,
freak:bar, or just plain bar.
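One thing that might be worth trying: going by the DOMNode property list, each attribute node also carries prefix, namespaceURI, localName and nodeName, which could be enough to tell the two cases apart. A rough sketch, not tested against the filter class; the function name replaceAttribute and the sample document are hypothetical, and "filtered" stands in for whatever obfus() would return.

```php
<?php
// Sketch only: replaceAttribute() is a hypothetical helper, not part of the class.
function replaceAttribute(DOMElement $node, DOMAttr $attribute, $newValue) {
    // Read everything needed before the attribute is detached.
    $uri   = $attribute->namespaceURI;  // null for a plain attribute
    $local = $attribute->localName;     // "bar" for both bar and foo:bar
    $qname = $attribute->nodeName;      // "foo:bar" when a prefix is present

    if ($uri !== null && $uri !== '') {
        // Namespaced: remove and re-create within the same namespace.
        $node->removeAttributeNS($uri, $local);
        $node->setAttributeNS($uri, $qname, $newValue);
    } else {
        // Plain attribute: the non-namespaced calls are sufficient.
        $node->removeAttribute($local);
        $node->setAttribute($local, $newValue);
    }
}

$doc = new DOMDocument();
$doc->loadXML('<p xml:lang="something" title="plain">A Paragraph</p>');
$node = $doc->documentElement;

// Copy the list first (without keys, since keys are local names and could
// collide) so the attributes are not modified while iterating.
$attributes = iterator_to_array($node->attributes, false);
foreach ($attributes as $attribute) {
    replaceAttribute($node, $attribute, 'filtered');
}

echo $doc->saveXML($node), "\n";
```

If this holds up, the same namespaceURI check could decide between the plain and the NS variants inside filterAttributes, with the blacklist test applied to the local name.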
The extremely sparse documentation in the php manual on this area isn't
exactly helping me figure it out.
Any help would be appreciated.
To see the problem -
http://www.clfsrpm.net/xss/dom_script_test.php
Put
<p xml:bar = "javascript:something else">A Paragraph</p>
into the textarea and hit submit - and you'll see what the function does
with the attribute.