Re: How to best handle NULL return from xmlNodeGetContent()

Laine Stump <laine@xxxxxxxxxx> · Wed, 8 Jul 2020 13:14:35 -0400

On 7/8/20 4:35 AM, Daniel P. Berrangé wrote:
On Tue, Jul 07, 2020 at 06:48:57PM -0400, Laine Stump wrote:
libvirt has several uses of xmlNodeGetContent() (from libxml2) added at
different times over the years. Some of those uses report an Out of Memory
error when xmlNodeGetContent() returns NULL, and some of them ignore a NULL
return (treating it as if it were ""), and some just assume that the return
will never be NULL, but always at least a pointer to "".

I ran across this when I noticed a usage of the latter type - it wasn't
checking for NULL at all. A lack of check seemed troubling, so I looked at
other uses within libvirt and found the hodge-podge described above, so no
help there in determining the right thing to do. I then looked at the
libxml2 documentation for xmlNodeGetContent(), which says:

   Returns: a new #xmlChar * or NULL if no content is available.

To an uninformed outsider, this sounds like the function could return NULL
simply if the node was empty (e.g. "<wwn/>"). But when we look at the return
from xmlNodeGetContent() for this example, it says that the content is "",
not NULL.

In the meantime, since libxml doesn't abort on OOM errors (as libvirt does),
it could also be possible that it's returning NULL due to OOM. So using
anecdotal evidence acquired so far, one *could* surmise that any time
libvirt gets a NULL return from xmlNodeGetContent(), it is indeed an OOM
error.
Looking at the source for xmlNodeGetContent() that is definitely not
the case.

The purist in me thinks that isn't right, though - I took a quick look at
the libxml code and saw cases where  it returns NULL that don't seem related
to OOM, but rather to the type of node or something. But being an outsider
and not wanting to learn any more than necessary about the internals of
libxml, I'm not sure if any of those cases even apply to libvirt's simple
use of xmlNodeGetContent().
I think we have to expect that the node type will not match what we
want it to be, when faced with malicious XML user input.

So, in the end I just want to modify libvirt's dozen or so calls to
xmlNodeGetContent() to consistently do the right thing, but first I want to
learn the true answers to these questions:

1) Keeping in mind that we've already successfully parsed the XML, will
calls to xmlNodeGetContent() in the simple cases as when libvirt calls it
only return NULL for OOM, but not for any other reason?
No, there's other reasons it will return NULL that could hit us.

Okay, I kind of figured that was the case, and asked the question as a 
strawman.

2) If not, is the proper way to distinguish OOM in this case to call
xmlGetLasterror(), and check if the domain is XML_FROM_MEMORY?
There are a bunch of cases in xmlNodeGetContent() which return NULL
without ever bothering to update the last error indicator.

Sigh. Okay.

  Also
xmlGetLastError is not using a thread local, so it is completely
unsafe to use it in any modern app that has threads :-(

Well, the documentation says this:

  "Get the last global error registered. This is

   per thread if compiled with thread support."

and I certainly *hope* that libxml2 is being compiled everywhere with 
support for multiple threads (if it's not, then we could very well have 
bigger problems than just improper error handling). But that's a moot 
point if it's not being set anyway :-/

3) Aside from returning NULL in the case of errors, would it ever be
possible for correct XML to return NULL as valid "node content", or is it
always an error of some kind?
It does look like it only happens in error, but many of the users
look like that can be triggered from user chosen input, so we must
not abort.

Right. The only case where I would think of aborting is OOM. In any 
other case we have to log some sort of message to let the user know how 
to avoid the same problem next time.

Since libvirt now aborts on OOM, an OOM error could be handled in one place
by a wrapper function around xmlNodeGetContent() (we already have such a
function, currently a one-liner passthrough, and not called by everyone).
But if there is any chance that any other libxml error could be encountered,
then I suppose we really should be reporting those without aborting, and
then still checking for NULL on return from the wrapper function (presumably
by just logging the contents of "message" from the xmlErrorPtr returned from
xmlGetLastError().
xmlNodeGetContent() looks unusably broken to me in terms of error handling.

I notice however that the xmlNodePtr struct contents is actually fully
public in the API:

    http://www.xmlsoft.org/html/libxml-tree.html#xmlNode

Given this, we can avoid xmlNodeGetContent entirely and just access the
node->content field directly, after first validating node->type.

Ah, right. I recall seeing at least one place in libvirt where we 
already do that. When I saw it I thought that was very bad style, since 
there's an official API function to get the value. But if the official 
API function is broken, and the struct in question is public in the API 
anyway, then that seems like the least worst thing to do.

  This
would give us reliable error handling. This is probably useful to replace
other libxml functions too -eg  xmlGetProp looks like it has the same
problems with error reporting

Yeah, I was wondering about xmlGetProp() too, but didn't want to muddy 
up the conversation with too many details right away.

I'll look into rewriting virXMLNodeContentString to directly access 
node->content, then change all the callers of xmlNodeGetContent() to use 
that as well as properly handling a NULL return.