Suboptimal error handling in libnftables

Eugene Crosser <crosser@xxxxxxxxxxx> · Thu, 2 Dec 2021 14:16:12 +0100

Hello,

there is a read-from-the-socket loop in src/iface.c line 90 (function
iface_cache_update()), and it (like other places) calls the macro
netlink_init_error() to report errors. The function behind the macro is
in src/netlink.c line 81, and it calls exit(NFT_EXIT_NONL) after writing
a message to stderr.

I see two problems with this:

1. All read-from-the-socket functions should be run in a loop, repeating
if the return code is -1 and errno is EINTR. I.e. EINTR should not be
treated as an error, but as a condition that requires a retry.

2. Library functions are not supposed to call exit() (or abort() for
that matter). They are expected to return an error indication to the
caller, who may have its own strategy for handling error conditions.
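
To illustrate both points, here is a minimal sketch. The names
recv_retry() and example_cache_update() are hypothetical and do not
exist in libnftables; the sketch only assumes a plain recv() on a socket
file descriptor, not the actual libmnl-based code in src/iface.c.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of libnftables): receive from a socket,
 * retrying for as long as the call is interrupted by a signal. */
ssize_t recv_retry(int fd, void *buf, size_t len)
{
	ssize_t n;

	do {
		n = recv(fd, buf, len, 0);
	} while (n < 0 && errno == EINTR);

	return n;
}

/* Hypothetical library entry point: report failure to the caller
 * instead of calling exit(). Returns 0 on success and stores the
 * number of bytes received in *out_len; returns -1 on failure, with
 * errno as set by the failing recv(). */
int example_cache_update(int fd, char *buf, size_t len, size_t *out_len)
{
	ssize_t n = recv_retry(fd, buf, len);

	if (n < 0)
		return -1;	/* errno is left intact for the caller */

	*out_len = (size_t)n;
	return 0;
}
```

With something along these lines, the caller (for example, a language
binding) can turn the -1/errno pair into its own error type instead of
having the whole process terminated.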

Case in point, we have a daemon (in Python) that uses bindings to
libnftables. It's a service responding to requests coming over a TCP
connection, and it takes care to intercept any error situations and
report them back. We discovered that under some conditions, it just
closes the socket and goes away. This being a daemon, stderr was not
immediately accessible; and even if it were, it is pretty hard to figure
out where the message "iface.c:98: Unable to initialize Netlink socket:
Interrupted system call" came from and why!

There is another function that calls exit(), __netlink_abi_error(). I
believe that even in such a harsh situation, exit() is not the right way
to handle it.

Thank you,

Eugene