Re: Suboptimal error handling in libnftables

Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> · Mon, 6 Dec 2021 20:56:18 +0100

On Mon, Dec 06, 2021 at 05:58:25PM +0100, Eugene Crosser wrote:
> On 02/12/2021 14:54, Pablo Neira Ayuso wrote:
> > On Thu, Dec 02, 2021 at 02:16:12PM +0100, Eugene Crosser wrote:
> >> Hello,
> >>
> >> there is read-from-the-socket loop in src/iface.c line 90 (function
> >> iface_cache_update()), and it (and other places) call macro
> >> netlink_init_error() to report error. The function behind the macro is
> >> in src/netlink.c line 81, and it calls exit(NFT_EXIT_NONL) after writing
> >> a message to stderr.
> >>
> >> I see two problems with this:
> >>
> >> 1. All read-from-the-socket functions should be run in a loop, repeating
> >> if return code is -1 and errno is EINTR. I.e. EINTR should not be
> >> treated as an error, but as a condition that requires retry.
> >>
> >> 2. Library functions are not supposed to call exit() (or abort() for
> >> that matter). They are expected to return an error indication to the
> >> caller, who may have its own strategy for handling error conditions.
> >>
> >> Case in point, we have a daemon (in Python) that uses bindings to
> >> libnftables. It's a service responding to requests coming over a TCP
> >> connection, and it takes care to intercept any error situations and
> >> report them back. We discovered that under some conditions, it just
> >> closes the socket and goes away. This being a daemon, stderr was not
> >> immediately accessible; and even it it were, it is pretty hard to figure
> >> where did the message "iface.c:98: Unable to initialize Netlink socket:
> >> Interrupted system call" come from and why!
> > 
> > This missing EINTR handling for iface_cache_update() is a bug, would
> > you post a patch for this?
> 
> It looks like there is more than just a missing retry when
> socket-receive returns EINTR. In the code in src/iface.c between lines
> 87 and 98, EINTR may come from one of two functions:
> mnl_socket_recvfrom() and mnl_cb_run(). If it is returned by
> mnl_socket_recvfrom(), the correct course of action is to blindly call
> that function again. But when it is returned by mnl_cb_run(), the
> meaning is different. mnl_cb_run() retruns -1 and sets errno=EINTR when
> netlink message contained NLM_F_DUMP_INTR. I assume (though I am not
> sure) that NLM_F_DUMP_INTR means that the data that is being transferred
> has changed while transfer was only partially complete, and the user is
> advised to restart _the whole dump process_ by sending a new NLM_F_DUMP
> request message. (Arguably, libmnl ought to report such situation with
> some other indication, not EINTR.) In any case, I believe that the
> aforementioned code should handle both of these two "need to retry" cases.
> 
> I our tests it looks like we are hitting NLM_F_DUMP_INTR rather then
> interrupted socket recv(). I will report back after this is verified.

NLM_F_DUMP_INTR means that the existing netlink dump is stale (an
update happened while dumping the listing to userspace), so userspace
has to restart to get a consistent listing from the kernel.

Yes, in both cases, either signal interruped syscall or
NLM_F_DUMP_INTR, libnftables should retry.
Attachment:
signature.asc

Description: PGP signature