On Mon, Dec 06, 2021 at 05:58:25PM +0100, Eugene Crosser wrote: > On 02/12/2021 14:54, Pablo Neira Ayuso wrote: > > On Thu, Dec 02, 2021 at 02:16:12PM +0100, Eugene Crosser wrote: > >> Hello, > >> > >> there is read-from-the-socket loop in src/iface.c line 90 (function > >> iface_cache_update()), and it (and other places) call macro > >> netlink_init_error() to report error. The function behind the macro is > >> in src/netlink.c line 81, and it calls exit(NFT_EXIT_NONL) after writing > >> a message to stderr. > >> > >> I see two problems with this: > >> > >> 1. All read-from-the-socket functions should be run in a loop, repeating > >> if return code is -1 and errno is EINTR. I.e. EINTR should not be > >> treated as an error, but as a condition that requires retry. > >> > >> 2. Library functions are not supposed to call exit() (or abort() for > >> that matter). They are expected to return an error indication to the > >> caller, who may have its own strategy for handling error conditions. > >> > >> Case in point, we have a daemon (in Python) that uses bindings to > >> libnftables. It's a service responding to requests coming over a TCP > >> connection, and it takes care to intercept any error situations and > >> report them back. We discovered that under some conditions, it just > >> closes the socket and goes away. This being a daemon, stderr was not > >> immediately accessible; and even it it were, it is pretty hard to figure > >> where did the message "iface.c:98: Unable to initialize Netlink socket: > >> Interrupted system call" come from and why! > > > > This missing EINTR handling for iface_cache_update() is a bug, would > > you post a patch for this? > > It looks like there is more than just a missing retry when > socket-receive returns EINTR. In the code in src/iface.c between lines > 87 and 98, EINTR may come from one of two functions: > mnl_socket_recvfrom() and mnl_cb_run(). If it is returned by > mnl_socket_recvfrom(), the correct course of action is to blindly call > that function again. But when it is returned by mnl_cb_run(), the > meaning is different. mnl_cb_run() retruns -1 and sets errno=EINTR when > netlink message contained NLM_F_DUMP_INTR. I assume (though I am not > sure) that NLM_F_DUMP_INTR means that the data that is being transferred > has changed while transfer was only partially complete, and the user is > advised to restart _the whole dump process_ by sending a new NLM_F_DUMP > request message. (Arguably, libmnl ought to report such situation with > some other indication, not EINTR.) In any case, I believe that the > aforementioned code should handle both of these two "need to retry" cases. > > I our tests it looks like we are hitting NLM_F_DUMP_INTR rather then > interrupted socket recv(). I will report back after this is verified. NLM_F_DUMP_INTR means that the existing netlink dump is stale (an update happened while dumping the listing to userspace), so userspace has to restart to get a consistent listing from the kernel. Yes, in both cases, either signal interruped syscall or NLM_F_DUMP_INTR, libnftables should retry.
Attachment:
signature.asc
Description: PGP signature