On Thu, Dec 02, 2021 at 02:16:12PM +0100, Eugene Crosser wrote: > Hello, > > there is read-from-the-socket loop in src/iface.c line 90 (function > iface_cache_update()), and it (and other places) call macro > netlink_init_error() to report error. The function behind the macro is > in src/netlink.c line 81, and it calls exit(NFT_EXIT_NONL) after writing > a message to stderr. > > I see two problems with this: > > 1. All read-from-the-socket functions should be run in a loop, repeating > if return code is -1 and errno is EINTR. I.e. EINTR should not be > treated as an error, but as a condition that requires retry. > > 2. Library functions are not supposed to call exit() (or abort() for > that matter). They are expected to return an error indication to the > caller, who may have its own strategy for handling error conditions. > > Case in point, we have a daemon (in Python) that uses bindings to > libnftables. It's a service responding to requests coming over a TCP > connection, and it takes care to intercept any error situations and > report them back. We discovered that under some conditions, it just > closes the socket and goes away. This being a daemon, stderr was not > immediately accessible; and even it it were, it is pretty hard to figure > where did the message "iface.c:98: Unable to initialize Netlink socket: > Interrupted system call" come from and why! This missing EINTR handling for iface_cache_update() is a bug, would you post a patch for this? > There is another function that calls exit(), __netlink_abi_error(). I > believe that even in such a harsh situation, exit() is not the right way > to handle it. ABI breakage between kernel and userspace should not ever happen.