Re: [PATCH nftables 3/3] src: mnl: retry when we hit -ENOBUFS

Eric Garver <eric@xxxxxxxxxxx> · Mon, 9 Sep 2019 17:11:10 -0400

Hi Florian,

On Tue, Aug 13, 2019 at 10:12:46PM +0200, Florian Westphal wrote:
> tests/shell/testcases/transactions/0049huge_0
> 
> still fails with ENOBUFS error after endian fix done in
> previous patch.  Its enough to increase the scale factor (4)
> on s390x, but rather than continue with these "guess the proper
> size" game, just increase the buffer size and retry up to 3 times.
> 
> This makes above test work on s390x.
> 
> So, implement what Pablo suggested in the earlier commit:
>     We could also explore increasing the buffer and retry if
>     mnl_nft_socket_sendmsg() hits ENOBUFS if we ever hit this problem again.
> 
> v2: call setsockopt unconditionally, then increase on error.
> 
> Signed-off-by: Florian Westphal <fw@xxxxxxxxx>
> ---
>  src/mnl.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mnl.c b/src/mnl.c
> index 97a2e0765189..9c1f5356c9b9 100644
> --- a/src/mnl.c
> +++ b/src/mnl.c
> @@ -311,6 +311,7 @@ int mnl_batch_talk(struct netlink_ctx *ctx, struct list_head *err_list,
>  	int ret, fd = mnl_socket_get_fd(nl), portid = mnl_socket_get_portid(nl);
>  	uint32_t iov_len = nftnl_batch_iovec_len(ctx->batch);
>  	char rcv_buf[MNL_SOCKET_BUFFER_SIZE];
> +	unsigned int enobuf_restarts = 0;
>  	size_t avg_msg_size, batch_size;
>  	const struct sockaddr_nl snl = {
>  		.nl_family = AF_NETLINK
> @@ -320,6 +321,7 @@ int mnl_batch_talk(struct netlink_ctx *ctx, struct list_head *err_list,
>  		.tv_usec	= 0
>  	};
>  	struct iovec iov[iov_len];
> +	unsigned int scale = 4;
>  	struct msghdr msg = {};
>  	fd_set readfds;
>  
> @@ -328,7 +330,8 @@ int mnl_batch_talk(struct netlink_ctx *ctx, struct list_head *err_list,
>  	batch_size = mnl_nft_batch_to_msg(ctx, &msg, &snl, iov, iov_len);
>  	avg_msg_size = div_round_up(batch_size, num_cmds);
>  
> -	mnl_set_rcvbuffer(ctx->nft->nf_sock, num_cmds * avg_msg_size * 4);
> +restart:
> +	mnl_set_rcvbuffer(ctx->nft->nf_sock, num_cmds * avg_msg_size * scale);
>  
>  	ret = mnl_nft_socket_sendmsg(ctx, &msg);
>  	if (ret == -1)
> @@ -347,8 +350,13 @@ int mnl_batch_talk(struct netlink_ctx *ctx, struct list_head *err_list,
>  			break;
>  
>  		ret = mnl_socket_recvfrom(nl, rcv_buf, sizeof(rcv_buf));
> -		if (ret == -1)
> +		if (ret == -1) {
> +			if (errno == ENOBUFS && enobuf_restarts++ < 3) {
> +				scale *= 2;
> +				goto restart;
> +			}

If this restart is triggered it causes rules to be duplicated. We send
the same batch again.

I'm hitting this on x86_64. Maybe we need find a better way to estimate
the rcvbuffer in the case of --echo. By the time we see ENOBUFS we're
already in a bad way - events have already be lost.

>  			return -1;
> +		}
>  
>  		ret = mnl_cb_run(rcv_buf, ret, 0, portid, &netlink_echo_callback, ctx);
>  		/* Continue on error, make sure we get all acknowledgments */
> -- 
> 2.21.0
>