Re: [syzbot] upstream boot error: WARNING in netlink_ack

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 04, 2022 at 10:42:53AM -0700, Jakub Kicinski wrote:
> On Tue, 04 Oct 2022 07:36:55 -0700 Kees Cook wrote:
> > This is fixed in the pending netdev tree coming for the merge window.
> 
> This has been weighing on my conscience a little, I don't like how we
> still depend on putting one length in the skb and then using a
> different one for the actual memcpy(). How would you feel about this
> patch on top (untested):

tl;dr: yes, I like it. Please add a nlmsg_contents member. :)

Rambling below...

> 
> diff --git a/include/net/netlink.h b/include/net/netlink.h
> index 4418b1981e31..6ad671441dff 100644
> --- a/include/net/netlink.h
> +++ b/include/net/netlink.h
> @@ -931,6 +931,29 @@ static inline struct nlmsghdr *nlmsg_put(struct sk_buff *skb, u32 portid, u32 se
>  	return __nlmsg_put(skb, portid, seq, type, payload, flags);
>  }
>  
> +/**
> + * nlmsg_append - Add more data to a nlmsg in a skb
> + * @skb: socket buffer to store message in
> + * @nlh: message header
> + * @payload: length of message payload
> + *
> + * Append data to an existing nlmsg, used when constructing a message
> + * with multiple fixed-format headers (which is rare).
> + * Returns NULL if the tailroom of the skb is insufficient to store
> + * the extra payload.
> + */
> +static inline void *nlmsg_append(struct sk_buff *skb, struct nlmsghdr *nlh,

nlh not needed here?

> +				 u32 size)
> +{
> +	if (unlikely(skb_tailroom(skb) < NLMSG_ALIGN(size)))
> +		return NULL;
> +
> +	if (!__builtin_constant_p(size) || NLMSG_ALIGN(size) - size != 0)

why does a fixed size mean no memset?

> +		memset(skb_tail_pointer(skb) + size, 0,
> +		       NLMSG_ALIGN(size) - size);
> +	return __skb_put(NLMSG_ALIGN(size));
> +}
> +
>  /**
>   * nlmsg_put_answer - Add a new callback based netlink message to an skb
>   * @skb: socket buffer to store message in
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index a662e8a5ff84..bb3d855d1f57 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -2488,19 +2488,28 @@ void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err,
>  		flags |= NLM_F_ACK_TLVS;
>  
>  	skb = nlmsg_new(payload + tlvlen, GFP_KERNEL);
> -	if (!skb) {
> -		NETLINK_CB(in_skb).sk->sk_err = ENOBUFS;
> -		sk_error_report(NETLINK_CB(in_skb).sk);
> -		return;
> -	}
> +	if (!skb)
> +		goto err_bad_put;
>  
>  	rep = nlmsg_put(skb, NETLINK_CB(in_skb).portid, nlh->nlmsg_seq,
> -			NLMSG_ERROR, payload, flags);
> +			NLMSG_ERROR, sizeof(*errmsg), flags);
> +	if (!rep)
> +		goto err_bad_put;
>  	errmsg = nlmsg_data(rep);
>  	errmsg->error = err;
> -	unsafe_memcpy(&errmsg->msg, nlh, payload > sizeof(*errmsg)
> -					 ? nlh->nlmsg_len : sizeof(*nlh),
> -		      /* Bounds checked by the skb layer. */);
> +	memcpy(&errmsg->msg, nlh, sizeof(*nlh));
> +
> +	if (!(flags & NLM_F_CAPPED)) {

Should it test this flag, or test if the sizes show the need for "extra"
payload length?

I always found the progression of sizes here to be confusing. "payload"
starts as sizeof(*errmsg), and gets nlmsg_len(nlh) added but only when if
"(err && !(nlk->flags & NETLINK_F_CAP_ACK)" was true. Why is
nlmsg_len(nlh) _wrong_ if the rest of its contents are correct? If this
was "0" in the other state, the logic would just be:

	nlh_bytes = nlmsg_len(nlh);
	total  = sizeof(*errmsg);
	total += nlh_bytes;
	total += tlvlen;

and:

	nlmsg_new(total, ...);
	... nlmsg_put(..., sizeof(*errmsg), ...);
	...
	errmsg->error = err;
	errmsg->nlh = *nlh;
	if (nlh_bytes) {
		data = nlmsg_append(..., nlh_bytes), ...);
		...
		memcpy(data, nlh->nlmsg_contents, nlh_bytes);
	}

> +		size_t data_len = nlh->nlmsg_len - sizeof(*nlh);

I think data_len here is also "payload - sizeof(*errmsg)"? So if it's >0,
we need to append the nlh contents.

> +		void *data;
> +
> +		data = nlmsg_append(skb, rep, data_len);
> +		if (!data)
> +			goto err_bad_put;
> +
> +		/* the nlh + 1 is probably going to make you unhappy? */

Right, the compiler may think it is an object no larger than sizeof(*nlh).
My earliest attempt at changes here introduced a flex-array for the
contents, and split the memcpy:
https://lore.kernel.org/lkml/d7251d92-150b-5346-6237-52afc154bb00@xxxxxxxxxxxxxxxxxx/
which is basically the solution you have here, except it wasn't having
the nlmsg_*-helpers do the bounds checking.

> +		memcpy(data, nlh + 1, data_len);

So with the struct nlmsghdr::nlmsg_contents member, this becomes:

		memcpy(data, nlh->nlmsg_contents, data_len);

-- 
Kees Cook



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux