Re: [PATCH bpf-next] libbpf: Use dynamically allocated buffer when receiving netlink messages

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 11, 2022 at 11:51 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> When receiving netlink messages, libbpf was using a statically allocated
> stack buffer of 4k bytes. This happened to work fine on systems with a 4k
> page size, but on systems with larger page sizes it can lead to truncated
> messages. The user-visible impact of this was that libbpf would insist no
> XDP program was attached to some interfaces because that bit of the netlink
> message got chopped off.
>
> Fix this by switching to a dynamically allocated buffer; we borrow the
> approach from iproute2 of using recvmsg() with MSG_PEEK|MSG_TRUNC to get
> the actual size of the pending message before receiving it, adjusting the
> buffer as necessary. While we're at it, also add retries on interrupted
> system calls around the recvmsg() call.
>
> Reported-by: Zhiqian Guan <zhguan@xxxxxxxxxx>
> Fixes: 8bbb77b7c7a2 ("libbpf: Add various netlink helpers")
> Signed-off-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
> ---
>  tools/lib/bpf/netlink.c | 55 ++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 52 insertions(+), 3 deletions(-)
>
> diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c
> index c39c37f99d5c..9a6e95206bf0 100644
> --- a/tools/lib/bpf/netlink.c
> +++ b/tools/lib/bpf/netlink.c
> @@ -87,22 +87,70 @@ enum {
>         NL_DONE,
>  };
>
> +static int __libbpf_netlink_recvmsg(int sock, struct msghdr *mhdr, int flags)

let's not use names starting with underscored. Just call it
"netlink_recvmsg" or something like that.

> +{
> +       int len;
> +
> +       do {
> +               len = recvmsg(sock, mhdr, flags);

recvmsg returns ssize_t, is it ok to truncate to int?


> +       } while (len < 0 && (errno == EINTR || errno == EAGAIN));
> +
> +       if (len < 0)
> +               return -errno;
> +       return len;
> +}
> +
> +static int libbpf_netlink_recvmsg(int sock, struct msghdr *mhdr, char **buf)
> +{
> +       struct iovec *iov = mhdr->msg_iov;
> +       void *nbuf;
> +       int len;
> +
> +       len = __libbpf_netlink_recvmsg(sock, mhdr, MSG_PEEK | MSG_TRUNC);
> +       if (len < 0)
> +               return len;
> +
> +       if (len < 4096)
> +               len = 4096;
> +
> +       if (len > iov->iov_len) {
> +               nbuf = realloc(iov->iov_base, len);
> +               if (!nbuf) {
> +                       free(iov->iov_base);
> +                       return -ENOMEM;
> +               }
> +               iov->iov_base = nbuf;

this function both sets iov->iov_base *and* returns buf. It's quite a
convoluted contract. Seems like buf is not necessary (and also NULL
out iov->iov_base in case of error above?). But it might be cleaner to
do this MSG_PEEK  + realloc + recvmsg  in libbpf_netlink_recv()
explicitly. It's only one place.


> +               iov->iov_len = len;
> +       }
> +
> +       len = __libbpf_netlink_recvmsg(sock, mhdr, 0);
> +       if (len > 0)
> +               *buf = iov->iov_base;
> +       return len;
> +}
> +
>  static int libbpf_netlink_recv(int sock, __u32 nl_pid, int seq,
>                                __dump_nlmsg_t _fn, libbpf_dump_nlmsg_t fn,
>                                void *cookie)
>  {
> +       struct iovec iov = {};
> +       struct msghdr mhdr = {
> +               .msg_iov = &iov,
> +               .msg_iovlen = 1,
> +       };
>         bool multipart = true;
>         struct nlmsgerr *err;
>         struct nlmsghdr *nh;
> -       char buf[4096];
>         int len, ret;
> +       char *buf;
> +
>
>         while (multipart) {
>  start:
>                 multipart = false;
> -               len = recv(sock, buf, sizeof(buf), 0);
> +               len = libbpf_netlink_recvmsg(sock, &mhdr, &buf);
>                 if (len < 0) {
> -                       ret = -errno;
> +                       ret = len;
>                         goto done;
>                 }
>
> @@ -151,6 +199,7 @@ static int libbpf_netlink_recv(int sock, __u32 nl_pid, int seq,
>         }
>         ret = 0;
>  done:
> +       free(iov.iov_base);

double free on -ENOMEM? And even more confusing why you bother with
buf at all...

>         return ret;
>  }
>
> --
> 2.35.1
>




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux