Re: [PATCH net-next v2 11/12] net: homa: create homa_plumbing.c homa_utils.c

On Sun, Dec 1, 2024 at 7:51 PM D. Wythe <alibuda@xxxxxxxxxxxxxxxxx> wrote:
> > +int homa_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
> > +                 unsigned int optlen)
> > +{
> > +     struct homa_sock *hsk = homa_sk(sk);
> > +     struct homa_set_buf_args args;
> > +     int ret;
> > +
> > +     if (level != IPPROTO_HOMA || optname != SO_HOMA_SET_BUF ||
> > +         optlen != sizeof(struct homa_set_buf_args))
> > +             return -EINVAL;
>
> SO_HOMA_SET_BUF is a bit odd here, maybe HOMA_RCVBUF ? which also can be
> implemented in getsockopt.

I have changed it to HOMA_RCVBUF (and renamed struct homa_set_buf_args
to struct homa_rcvbuf_args). I also implemented getsockopt for
HOMA_RCVBUF.

> > +
> > +     if (copy_from_sockptr(&args, optval, optlen))
> > +             return -EFAULT;
> > +
> > +     /* Do a trivial test to make sure we can at least write the first
> > +      * page of the region.
> > +      */
> > +     if (copy_to_user((__force void __user *)args.start, &args, sizeof(args)))
> > +             return -EFAULT;
>
> To share buffer between kernel and userspace, maybe you should refer to the implementation of
> io_pin_pbuf_ring()

I'm not sure what you mean here. Are you suggesting that I look at the
code of io_pin_pbuf_ring to make sure I've done everything needed to
share buffers? I don't believe that Homa needs to do anything special
(e.g. it doesn't need to pin the user's buffers); it just saves the
address and makes copy_to_user calls later when needed (and these
calls are all done at syscall level in the context of the
application). Is there something I'm missing?

> > +
> > +     homa_sock_lock(hsk, "homa_setsockopt SO_HOMA_SET_BUF");
> > +     ret = homa_pool_init(hsk, (__force void __user *)args.start, args.length);
> > +     homa_sock_unlock(hsk);
>
> It seems that if the option was not set, poll->hsk will not be initialized
> and then if recvmsg is called, a panic will be triggered.

I fixed homa_pool_check_waiting to properly handle empty pools.

> > +/**
> > + * homa_sendmsg() - Send a request or response message on a Homa socket.
> > + * @sk:     Socket on which the system call was invoked.
> > + * @msg:    Structure describing the message to send; the msg_control
> > + *          field points to additional information.
> > + * @length: Number of bytes of the message.
> > + * Return: 0 on success, otherwise a negative errno.
> > + */
> > +int homa_sendmsg(struct sock *sk, struct msghdr *msg, size_t length)
> > +{
> > +     struct homa_sock *hsk = homa_sk(sk);
> > +     struct homa_sendmsg_args args;
> > +     int result = 0;
> > +     struct homa_rpc *rpc = NULL;
> > +     union sockaddr_in_union *addr = (union sockaddr_in_union *)msg->msg_name;
>
> msg->msg_name can be NULL.

I have added a check for this.

> > +     if (addr->in6.sin6_family != sk->sk_family) {
> > +             result = -EAFNOSUPPORT;
> > +             goto error;
> > +     }
>
> addr->sa.sa_family? Using sin6_family would be odd to me, making it seem like homa can only run with
> IPv6.

The family is in the same place for all protocols so I thought it
would be safe to reference it as in6.sin6_family. But you're right
that it looks odd, so I changed it to addr->sa.sa_family.

> > +             rpc = homa_rpc_new_client(hsk, addr);
> > +             if (IS_ERR(rpc)) {
> > +                     result = PTR_ERR(rpc);
> > +                     rpc = NULL;
> > +                     goto error;
> > +             }
> > +             rpc->completion_cookie = args.completion_cookie;
> > +             result = homa_message_out_fill(rpc, &msg->msg_iter, 1);
> > +             if (result)
> > +                     goto error;
> > +             args.id = rpc->id;
> > +             homa_rpc_unlock(rpc);
>
> Strongly recommend that adding comments with all the unlock part to indicate where it was
> locked from.

I went through the code and added comments in places where the locking
isn't obvious (e.g. where it happens through functions without
"lock" in their name, such as homa_rpc_new_client). I also renamed
homa_try_bucket_lock to homa_try_rpc_lock, which should make the RPC
locking a bit clearer.

> > +             canonical_dest = canonical_ipv6_addr(addr);
>
> Are you treating all addresses as IPv6 addresses just for the sake of simplicity? It's a bit odd,
> but okay to me.

Yes: Homa uses IPv6 addresses internally for both IPv4 and IPv6; this
allows it to support both protocols with only a small amount of
protocol-specific code.

> > +int homa_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
> > +              int *addr_len)
> > +{
> > +     struct homa_sock *hsk = homa_sk(sk);
> > +     struct homa_recvmsg_args control;
> > +     struct homa_rpc *rpc;
> > +     int result;
> > +
> > +     if (unlikely(!msg->msg_control)) {
> > +             /* This test isn't strictly necessary, but it provides a
> > +              * hook for testing kernel call times.
> > +              */
> > +             return -EINVAL;
> > +     }
> > +     if (msg->msg_controllen != sizeof(control)) {
> > +             result = -EINVAL;
> > +             goto done;
>
> Then you copied an uninitialized control into userspace ...
> Is goto really necessary ? Maybe just return.

Good catch; I changed it to just return. In general, I prefer to
funnel everything through a single cleanup-and-return point rather
than returning directly in some cases and jumping to a cleanup point
in others; I think that makes the code cleaner and more obvious. But
that doesn't really work here.

>
> > +     }
> > +     if (unlikely(copy_from_user(&control,
> > +                                 (__force void __user *)msg->msg_control,
> > +                                 sizeof(control)))) {
> > +             result = -EFAULT;
> > +             goto done;
>
> Same as above, is goto really necessary ?

I changed it to return directly.

> > +     }
> > +     control.completion_cookie = 0;
> > +
> > +     if (control.num_bpages > HOMA_MAX_BPAGES ||
> > +         (control.flags & ~HOMA_RECVMSG_VALID_FLAGS)) {
> > +             result = -EINVAL;
> > +             goto done;
> > +     }
> > +     homa_pool_release_buffers(hsk->buffer_pool, control.num_bpages,
> > +                               control.bpage_offsets);
>
> homa_pool_release_buffers() quietly ignores erroneous bpage_index values passed in from the
> userspace. This behavior may obscure more complex issues in userspace, and exposure this problem
> could help users identify issues earlier.

Good point; homa_recvmsg now returns -EINVAL if there are erroneous
bpage_index values passed in.


> > +     control.num_bpages = 0;
> > +
> > +     rpc = homa_wait_for_message(hsk, (flags & MSG_DONTWAIT)
> > +                     ? (control.flags | HOMA_RECVMSG_NONBLOCKING)
> > +                     : control.flags, control.id);
> > +     if (IS_ERR(rpc)) {
> > +             /* If we get here, it means there was an error that prevented
> > +              * us from finding an RPC to return. If there's an error in
> > +              * the RPC itself we won't get here.
> > +              */
> > +             result = PTR_ERR(rpc);
> > +             goto done;
> > +     }
> > +     result = rpc->error ? rpc->error : rpc->msgin.length;
>
> A trivial tips.
>
> result = rpc->error ?: rpc->msgin.length;

I wasn't aware that the shorter form existed. However, I find the
longer form a bit more obvious, so I'd prefer to leave that unless you
feel strongly that it should be abbreviated.

> > +
> > +     /* Collect result information. */
> > +     control.id = rpc->id;
> > +     control.completion_cookie = rpc->completion_cookie;
> > +     if (likely(rpc->msgin.length >= 0)) {
> > +             control.num_bpages = rpc->msgin.num_bpages;
> > +             memcpy(control.bpage_offsets, rpc->msgin.bpage_offsets,
> > +                    sizeof(control.bpage_offsets));
>
> A trivial tips.
>
> Although sizeof(control.bpage_offsets) and sizeof(rpc->msgin.bpage_offsets) are the same, but
> passing sizeof(rpc->msgin.bpage_offsets) would be more appropriate.

Done.

Thanks for the comments; lots of good catches there.

-John-




