Re: [RFC PATCH net-next 0/5] net: In-kernel QUIC implementation with Userspace handshake

On Fri, Apr 19, 2024 at 10:07 AM Stefan Metzmacher <metze@xxxxxxxxx> wrote:
>
> Hi Xin Long,
>
> >>>>>> first many thanks for working on this topic!
> >>>>>>
> >>>>> Hi, Stefan
> >>>>>
> >>>>> Thanks for the comment!
> >>>>>
> >>>>>>> Usage
> >>>>>>> =====
> >>>>>>>
> >>>>>>> This implementation supports a mapping of QUIC into sockets APIs. Similar
> >>>>>>> to TCP and SCTP, a typical Server and Client use the following system call
> >>>>>>> sequence to communicate:
> >>>>>>>
> >>>>>>>            Client                    Server
> >>>>>>>         ------------------------------------------------------------------
> >>>>>>>         sockfd = socket(IPPROTO_QUIC)      listenfd = socket(IPPROTO_QUIC)
> >>>>>>>         bind(sockfd)                       bind(listenfd)
> >>>>>>>                                            listen(listenfd)
> >>>>>>>         connect(sockfd)
> >>>>>>>         quic_client_handshake(sockfd)
> >>>>>>>                                            sockfd = accept(listenfd)
> >>>>>>>                                            quic_server_handshake(sockfd, cert)
> >>>>>>>
> >>>>>>>         sendmsg(sockfd)                    recvmsg(sockfd)
> >>>>>>>         close(sockfd)                      close(sockfd)
> >>>>>>>                                            close(listenfd)
> >>>>>>>
> >>>>>>> Please note that quic_client_handshake() and quic_server_handshake() functions
> >>>>>>> are currently sourced from libquic in the github lxin/quic repository, and might
> >>>>>>> be integrated into ktls-utils in the future. These functions are responsible for
> >>>>>>> receiving and processing the raw TLS handshake messages until the completion of
> >>>>>>> the handshake process.
> >>>>>>
> >>>>>> I see a problem with this design for the server, as one reason to
> >>>>>> have SMB over QUIC is to use udp port 443 in order to get through
> >>>>>> firewalls. As QUIC has the concept of ALPN it should be possible
> >>>>>> to let a consumer listen only on a specific ALPN, so that the smb server
> >>>>>> and web server on "h3" could both accept connections.
> >>>>> We do provide a sockopt to set ALPN before bind or handshaking:
> >>>>>
> >>>>>      https://github.com/lxin/quic/wiki/man#quic_sockopt_alpn
> >>>>>
> >>>>> But it's used more to verify that the ALPN set on the server
> >>>>> matches the one received from the client, rather than to find
> >>>>> the correct server.
> >>>>
> >>>> Ah, ok.
> >>> Just note that, with a small change in the current libquic, it still
> >>> allows users to use ALPN to find the correct function or thread in
> >>> the *same* process; usage would look like:
> >>>
> >>> listenfd = socket(IPPROTO_QUIC);
> >>> /* match all during handshake with wildcard ALPN */
> >>> setsockopt(listenfd, QUIC_SOCKOPT_ALPN, "*");
> >>> bind(listenfd)
> >>> listen(listenfd)
> >>>
> >>> while (1) {
> >>>     sockfd = accept(listenfd);
> >>>     /* the alpn from client will be set to sockfd during handshake */
> >>>     quic_server_handshake(sockfd, cert);
> >>>
> >>>     getsockopt(sockfd, QUIC_SOCKOPT_ALPN, alpn);
> >>
> >> Would quic_server_handshake() call setsockopt()?
> > Yes, I just made a small change in the userspace libquic:
> >
> >    https://github.com/lxin/quic/commit/9c75bd42769a8cbc1652e2f4c8d77780f23afde6
> >
> > So you can set up multiple ALPNs on the listen sock:
> >
> >    setsockopt(listenfd, QUIC_SOCKOPT_ALPN, "smbd, h3, ksmbd");
> >
> > Then during handshake, the matched ALPN from client will be set into
> > the accept socket, then users can get it later after handshake.
> >
> > Note that userspace libquic is a very light lib (a couple of hundred lines
> > of code), you can add more TLS related support without touching Kernel code,
> > including the SNI support you mentioned.
> >
> >>
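> >>>     /* C cannot switch on strings, so compare explicitly */
> >>>     if (!strcmp(alpn, "smbd"))
> >>>         smbd_thread(sockfd);
> >>>     else if (!strcmp(alpn, "h3"))
> >>>         h3_thread(sockfd);
> >>>     else if (!strcmp(alpn, "ksmbd"))
> >>>         ksmbd_thread(sockfd);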
> >>> }
> >>
> >> Ok, but that would mean all application need to be aware of each other,
> >> but it would be possible and socket fds could be passed to other
> >> processes.
> > It doesn't sound common to me, but yes, I think Unix Domain Sockets
> > can pass it to another process.
>
> I think it will be extremely common to have multiple services
> based on udp port 443.
>
> People will expect to find web services, smb and maybe more
> behind the same DNS hostname. And multiple DNS hostnames pointing
> to the same ip address is also very likely.
>
> With plain tcp/udp it's also possible to have independent sockets
> per port. There's no single userspace daemon that listens on
> 'tcp' and will dispatch into different processes based on the port.
>
> And with QUIC the port space is the ALPN and/or SNI
> combination.
>
> And I think this should be addressed before this becomes an
> unchangeable kernel ABI, written in stone.
>
> >>>>> So you expect (k)smbd server and web server both to listen on UDP
> >>>>> port 443 on the same host, and which APP server accepts the request
> >>>>> from a client depends on ALPN, right?
> >>>>
> >>>> yes.
> >>> Got you. This can be done by also moving TLS 1.3 message exchange to
> >>> kernel where we can get the ALPN before looking up the listening socket.
> >>> However, In-kernel TLS 1.3 Handshake had been NACKed by both kernel
> >>> netdev maintainers and userland ssl lib developers with good reasons.
> >>>
> >>>>
> >>>>> Currently, in Kernel, this implementation doesn't process any raw TLS
> >>>>> MSG/EXTs but deliver them to userspace after decryption, and the accept
> >>>>> socket is created before processing handshake.
> >>>>>
> >>>>> I'm actually curious how userland QUIC handles this, considering
> >>>>> that the UDP sockets('listening' on the same IP:PORT) are used in
> >>>>> two different servers' processes. I think socket lookup with ALPN
> >>>>> has to be done in Kernel Space. Do you know any userland QUIC
> >>>>> implementation for this?
> >>>>
> >>>> I don't know, but I guess QUIC is only used for http so
> >>>> far and maybe dns, but that seems to use port 853.
> >>>>
> >>>> So there's no strict need for it and the web server
> >>>> would handle all relevant ALPNs.
> >>> Honestly, I don't think any userland QUIC can use ALPN to look up
> >>> different sockets used by different servers/processes, as such a thing
> >>> can only be done in Kernel Space.
> >>>
> >>>>
> >>>>>>
> >>>>>> So the server application should have a way to specify the desired
> >>>>>> ALPN before or during the bind() call. I'm not sure if the
> >>>>>> ALPN is available in cleartext before any crypto is needed,
> >>>>>> so if the ALPN is encrypted it might be needed to also register
> >>>>>> a server certificate and key together with the ALPN.
> >>>>>> Because multiple application may not want to share the same key.
> >>>>> On send side, ALPN extension is in raw TLS messages created in userspace
> >>>>> and passed into the kernel and encoded into QUIC crypto frame and then
> >>>>> *encrypted* before sending out.
> >>>>
> >>>> Ok.
> >>>>
> >>>>> On recv side, after decryption, the raw TLS messages are decoded from
> >>>>> the QUIC crypto frame and then delivered to userspace, so userspace
> >>>>> handles certificate validation and also sees the cleartext ALPN.
> >>>>>
> >>>>> Let me know if I don't make it clear.
> >>>>
> >>>> But will the first "new" QUIC pdu trigger the accept() to
> >>>> return, and will userspace (or the kernel helper function) do
> >>>> all crypto? Or does the first decryption happen in kernel (before accept returns)?
> >>> Good question!
> >>>
> >>> The first "new" QUIC pdu causes a 'request sock' (containing the
> >>> 4-tuple and connection IDs only) to be created and queued on the reqsk
> >>> list of the listen sock (if the validate_peer_address param is not set),
> >>> and this pdu is enqueued in the inq->backlog_list of the listen sock.
> >>>
> >>> When accept() is called, in Kernel, it dequeues the "request sock" from the
> >>> reqsk list of the listen sock, and creates the accept socket based on this
> >>> reqsk. Then it processes the pdu for this new accept socket from the
> >>> inq->backlog_list of the listen sock, including *decrypting* the QUIC packet
> >>> and decoding the CRYPTO frame, then delivers the raw/cleartext TLS message to
> >>> the Userspace libquic.
> >>
> >> Ok, since the kernel already decrypts, it could also
> >> find the ALPN. That doesn't mean it should do the full
> >> handshake, just parse enough to find the ALPN.
> > Correct, in-kernel QUIC should only do the QUIC related things,
> > and all TLS handshake msgs must be handled in Userspace.
> > This won't cause "layering violation", as Nick Banks said.
>
> But I think it's unavoidable for the ALPN and SNI fields on
> the server side. As every service tries to use udp port 443
> and somehow that needs to be shared if multiple services want to
> use it.
>
> I guess on the acceptor side we would need to somehow detach low level
> udp struct sock from the logical listen struct sock.
>
> And quic_do_listen_rcv() would need to find the correct logical listening
> socket and call quic_request_sock_enqueue() on the logical socket
> not the low-level udp socket. The same for all stuff happening after
> quic_request_sock_enqueue() at the end of quic_do_listen_rcv.
>
The implementation already allows one low-level UDP sock to serve multiple
QUIC socks.

Currently, if your 3 QUIC applications listen on the same address:port
with the SO_REUSEPORT socket option set, an incoming connection is handed
to one of them, effectively at random, based on hash(client_addr+port) via
reuseport_select_sock() in quic_sock_lookup().

It should be easy to do a further match on ALPN among these 3 QUIC
socks that listen on the same address:port to get the right QUIC sock,
instead of choosing one at random.
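
To make that concrete, here is a rough userspace-style sketch of such a
further match. It is not the actual quic_sock_lookup() code: struct listener,
alpn_in_list() and select_listener() are hypothetical names used only for
illustration, and the comma-separated ALPN list format is assumed from the
setsockopt() example earlier in this thread. When the ALPN is unknown or no
listener matches, it falls back to the hash-based choice:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a listening QUIC sock; not the kernel struct. */
struct listener {
	const char *alpns;	/* comma-separated list, e.g. "smbd, ksmbd" */
	int id;
};

/* Return 1 if 'alpn' is exactly one of the tokens in 'list', else 0. */
static int alpn_in_list(const char *list, const char *alpn)
{
	size_t want = strlen(alpn);

	if (!want)
		return 0;
	while (*list) {
		size_t len;

		/* skip separators before the next token */
		while (*list == ' ' || *list == ',')
			list++;
		len = strcspn(list, ", ");
		if (len == want && !memcmp(list, alpn, len))
			return 1;
		list += len;
	}
	return 0;
}

/*
 * Pick one listener out of a group bound to the same address:port.
 * Prefer the first listener whose ALPN list contains 'alpn'; if 'alpn'
 * is NULL or nothing matches, fall back to the hash-based choice.
 */
static struct listener *select_listener(struct listener *grp, size_t n,
					const char *alpn, unsigned int hash)
{
	size_t i;

	if (!n)
		return NULL;
	if (alpn) {
		for (i = 0; i < n; i++) {
			if (alpn_in_list(grp[i].alpns, alpn))
				return &grp[i];
		}
	}
	return &grp[hash % n];
}
```

Whether a mismatch should fall back to the hash choice or refuse the
connection is a policy question this sketch does not try to answer.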

The problem is that this requires parsing the TLS Client_Hello message in
quic_sock_lookup() to get the ALPN, which is not a proper thing to do in
the kernel and might be rejected by the networking maintainers. I need to
check with them.
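
For reference, the parsing in question is roughly the following. This is
only a sketch in plain C, not kernel code; struct cur and the helper names
are made up for illustration. It assumes the whole ClientHello handshake
message has already been reassembled from the CRYPTO frames into one buffer,
which is not guaranteed in practice since a ClientHello can span several
packets. It follows the ClientHello layout from RFC 8446 and the ALPN
extension (type 16) from RFC 7301, and returns only the first protocol name
the client offers:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TLS_HS_CLIENT_HELLO	1
#define TLS_EXT_ALPN		16

/* Bounds-checked read cursor over a byte buffer (illustrative helper). */
struct cur {
	const uint8_t *p;
	size_t len;
};

/* Read an n-byte big-endian integer (n <= 3); -1 if the input is short. */
static int get_uint(struct cur *c, int n, uint32_t *v)
{
	if (c->len < (size_t)n)
		return -1;
	*v = 0;
	while (n--) {
		*v = (*v << 8) | *c->p++;
		c->len--;
	}
	return 0;
}

/* Split off a vector whose length prefix is 'lenbytes' wide. */
static int get_vec(struct cur *c, int lenbytes, struct cur *out)
{
	uint32_t n;

	if (get_uint(c, lenbytes, &n) || c->len < n)
		return -1;
	out->p = c->p;
	out->len = n;
	c->p += n;
	c->len -= n;
	return 0;
}

static int skip(struct cur *c, size_t n)
{
	if (c->len < n)
		return -1;
	c->p += n;
	c->len -= n;
	return 0;
}

/*
 * Copy the first ALPN protocol name of a ClientHello into 'out' as a
 * NUL-terminated string. Returns 0 on success, -1 if the message is
 * malformed, is not a ClientHello, or carries no ALPN extension.
 */
static int client_hello_first_alpn(const uint8_t *msg, size_t len,
				   char *out, size_t outlen)
{
	struct cur c = { msg, len }, body, v, exts, ext, list, name;
	uint32_t type;

	if (get_uint(&c, 1, &type) || type != TLS_HS_CLIENT_HELLO)
		return -1;
	if (get_vec(&c, 3, &body))		/* uint24 handshake length */
		return -1;
	if (skip(&body, 2 + 32))		/* legacy_version + random */
		return -1;
	if (get_vec(&body, 1, &v) ||		/* legacy_session_id */
	    get_vec(&body, 2, &v) ||		/* cipher_suites */
	    get_vec(&body, 1, &v) ||		/* legacy_compression_methods */
	    get_vec(&body, 2, &exts))		/* extensions */
		return -1;

	while (exts.len) {
		if (get_uint(&exts, 2, &type) || get_vec(&exts, 2, &ext))
			return -1;
		if (type != TLS_EXT_ALPN)
			continue;
		/* ProtocolNameList: uint16 length, then 1-byte-prefixed names */
		if (get_vec(&ext, 2, &list) || get_vec(&list, 1, &name))
			return -1;
		if (!name.len || name.len >= outlen)
			return -1;
		memcpy(out, name.p, name.len);
		out[name.len] = '\0';
		return 0;
	}
	return -1;
}
```

Even in this reduced form it is a fair amount of TLS-specific knowledge,
which is the part that may not be welcome in the kernel.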

Would you be able to work around this by using Unix Domain Sockets to pass
the sockfd to another process?
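
A minimal sketch of that workaround, assuming a dispatcher process that has
already done accept() and the handshake and is connected to the target
service over an AF_UNIX socket. send_fd() and recv_fd() are illustrative
helper names, not part of libquic; the mechanism underneath is the standard
SCM_RIGHTS ancillary message:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send descriptor 'fd' over the connected AF_UNIX socket 'chan'. */
static int send_fd(int chan, int fd)
{
	char byte = 0;	/* at least one byte of real data must be sent */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from 'chan'; returns the new fd or -1. */
static int recv_fd(int chan)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(chan, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The dummy data byte could just as well carry the negotiated ALPN, so the
receiving process would not need a getsockopt() to learn which protocol it
was handed.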

(Note that we're assuming all 3 of your applications are using in-kernel QUIC)

> >> But I don't yet understand how the kernel gets the key to
> >> do the initial decryption, I'd assume some call before listen()
> >> needs to tell the kernel about the keys.
> > For initial decryption, the keys can be derived from the initial packet.
> > Basically, it only needs the dst_connection_id from the client initial
> > packet. See:
> >
> >    https://datatracker.ietf.org/doc/html/rfc9001#name-initial-secrets
> >
> > so we don't need to set up anything in the kernel for the initial keys.
>
> I got it thanks!
>
> metze
>




