Re: [PATCH] vsock/virtio: Fix null-ptr-deref in vsock_stream_has_data

Michal Luczaj <mhal@xxxxxxx> · Thu, 19 Dec 2024 17:09:42 +0100

On 12/19/24 16:12, Stefano Garzarella wrote:
> On Thu, 19 Dec 2024 at 16:05, Michal Luczaj <mhal@xxxxxxx> wrote:
>>
>> On 12/19/24 15:48, Stefano Garzarella wrote:
>>> On Thu, 19 Dec 2024 at 15:36, Michal Luczaj <mhal@xxxxxxx> wrote:
>>>>
>>>> On 12/19/24 09:19, Stefano Garzarella wrote:
>>>>> ...
>>>>> I think the best thing though is to better understand how to handle
>>>>> deassign, rather than checking everywhere that it's not null, also
>>>>> because in some cases (like the one in virtio-vsock), it's also
>>>>> important that the transport is the same.
>>>>
>>>> My vote would be to apply your virtio_transport_recv_pkt() patch *and* make
>>>> it impossible-by-design to switch ->transport from non-NULL to NULL in
>>>> vsock_assign_transport().
>>>
>>> I don't know if that's enough, in this case the problem is that some
>>> response packets are intended for a socket, where the transport has
>>> changed. So whether it's null or assigned but different, it's still a
>>> problem we have to handle.
>>>
>>> So making it impossible for the transport to be null, but allowing it
>>> to be different (we can't prevent it from changing), doesn't solve the
>>> problem for us, it only shifts it.
>>
>> Got it. I assumed this issue would be solved by `vsk->transport !=
>> &t->transport` in the critical place(s).
>>
>> (Note that BPF doesn't care if transport has changed; BPF just expects to
>> have _a_ transport.)
>>
>>>> If I'm not mistaken, that would require rewriting vsock_assign_transport()
>>>> so that a new transport is assigned only once fully initialized, otherwise
>>>> keep the old one (still unhurt and functional) and return error. Because
>>>> failing connect() should not change anything under the hood, right?
>>>>
>>>
>>> Nope, connect should be able to change the transport.
>>>
>>> Because a user can do an initial connect() that requires a specific
>>> transport, this one fails maybe because there's no peer with that cid.
>>> Then the user can redo the connect() to a different cid that requires
>>> a different transport.
>>
>> But the initial connect() failing does not change anything under the hood
>> (transport should/could stay NULL).
> 
> Nope, isn't null, it's assigned to a transport, because for example it
> has to send a packet to connect to the remote CID and wait back for a
> response that for example says the CID doesn't exist.

Ahh, I think I get it. So the initial connect() passed
vsock_assign_transport() successfully and then failed deeper in
vsock_connect(), right? That's fine. Let the socket have a useless
transport (a valid pointer nevertheless). Sure, upcoming connect() can
assign a new (possibly useless just as well) transport, but there's no
reason to allow ->transport becoming NULL.

And a pre-connect socket (where ->transport==NULL) is not an issue, because
BPF won't let it in any sockmap, so vsock_bpf_recvmsg() won't be reachable.

Anywa, thanks for explaining,
Michal

PS. Or ignore the above and remove the socket from the sockmap at every
reconnect? Possible unhash abuse:

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 5cf8109f672a..8a65153ee186 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -483,6 +483,10 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
 		if (vsk->transport == new_transport)
 			return 0;
 
+		const struct proto *prot = READ_ONCE(sk->sk_prot);
+		if (prot->unhash)
+			prot->unhash(sk);
+
 		/* transport->release() must be called with sock lock acquired.
 		 * This path can only be taken during vsock_connect(), where we
 		 * have already held the sock lock. In the other cases, this
diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c
index 4aa6e74ec295..80deb4d70aea 100644
--- a/net/vmw_vsock/vsock_bpf.c
+++ b/net/vmw_vsock/vsock_bpf.c
@@ -119,6 +119,7 @@ static void vsock_bpf_rebuild_protos(struct proto *prot, const struct proto *bas
 	*prot        = *base;
 	prot->close  = sock_map_close;
 	prot->recvmsg = vsock_bpf_recvmsg;
+	prot->unhash = sock_map_unhash;
 	prot->sock_is_readable = sk_msg_is_readable;
 }

>> Then a successful re-connect assigns
>> the transport (NULL -> non-NULL). And it's all good because all I wanted to
>> avoid (because of BPF) was non-NULL -> NULL. Anyway, that's my possibly
>> shallow understanding :)