On 9/26/23 3:18 PM, Alexandra Winter wrote:
On 26.09.23 05:00, D. Wythe wrote:
You are right. The key point is how to ensure the valid of smc sock during the life time of clc sock, If so, READ_ONCE is good
enough. Unfortunately, I found that there are no such guarantee, so it's still a life-time problem.
Did you discover a scenario, where clc sock could live longer than smc sock?
Wouldn't that be a dangerous scenario in itself? I still have some hope that the lifetime of an smc socket is by design longer
than that of the corresponding tcp socket.
Hi Alexandra,
Yes there is. Considering scenario:
tcp_v4_rcv(skb)
/* req sock */
reqsk = _inet_lookup_skb(skb)
/* listen sock */
sk = reqsk(reqsk)->rsk_listener;
sock_hold(sk);
tcp_check_req(sk)
smc_release /* release
smc listen sock */
__smc_release
smc_close_active() /* smc_sk->sk_state = SMC_CLOSED; */
if
(smc_sk->sk_state == SMC_CLOSED)
smc_clcsock_release();
sock_release(clcsk); /* close clcsock */
sock_put(sk); /* might not the final refcnt */
sock_put(smc_sk) /* might be the final refcnt of smc_sock */
syn_recv_sock(sk...)
/* might be the final refcnt of tcp listen sock */
sock_put(sk);
Fortunately, this scenario only affects smc_syn_recv_sock and
smc_hs_congested, as other callbacks already have locks to protect smc,
which can guarantee that the sk_user_data is either NULL (set in
smc_close_active) or valid under the lock.
Considering the const, maybe
we need to do :
1. hold a refcnt of smc_sock for syn_recv_sock to keep smc sock valid during life time of clc sock
2. put the refcnt of smc_sock in sk_destruct in tcp_sock to release the very smc sock .
In that way, we can always make sure the valid of smc sock during the life time of clc sock. Then we can use READ_ONCE rather
than lock. What do you think ?
I am not sure I fully understand the details what you propose to do. And it is not only syn_recv_sock(), right?
You need to consider all relations between smc socks and tcp socks; fallback to tcp, initial creation, children of listen sockets, variants of shutdown, ... Preferrably a single simple mechanism covers all situations. Maybe there is such a mechanism already today?
(I don't think clcsock->sk->sk_user_data or sk_callback_lock provide this general coverage)
If we really have a gap, a general refcnt'ing on smc sock could be a solution, but needs to be designed carefully.
You are right , we need designed it with care, we will try the
referenced solutions internally first, and I will also send some RFCs so
that everyone can track the latest progress
and make it can be all agreed.
Many thanks to you and the team to help make smc more stable and robust.
Our pleasure 😁. The stability of smc is important to us too.
Best wishes,
D. Wythe