On 1/31/20 7:10 AM, Neal Cardwell wrote:
> On Fri, Jan 31, 2020 at 7:25 AM <sjpark@xxxxxxxxxx> wrote:
>>
>> From: SeongJae Park <sjpark@xxxxxxxxx>
>>
>> When closing a connection, the two acks that required to change closing
>> socket's status to FIN_WAIT_2 and then TIME_WAIT could be processed in
>> reverse order. This is possible in RSS disabled environments such as a
>> connection inside a host.
>>
>> For example, expected state transitions and required packets for the
>> disconnection will be similar to below flow.
>>
>>   00 (Process A)                     (Process B)
>>   01 ESTABLISHED                     ESTABLISHED
>>   02 close()
>>   03 FIN_WAIT_1
>>   04                 ---FIN-->
>>   05                                 CLOSE_WAIT
>>   06                 <--ACK---
>>   07 FIN_WAIT_2
>>   08                 <--FIN/ACK---
>>   09 TIME_WAIT
>>   10                 ---ACK-->
>>   11                                 LAST_ACK
>>   12 CLOSED                          CLOSED
>
> AFAICT this sequence is not quite what would happen, and that it would
> be different starting in line 8, and would unfold as follows:
>
>   08                                 close()
>   09                                 LAST_ACK
>   10                 <--FIN/ACK---
>   11 TIME_WAIT
>   12                 ---ACK-->
>   13 CLOSED                          CLOSED
>
>
>> The acks in lines 6 and 8 are the acks. If the line 8 packet is
>> processed before the line 6 packet, it will be just ignored as it is not
>> a expected packet,
>
> AFAICT that is where the bug starts.
>
> AFAICT, from first principles, when process A receives the FIN/ACK it
> should move to TIME_WAIT even if it has not received a preceding ACK.
> That's because ACKs are cumulative. So receiving a later cumulative
> ACK conveys all the information in the previous ACKs.
>
> Also, consider the de facto standard state transition diagram from
> "TCP/IP Illustrated, Volume 2: The Implementation", by Wright and
> Stevens, e.g.:
>
> https://courses.cs.washington.edu/courses/cse461/19sp/lectures/TCPIP_State_Transition_Diagram.pdf
>
> This first-principles analysis agrees with the Wright/Stevens diagram,
> which says that a connection in FIN_WAIT_1 that receives a FIN/ACK
> should move to TIME_WAIT.
>
> This seems like a faster and more robust solution than installing
> special timers.
>
> Thoughts?

This is orthogonal, I think.

However well we fix the other side, we should improve the active side.

Since we send a RST, sending the SYN a few ms after the RST seems way
better than waiting 1 second as if we had received no packet at all.

Receiving this ACK tells us something about networking health, so there
is no need to be very cautious about the next attempt.

Of course, if you have a fix for the passive side, that would be nice to
review!
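
For reference, the FIN_WAIT_1 handling Neal describes can be sketched as
a toy state machine. This is only an illustration of the textbook
transitions for the side that closes first, not the kernel code: the
names State and on_segment and the two boolean flags are hypothetical,
and everything else about a real segment (sequence numbers, windows,
timers) is ignored.

```python
from enum import Enum, auto


class State(Enum):
    FIN_WAIT_1 = auto()
    FIN_WAIT_2 = auto()
    CLOSING = auto()
    TIME_WAIT = auto()


def on_segment(state, fin, acks_our_fin):
    """Return the active closer's next state after one incoming segment.

    fin          -- the segment carries the peer's FIN
    acks_our_fin -- the segment's cumulative ACK covers our FIN
    (Hypothetical helper for illustration; not a kernel function.)
    """
    if state is State.FIN_WAIT_1:
        if fin and acks_our_fin:
            # FIN/ACK: the cumulative ACK already covers our FIN, so the
            # earlier bare ACK is not needed to make this transition.
            return State.TIME_WAIT
        if fin:
            return State.CLOSING      # simultaneous close
        if acks_our_fin:
            return State.FIN_WAIT_2
    elif state is State.FIN_WAIT_2:
        if fin:
            return State.TIME_WAIT
    elif state is State.CLOSING:
        if acks_our_fin:
            return State.TIME_WAIT
    # Anything else (e.g. a late bare ACK in TIME_WAIT) changes nothing.
    return state


# In-order delivery: bare ACK first, then FIN/ACK.
s = on_segment(State.FIN_WAIT_1, fin=False, acks_our_fin=True)
s = on_segment(s, fin=True, acks_our_fin=True)
in_order = s

# Reordered delivery: FIN/ACK first, then the late bare ACK.
s = on_segment(State.FIN_WAIT_1, fin=True, acks_our_fin=True)
s = on_segment(s, fin=False, acks_our_fin=True)
reordered = s
```

With these transitions both delivery orders leave process A in
TIME_WAIT, which is the point of the cumulative-ACK argument above.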