Re: TCP Fast Retransmission Issue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

While experimenting with bbr protocol, I manipulated the network conditions by maintaining a high RTT for about one second before abruptly reducing it. Some packets sent during the high RTT phase experienced long delays in reaching the destination, while later packets, benefiting from the lower RTT, arrived earlier. This out-of-order arrival triggered the receiver to generate duplicate acknowledgments (dup ACKs). Due to the low RTT, these dup ACKs quickly reached the sender. Upon receiving three dup ACKs, the sender initiated a fast retransmission for an earlier packet that was not lost but was simply taking longer to arrive. Interestingly, despite the fast-retransmitted packet experienced a lower RTT, the original delayed packet still arrived first. When the receiver received this packet, it sent an ACK for the next packet in sequence. However, upon later receiving the fast-retransmitted packet, an issue arose in its logic for updating the acknowledgment number. As a result, even after the next expected packet was received, the acknowledgment number was not updated correctly. The receiver continued sending dup ACKs, ultimately forcing bbr into the retransmission timeout (RTO) phase.

I generated this issue in linux kernel version 5.15.0-117-generic with Ubuntu 20.04. I attempted to confirm whether the issue persists with the latest Linux kernel. However, I discovered that the behavior of bbr has changed in the most recent kernel version, where it now sends chunks of packets instead of sending them one by one over time. As a result, I was unable to reproduce the specific sequence of events that triggered the bug we identified. Consequently, I could not confirm whether the bug still exists in the latest kernel.

I believe that the issue (if still exists) will have to be resolved in the location net/ipv4/tcp_input.c or something like that. There are so many authors here that I do not know who to CC here. So, sending this email to you. Sorry if this is not the best way to report this issue.

Thanks
Shehab

________________________________________
From: Ahmed, Shehab Sarar
Sent: Saturday, February 1, 2025 1:01 PM
To: stable@xxxxxxxxxxxxxxx
Cc: regressions@xxxxxxxxxxxxxxx
Subject: TCP Fast Retransmission Issue

Hello,

While experimenting with bbr protocol, I manipulated the network conditions by maintaining a high RTT for about one second before abruptly reducing it. Some packets sent during the high RTT phase experienced long delays in reaching the destination, while later packets, benefiting from the lower RTT, arrived earlier. This out-of-order arrival triggered the receiver to generate duplicate acknowledgments (dup ACKs). Due to the low RTT, these dup ACKs quickly reached the sender. Upon receiving three dup ACKs, the sender initiated a fast retransmission for an earlier packet that was not lost but was simply taking longer to arrive. Interestingly, despite the fast-retransmitted packet experienced a lower RTT, the original delayed packet still arrived first. When the receiver received this packet, it sent an ACK for the next packet in sequence. However, upon later receiving the fast-retransmitted packet, an issue arose in its logic for updating the acknowledgment number. As a result, even after the next expected packet was received, the acknowledgment number was not updated correctly. The receiver continued sending dup ACKs, ultimately forcing bbr into the retransmission timeout (RTO) phase.

I generated this issue in linux kernel version 5.15.0-117-generic with Ubuntu 20.04. I attempted to confirm whether the issue persists with the latest Linux kernel. However, I discovered that the behavior of bbr has changed in the most recent kernel version, where it now sends chunks of packets instead of sending them one by one over time. As a result, I was unable to reproduce the specific sequence of events that triggered the bug we identified. Consequently, I could not confirm whether the bug still exists in the latest kernel.

I believe that the issue (if still exists) will have to be resolved in the location net/ipv4/tcp_input.c or something like that. There are so many authors here that I do not know who to CC here. So, sending this email to you. Sorry if this is not the best way to report this issue.

Thanks
Shehab





[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux