Hello Wei,
On 9/17/21 6:17 AM, Wei Wang wrote:
TCP_FASTOPEN socket option was added by:
commit 8336886f786fdacbc19b719c1f7ea91eb70706d4
TCP_FASTOPEN_CONNECT socket option was added by the following patch
series:
commit 065263f40f0972d5f1cd294bb0242bd5aa5f06b2
commit 25776aa943401662617437841b3d3ea4693ee98a
commit 19f6d3f3c8422d65b5e3d2162e30ef07c6e21ea2
commit 3979ad7e82dfe3fb94a51c3915e64ec64afa45c3
Add detailed description for these 2 options.
Also add descriptions for /proc entry tcp_fastopen and tcp_fastopen_key.
Signed-off-by: Wei Wang <weiwan@xxxxxxxxxx>
Reviewed-by: Yuchung Cheng <ycheng@xxxxxxxxxx>
Thanks for the patch (and the review, Yuchung)!
Please see some comments below.
Cheers,
Alex
---
Change in v2: corrected some format issues
man7/tcp.7 | 110 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 110 insertions(+)
diff --git a/man7/tcp.7 b/man7/tcp.7
index 0a7c61a37..5a6fa7f50 100644
--- a/man7/tcp.7
+++ b/man7/tcp.7
@@ -423,6 +423,28 @@ option.
.\" Since 2.4.0-test7
Enable RFC\ 2883 TCP Duplicate SACK support.
.TP
+.IR tcp_fastopen " (Bitmask; default: 0x1; since Linux 3.7)"
+Enables RFC\ 7413 Fast Open support.
+The flag is used as a bitmap with the following values:
+.RS
+.IP 0x1
+Enables client side Fast Open support
+.IP 0x2
+Enables server side Fast Open support
+.IP 0x4
+Allows client side to transmit data in SYN without Fast Open option
+.IP 0x200
+Allows server side to accept SYN data without Fast Open option
+.IP 0x400
+Enables Fast Open on all listeners without
+.B TCP_FASTOPEN
+socket option
+.RE
+.TP
+.IR tcp_fastopen_key " (since Linux 3.7)"
+Set server side RFC\ 7413 Fast Open key to generate Fast Open cookie
+when server side Fast Open support is enabled.
+.TP
.IR tcp_ecn " (Integer; default: see below; since Linux 2.4)"
.\" Since 2.4.0-test7
Enable RFC\ 3168 Explicit Congestion Notification.
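A side note on the tcp_fastopen bitmask in this hunk: below is a minimal C sketch of how the listed bits combine, assuming the values given in the patch text. The enum names and both helper functions are made up for illustration; they are not kernel or libc identifiers.

```c
#include <stdbool.h>
#include <stdio.h>

/* Bit values as listed in the proposed tcp_fastopen text.
 * The names are hypothetical, chosen only for this sketch. */
enum fo_bits {
    FO_CLIENT           = 0x1,   /* client side Fast Open */
    FO_SERVER           = 0x2,   /* server side Fast Open */
    FO_CLIENT_NO_COOKIE = 0x4,   /* client may send SYN data without option */
    FO_SERVER_NO_COOKIE = 0x200, /* server accepts SYN data without option */
    FO_ALL_LISTENERS    = 0x400  /* all listeners, no TCP_FASTOPEN sockopt */
};

/* Return true if every bit in 'bits' is set in 'mask'. */
bool tfo_has(unsigned int mask, unsigned int bits)
{
    return (mask & bits) == bits;
}

/* Read the current bitmask from procfs.
 * Returns 0 on success, -1 if the file is missing or unparsable. */
int tfo_read_sysctl(unsigned int *mask)
{
    FILE *f = fopen("/proc/sys/net/ipv4/tcp_fastopen", "r");
    int ok;

    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%u", mask) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

With the documented default of 0x1, tfo_has(mask, FO_CLIENT) is true and tfo_has(mask, FO_SERVER) is false, so a server would need at least 0x2 set (for example, a value of 0x3).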
@@ -1202,6 +1224,94 @@ Bound the size of the advertised window to this value.
The kernel imposes a minimum size of SOCK_MIN_RCVBUF/2.
This option should not be used in code intended to be
portable.
+.TP
+.BR TCP_FASTOPEN " (since Linux 3.6)"
+This option enables Fast Open (RFC\ 7413) on the listener socket.
+The value specifies the maximum length of pending SYNs
+(similar to the backlog argument in
+.BR listen (2)).
+Once enabled,
+the listener socket grants the TCP Fast Open cookie on incoming
+SYN with TCP Fast Open option.
+.IP
+More importantly it accepts the data in SYN with a valid Fast Open cookie
+and responds SYN-ACK acknowledging both the data and the SYN sequence.
+.BR accept (2)
+returns a socket that is available for read and write when the handshake
+has not completed yet.
+Thus the data exchange can commence before the handshake completes.
+This option requires enabling the server-side support on sysctl
+.IR net.ipv4.tcp_fastopen
+(see above).
+For TCP Fast Open client-side support,
+see
+.BR send (2)
+.B MSG_FASTOPEN
+or
+.B TCP_FASTOPEN_CONNECT
+below.
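To make the TCP_FASTOPEN description above concrete, here is a rough sketch of a listener that enables the option, assuming a loopback IPv4 socket. The function name tfo_listen() and its parameters are hypothetical; the fallback #define uses the value from the Linux UAPI headers in case the libc headers predate the option.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef TCP_FASTOPEN
#define TCP_FASTOPEN 23 /* value from the Linux UAPI headers */
#endif

/* Hypothetical helper: create a loopback listener with Fast Open enabled.
 * 'port' is in host byte order (0 lets the kernel pick one).
 * 'qlen' is the maximum number of pending Fast Open SYNs.
 * Returns the listening fd, or -1 with errno set. */
int tfo_listen(unsigned short port, int qlen)
{
    struct sockaddr_in addr;
    int saved;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) == -1 ||
        setsockopt(fd, IPPROTO_TCP, TCP_FASTOPEN, &qlen, sizeof(qlen)) == -1 ||
        listen(fd, SOMAXCONN) == -1) {
        saved = errno;  /* keep the original error across close() */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

As the patch text says, the server bit in net.ipv4.tcp_fastopen must also be set for the listener to accept data in the SYN; the sketch does not check that.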
+.TP
+.BR TCP_FASTOPEN_CONNECT " (since Linux 4.11)"
+This option enables an alternative way to perform Fast Open on the active
+side (client).
+When this option is enabled,
+.BR connect (2)
+would behave differently depending if a Fast Open cookie is available for
+the destination.
+.IP
+If a cookie is not available (i.e. first contact to the destination),
+.BR connect (2)
+behaves as usual by sending a SYN immediately,
+except the SYN would include an empty Fast Open cookie option to solicit a
+cookie.
+.IP
+If a cookie is available,
+.BR connect (2)
+would return 0 immediately but the SYN transmission is deferred.
+A subsequent
+.BR write (2)
+or
+.BR sendmsg (2)
+would trigger a SYN with data plus cookie in the Fast Open option.
+In other words,
+the actual connect operation is deferred until data is supplied.
+.IP
+.B Note:
+While this option is designed for convenience,
+enabling it does change the behaviors and might set new
+.I errnos
Typo? Did you mean "errno values"?
+of socket calls.
The above is not very clear to me.
+With cookie present,
+.BR write (2)
+/
Does the "/" mean "or"? If so, please write "or".
+.BR sendmsg (2)
+must be called right after
+.BR connect (2)
+in order to send out SYN+data to complete 3WHS and establish connection.
+Calling
+.BR read (2)
+right after
+.BR connect (2)
+without
+.BR write (2)
+will cause the blocking socket to be blocked forever.
+The application should use either
+.B TCP_FASTOPEN_CONNECT
+or
+.BR send (2)
This is not clear to me. So TCP_FASTOPEN_CONNECT can use write(2) and
sendmsg(2) (mentioned above), and TCP_FASTOPEN can only use send(2)? Or
what did you mean?
+with
+.B MSG_FASTOPEN ,
+instead of both on the same connection.
From "The application ...":
Is this related to the text just above it? It reads to me as a more
general statement that the two mechanisms shouldn't be mixed, so a new
paragraph may be more appropriate.
+.IP
+Here is the typical call flow with this new option:
+ s = socket();
+ setsockopt(s, IPPROTO_TCP, TCP_FASTOPEN_CONNECT, 1, ...);
+ connect(s);
+ write(s); // write() should always follow connect() in order to
+ // trigger SYN to go out
+ read(s)/write(s);
+ ...
+ close(s);
See man-pages(7):
Indentation of structure definitions, shell session logs, and
so on
When structure definitions, shell session logs, and so on
are included in running text, indent them by 4 spaces
(i.e., a block enclosed by .in +4n and .in), format them
using the .EX and EE macros, and surround them with suitable
paragraph markers (either .PP or .IP). For example:
.PP
.in +4n
.EX
int
main(int argc, char *argv[])
{
return 0;
}
.EE
.in
.PP
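For reference, the call flow in the patch could be fleshed out roughly as follows. This is only a sketch under assumptions: the helper name tfo_connect_and_send() is hypothetical, the socket is blocking IPv4, and a failing setsockopt(2) is treated as "fall back to a normal handshake", which is a design choice of this sketch rather than something the patch states.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef TCP_FASTOPEN_CONNECT
#define TCP_FASTOPEN_CONNECT 30 /* value from the Linux UAPI headers */
#endif

/* Hypothetical helper: connect to 'addr' and send the first request.
 * Returns the connected fd, or -1 on error. */
int tfo_connect_and_send(const struct sockaddr_in *addr,
                         const void *req, size_t len)
{
    int one = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;

    /* Best effort: if the option is refused, connect() below simply
     * performs an ordinary three-way handshake. */
    (void) setsockopt(fd, IPPROTO_TCP, TCP_FASTOPEN_CONNECT,
                      &one, sizeof(one));

    if (connect(fd, (const struct sockaddr *) addr, sizeof(*addr)) == -1)
        goto fail;

    /* write() right after connect(): with a cached cookie this is what
     * triggers the deferred SYN; do not read() before this point. */
    if (write(fd, req, len) != (ssize_t) len)
        goto fail;

    return fd; /* caller continues with read()/write() and close() */

fail:
    close(fd);
    return -1;
}
```

The helper bundles connect(2) and the first write(2) so that a caller cannot accidentally issue read(2) in between, which is the blocking case the patch text warns about.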
.SS Sockets API
TCP provides limited support for out-of-band data,
in the form of (a single byte of) urgent data.
--
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/