On Wed, Mar 14, 2012 at 06:19:48PM +0800, Amos Kong wrote: > On 14/03/12 02:35, Michael Roth wrote: > >On Wed, Mar 07, 2012 at 06:48:03AM +0800, Amos Kong wrote: > >>Introduce tcp_client_start() by moving original code in > >>tcp_start_outgoing_migration(). > >> > >>Signed-off-by: Amos Kong<akong@xxxxxxxxxx> > >>--- > >> net.c | 41 +++++++++++++++++++++++++++++++++++++++++ > >> qemu_socket.h | 1 + > >> 2 files changed, 42 insertions(+), 0 deletions(-) > >> > >>diff --git a/net.c b/net.c > >>index e90ff23..9afb0d1 100644 > >>--- a/net.c > >>+++ b/net.c > >>@@ -127,6 +127,47 @@ int tcp_server_start(const char *str, int *fd) > >> return ret; > >> } > >> > >>+int tcp_client_start(const char *str, int *fd) > >>+{ > > ... > > Hi Michael, > > > >>+ *fd = qemu_socket(PF_INET, SOCK_STREAM, 0); > >>+ if (fd< 0) { > >>+ perror("socket"); > >>+ return -1; > >>+ } > >>+ socket_set_nonblock(*fd); > >>+ > >>+ for (;;) { > >>+ ret = connect(*fd, (struct sockaddr *)&saddr, sizeof(saddr)); > >>+ if (ret< 0) { > >>+ ret = -socket_error(); > >>+ if (ret == -EINPROGRESS) { > >>+ break; > > > >The previous implementation and your next patch seem to be expecting a break on > >-EWOULDBLOCK/-EAGAIN as well. Was the behavior changed on purpose? > > In original tcp_start_outgoing_migration(): > break: -EINPROGRES > cont : -EINTR or -EWOULDBLOCK > > In original net_socket_connect_init(): > break: -EINPROGRES or -EWOULDBLOCK > cont : -EINTR > > > http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_15.html > EWOULDBLOCK > socket has nonblocking mode set, and there are no pending > connections immediately available. > > So continue to re-connect if EWOULDBLOCK or EINTR returned by > socket_error() in tcp_client_start() > That seems to be for accept(), not connect(). And in the accept()/incoming case I don't think it's an issue to keep retrying. On the connect()/outgoing case I think we need to be careful because we can hang both the monitor and the guest indefinitely if there's an issue affecting outgoing connection attempts on the source-side. It's much safer to fail in this situation rather than loop indefinitely, and originally that's what the code did, albeit somewhat indirectly. That behavior is changed with your implementation: tcp_start_outgoing_migration() originally: ... socket_set_nonblock(s->fd); do { ret = connect(s->fd, (struct sockaddr *)&addr, sizeof(addr)); if (ret == -1) { ret = -socket_error(); } if (ret == -EINPROGRESS || ret == -EWOULDBLOCK) { qemu_set_fd_handler2(s->fd, NULL, NULL, tcp_wait_for_connect, s); return 0; } } while (ret == -EINTR); ... tcp_start_output_migration() with your changes: ... ret = tcp_client_start(host_port, &s->fd); if (ret == -EINPROGRESS || ret == -EWOULDBLOCK) { DPRINTF("connect in progress"); qemu_set_fd_handler2(s->fd, NULL, NULL, tcp_wait_for_connect, s); tcp_client_start(): static int tcp_client_connect(int fd, struct sockaddr_in *saddr) { int ret; do { ret = connect(fd, (struct sockaddr *)saddr, sizeof(*saddr)); if (ret == -1) { ret = -socket_error(); } } while (ret == -EINTR || ret == -EWOULDBLOCK); > > > I'm not > >sure what the proper handling is for -EAGAIN: whether a non-blocking connect() > >can eventually succeed or not. I suspect that it's not, but that previously we > >treated it synonymously with -EINPROGRESS, then eventually got an error via > >getsockopt() before failing the migration. If so, we're now changing the > >behavior to retry until successful, but given the man page entry I don't > >think that's a good idea since you might block indefinitely: > > > > EAGAIN No more free local ports or insufficient > > entries in the routing cache. For AF_INET > > see the description of > > /proc/sys/net/ipv4/ip_local_port_range > > ip(7) for information on how to increase > > the number of local ports. > > > We didn't process EAGAIN specially, you mean EINTR ? I was referring to the EWOULDBLOCK handling, but on linux at least, EAGAIN == EWOULDBLOCK. > > > > > >>+#ifdef _WIN32 > >>+ } else if (ret == -WSAEALREADY || ret == -WSAEINVAL) { > >>+ break; > >>+#endif > >>+ } else if (ret != -EINTR&& ret != -EWOULDBLOCK) { > >>+ perror("connect"); > >>+ closesocket(*fd); > >>+ return ret; > > -EAGAIN would go this path. When EAGAIN == EWOULDBLOCK, it would loop, and I'm not aware of any hosts where this won't be the case. BSD maybe? > > > >>+ } > >>+ } else { > >>+ break; > >>+ } > >>+ } > >>+ > >>+ return ret; > >>+} > >>+ > > -- > Amos. > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html