Currently, if we have multiple connections and one of them goes down, we tear down the whole device. There's no reason to do this, since the other connections may be working fine. Deal with this by tracking the state of each connection: if we lose one, we mark it as dead and send all IO destined for that socket to one of the other healthy sockets. Any outstanding requests on the dead socket will time out and be resubmitted properly.
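As a rough illustration of the fallback described above, here is a minimal userspace sketch in C. It is not the driver code: the struct names, the `MAX_CONNS` limit, and the helpers `mark_dead()` and `pick_conn()` are all hypothetical, and the round-robin search for a healthy socket is just one possible policy. Timeout handling and resubmission of outstanding requests are left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CONNS 8 /* hypothetical upper bound, for illustration only */

/* Hypothetical per-connection state; not taken from the patch. */
struct conn {
	bool dead;
};

struct device {
	struct conn conns[MAX_CONNS];
	size_t num_conns; /* must not exceed MAX_CONNS */
};

/* Mark a single connection as dead instead of tearing down the device. */
static void mark_dead(struct device *dev, size_t index)
{
	if (index < dev->num_conns)
		dev->conns[index].dead = true;
}

/*
 * Pick a connection for IO that was destined for `wanted`.
 * If `wanted` is dead, fall back to the next healthy connection,
 * wrapping around. Returns -1 when no healthy connection is left.
 */
static int pick_conn(const struct device *dev, size_t wanted)
{
	if (dev->num_conns == 0)
		return -1;

	for (size_t i = 0; i < dev->num_conns; i++) {
		size_t idx = (wanted + i) % dev->num_conns;

		if (!dev->conns[idx].dead)
			return (int)idx;
	}
	return -1;
}
```

With this shape, the device only has to fail once `pick_conn()` returns -1, that is, once every connection has been marked dead.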
Hey Josef, are you trying to address link failures? Any reason not to leave this kind of work to DM? Note that the rest of the block drivers implement periodic reconnects and let DM handle multipathing. It took a long time to remove all the driver-specific multipathing in the stack.