On 8/31/22 09:24, Baptiste Jonglez wrote:
> Hello,
>
> I'm trying to multiplex many simultaneous SSH connections through a single
> master connection, and I'm hitting a race condition while doing this.
> This is not a bug; I'm either hitting a limit in the design of OpenSSH or
> misusing it.
>
> The use-case is to use Ansible to configure many hosts simultaneously,
> while all connections need to go through a single "SSH bastion" via ProxyJump.
> For efficiency and to avoid hitting MaxStartups limits, I would like to
> use a control master for the connection to the bastion, via the following
> client configuration:
>
>     Host bastion.example.com
>       ControlMaster auto
>       ControlPath /dev/shm/ssh-%h
>       ControlPersist 30
>
>     Host !bastion.example.com *.example.com
>       ProxyJump bastion.example.com
>
> However, this does not work when making simultaneous connections: all SSH
> connections create a new, separate connection to the bastion.  Here is a
> simple way to reproduce:
>
>     $ for i in {1..3}; do ssh myhost.example.com "sleep 1" & done
>     ControlSocket /dev/shm/ssh-bastion.example.com already exists, disabling multiplexing
>     ControlSocket /dev/shm/ssh-bastion.example.com already exists, disabling multiplexing
>
> What happens is the following:
>
> 1) each SSH process tries to connect to the control socket and fails
>    (this is expected, the control socket is not yet bound)
>
> 2) each SSH process then creates a new SSH connection
>
> 3) once connected, each process tries to bind to the control socket
>
> 4a) one process successfully binds the control socket
> 4b) all other processes fail to bind the control socket (error message above)
>
> 5) in both cases, each process is now using its own separate SSH connection to the bastion
>
> The window for the race condition is between 1) and 4), so it's rather
> large: it includes the time to establish a new SSH connection.
>
> I believe that taking a lock between steps 1) and 4) could solve the issue:
>
> 1.1) each process tries to take an exclusive lock related to the control socket
> 1.1a) one process gets the lock and can continue creating a SSH connection
> 1.1b) all other processes wait on the lock; when the lock is released, they
>       go back to step 1) to connect to the control socket
>
> 4.1) once the control socket has been bound, the "lucky process" releases the lock
>
> Does it make sense?  Would the project accept a patch implementing this as
> an additional option?

Not sure if this is related, but I would like to have an option
to *only* use the control socket.

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
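
For illustration, the locking scheme in steps 1.1) to 4.1) can be
approximated outside of ssh with a small wrapper.  This is only a sketch
under assumptions: it relies on flock(1) from util-linux being installed,
and the names SSH, LOCK_DIR and ensure_master are hypothetical helpers,
not OpenSSH features.  It uses the existing ssh options -O check, -f, -N
and -M, and expects ControlPath for the bastion to come from the client
configuration quoted above.

```shell
#!/bin/sh
# Sketch only: serialize creation of the master connection with flock(1).
# SSH, LOCK_DIR and ensure_master are hypothetical names for illustration.
SSH=${SSH:-ssh}
LOCK_DIR=${LOCK_DIR:-/dev/shm}

ensure_master() {
    host=$1
    (
        # Step 1.1: block until we hold the exclusive lock on fd 9.
        flock 9 || exit 1
        # Step 1, now done under the lock: is a master already listening
        # on the control socket?
        if ! "$SSH" -O check "$host" 2>/dev/null; then
            # Steps 2-4: start a master in the background.  With -f, ssh
            # goes to the background only after authentication.  9>&-
            # keeps the master from inheriting fd 9 and holding the lock.
            "$SSH" -f -N -M "$host" 9>&-
        fi
    ) 9>"$LOCK_DIR/ssh-$host.lock"   # Step 4.1: lock released on exit.
}
```

A possible use with the reproducer above, again only as a sketch:

    $ for i in {1..3}; do (ensure_master bastion.example.com && ssh myhost.example.com "sleep 1") & done

Processes that lose the race wait on the lock, then find the master via
"ssh -O check" instead of opening their own connection.  How the master's
lifetime interacts with ControlPersist is not addressed here.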
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@xxxxxxxxxxx
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev