Hi all,

In the latest CVS cluster snapshot, the example agent.c has been amended to
provide a "manual" means of testing server failover. Now, when the agent
receives a server connection request from a device mapper target, it attempts
to establish a connection, and if that fails it leaves the client on a list
of waiting clients. A little utility, sendagent, was written to send the name
and port number of a new snapshot server to the agent over the same local
socket that the device mapper targets use. For each of the waiting clients,
the agent then attempts to connect to the new server and, if successful,
passes the connection to the client, which resumes processing IO requests.

Here is an example test scenario:

   # Run a standard test that starts a csnap server, creates a snapshot
   # device, and runs some IO on it.
   # The test assumes devices /dev/test-origin and /dev/test-snapstore;
   # they can be symlinks to partitions, devices or files.
   # Port 8080 may have to be changed to something else if it is in use
   # on your test machine.
   make test

   # Manual failure. IO on the virtual device will hang.
   killall csnap-server

   # Tell the agent to attempt reconnection; this fails (no server)
   ./sendagent @testdev-control localhost:8080

   # Start a new snapshot server
   ./csnap-server /dev/test-origin /dev/test-snapstore 9090

   # Tell the agent about it. IO on the virtual device resumes.
   ./sendagent @testdev-control localhost:9090

   # Check it by writing a pattern of 77's to the device
   ./devspam /dev/mapper/testdev write 1 77

All the bits and pieces are now in place for running the cluster snapshot on
a cluster, except:

  1) There is some automagic resource instantiation missing, as we have
     discussed.

  2) The server lacks an interface with cluster membership, which it
     requires for the optional 3-message style of snapshot client
     interface.

Since Ben is looking at resource instantiation issues, I'll look at cluster
membership next.

Regards,

Daniel
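
For illustration only, the waiting-list behaviour described above could be sketched roughly as follows. This is not the code from agent.c: every name here (struct client, struct agent, agent_request, agent_new_server, fake_connect, live_port) is hypothetical, and the real socket connect is replaced by a stand-in that succeeds only for one "live" port, so the logic can be followed without a running server.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; none of these names are taken from agent.c. */
struct client {
    int id;               /* stand-in for the client's local socket */
    int server_fd;        /* server connection, or -1 while waiting */
    struct client *next;  /* link in the agent's waiting list */
};

struct agent {
    struct client *waiting; /* clients whose connection attempt failed */
};

typedef int (*connect_fn)(const char *host, int port);

/* Stand-in for a real connect: succeeds only for the "live" port. */
static int live_port = -1;

static int fake_connect(const char *host, int port)
{
    (void)host;
    return port == live_port ? 3 : -1; /* 3 is an arbitrary fake fd */
}

/*
 * A client asks for a server connection. On failure the client is
 * left on the waiting list. Returns 1 if connected, 0 if queued.
 */
static int agent_request(struct agent *agent, struct client *client,
                         connect_fn try_connect, const char *host, int port)
{
    assert(agent && client && try_connect);
    client->server_fd = try_connect(host, port);
    if (client->server_fd >= 0)
        return 1;
    client->next = agent->waiting;
    agent->waiting = client;
    return 0;
}

/*
 * A new server address arrives (the step sendagent triggers). Retry
 * each waiting client; those that connect are unlinked from the list.
 * Returns the number of clients that resumed.
 */
static int agent_new_server(struct agent *agent, connect_fn try_connect,
                            const char *host, int port)
{
    struct client **link = &agent->waiting;
    int resumed = 0;

    assert(agent && try_connect);
    while (*link) {
        struct client *client = *link;
        int fd = try_connect(host, port);
        if (fd < 0) {
            link = &client->next; /* still waiting, keep on the list */
            continue;
        }
        client->server_fd = fd;   /* hand the connection to the client */
        *link = client->next;
        client->next = NULL;
        resumed++;
    }
    return resumed;
}
```

Under these assumptions, the test scenario maps onto the sketch like this: with no live port, agent_request() queues each client; agent_new_server() for port 8080 resumes nobody; once live_port is set to 9090, agent_new_server() for port 9090 empties the waiting list and every client holds a connection again.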