On 19 Mar 2014, at 21:11, Sethu Prasad <sethuprasad.in@xxxxxxxxx> wrote:
The master hung in a very strange way. I could ssh to it and see the dmesg output, but no other command worked. The TCP connections to the slaves were also still alive, so we can’t say that the slaves did not receive the data.
There was no failover procedure. The slaves stayed slaves.
No, after we rebooted the master, the slaves did not reconnect to it. Later I stopped replication on one of the slaves to preserve the state of its data. So the main question is: under which circumstances can slaves fail to reconnect to the master with an error saying that the master is behind, with fsync on and synchronous* on?
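To make "the master is behind" concrete, here is a minimal sketch of how the WAL positions on both sides could be compared. It assumes the positions are read by hand, for example with pg_current_xlog_location() on the master and pg_last_xlog_receive_location() on a slave (the 9.x function names). The helper names parse_lsn and standby_is_ahead and the sample positions are made up for illustration; they are not from our servers.

```python
def parse_lsn(text: str) -> int:
    """Convert a WAL location such as '16/B374D848' to a 64-bit integer.

    The part before the slash is the high 32 bits and the part after it
    is the low 32 bits, both written in hexadecimal.
    """
    hi, lo = text.strip().split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def standby_is_ahead(master_lsn: str, standby_lsn: str) -> bool:
    """Return True if the slave has received WAL beyond the master's position."""
    return parse_lsn(standby_lsn) > parse_lsn(master_lsn)


if __name__ == "__main__":
    # Hypothetical positions, for illustration only.
    master = "2/A0000000"
    standby = "2/A0000100"
    if standby_is_ahead(master, standby):
        gap = parse_lsn(standby) - parse_lsn(master)
        print(f"slave is ahead of master by {gap} bytes")
    else:
        print("slave is not ahead of master")
```

If the slave's position is larger than the master's after the reboot, the slave holds WAL that the master no longer has, which would be consistent with the error we see.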