Oh, it's our fault. Public_addr and cluster_addr use the same NIC (eth1). But we found that during recovery the heartbeat may time out because of heavy traffic. I *misunderstood* the meaning of heartbeat and used the address of another NIC (eth0) for heartbeat to avoid the timeout.