On Monday June 16, jmolina@xxxxxxxx wrote:
>
> During the grow process, this system slowly went unresponsive, and I
> was forced to reboot it after about 30 hours. At first I was not
> able to run any mdadm commands to see the status of the grow (about
> 30 minutes after starting), then I was not able to log in with a new
> shell, then after about 24 hours I was able to use a previously
> opened shell to see that tons of CRON jobs and other work had backed
> up, however during all of this time the system was still acting as
> an IP router doing NAT. Finally, after about 30 hours, the dhcpd
> daemon stopped giving out leases and then finally traffic stopped
> and I could not ping the host any longer (not a lease problem).

This is a bit of a worry. It sounds like the system was running out
of memory. That would suggest either that the reshape process was
leaking memory, or that it was somehow blocking writeout so that other
memory wasn't getting freed. However, I cannot measure it doing either
of these things.

If you can reproduce this, I'd love to see the contents of
   /proc/meminfo
   /proc/slabinfo
   /proc/slab_allocators
at 5-minute intervals. But I don't expect you'll want to try that
experiment :-)

NeilBrown
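
For anyone who does want to try the experiment, the periodic capture requested above could be scripted roughly as follows. This is only a sketch: the function names `snapshot` and `collect`, the output directory name, and the timestamp format are illustrative choices, not anything from the original message. It assumes the script runs as root, since /proc/slabinfo is often not readable otherwise, and it tolerates /proc/slab_allocators being absent, since that file may not exist depending on how the kernel was built.

```python
import time
from datetime import datetime
from pathlib import Path

# The three files asked for in the message above.
PROC_FILES = ("/proc/meminfo", "/proc/slabinfo", "/proc/slab_allocators")


def snapshot(out_dir, sources=PROC_FILES, stamp=None):
    """Copy each source file into out_dir; return the list of paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: <timestamp>-<file name>.
    stamp = stamp or datetime.now().strftime("%Y%m%d-%H%M%S")
    written = []
    for src in sources:
        src = Path(src)
        try:
            data = src.read_text()
        except OSError as exc:
            # Missing file or insufficient permissions: record that fact
            # instead of aborting the whole capture.
            data = f"unavailable: {exc}\n"
        dest = out_dir / f"{stamp}-{src.name}"
        dest.write_text(data)
        written.append(dest)
    return written


def collect(out_dir, interval=300, count=None):
    """Take a snapshot every `interval` seconds (300 s = 5 minutes).

    Runs until interrupted when count is None, otherwise stops after
    `count` snapshots.
    """
    taken = 0
    while count is None or taken < count:
        snapshot(out_dir)
        taken += 1
        if count is None or taken < count:
            time.sleep(interval)
```

Calling `collect("reshape-memlog")` from a shell opened before the grow starts would keep writing snapshots until interrupted. Since the report says new logins stopped working early on, it is probably wise to start the capture before the reshape begins, and if writeout really is being blocked, the snapshots themselves may stall, so writing them to a filesystem that is not on the array being reshaped seems the safer choice.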