It was a large, very large directory, with somewhere near one million
small files, so the rsync took something like 3 to 4 hours. At some
point, all nodes' consoles displayed this:
gfs2_quotad:2498 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
and then a dump of some debugging info.
It means you're writing data faster than GFS can handle it, so it's
pausing on you. We see this semi-regularly here on our clusters.
I suspect Bob Peterson's test kernel or patches from ~2-3 weeks ago will
help you as these contain a bunch of fixes for slow writes.
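
For anyone else who hits this warning, here is a minimal sketch for checking
the hung-task timeout and looking for earlier reports. It assumes a Linux
kernel with the hung-task detector enabled and a user allowed to read dmesg;
the function name show_hung_timeout is only illustrative.

```shell
# sysctl file named in the kernel warning (Linux hung-task detector)
hung_timeout_file=/proc/sys/kernel/hung_task_timeout_secs

# Print the configured timeout, or say so if the file is not readable.
show_hung_timeout() {
    if [ -r "$1" ]; then
        echo "hung task timeout: $(cat "$1") seconds"
    else
        echo "hung task timeout not readable: $1"
    fi
}

show_hung_timeout "$hung_timeout_file"

# List hung-task reports still in the kernel ring buffer, if any.
dmesg 2>/dev/null | grep "blocked for more than" || true
```

Note that writing 0 to that file only silences the warning; it does not make
the blocked writes go any faster.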
Hi!
My servers are now up in our new colo facility. I updated gfs2-utils
(the version published on March 17th), among other packages, and I have
not seen that message since, as I did before the update.
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster