Dear All-
The excessive CPU load problem seems to have been caused by the problematic upgrade and subsequent downgrade I reported in the following thread.

http://www.gluster.org/pipermail/gluster-users/2012-November/034643.html

When downgrading to 3.3.0 using yum, removal of the glusterfs-server-3.3.1-1 packages failed because of an RPM script error. On the CentOS-5 servers "yum remove glusterfs-3.3.1-1.el5" did the trick, but on CentOS-6 I had to forcibly remove the package with "rpm -e --noscripts glusterfs-server-3.3.1-1.el6.x86_64". I later discovered that the UUID value in /var/lib/glusterd/glusterd.info on the CentOS-6 servers had changed, and they were listing themselves in the output of "gluster peer status". I found the original UUIDs for the CentOS-6 servers by looking at the file names in /var/lib/glusterd/peers on other servers, like this:

[root@remus peers]# grep romulus /var/lib/glusterd/peers/*
/var/lib/glusterd/peers/cb21050d-05c2-42b3-8660-230954bab324:hostname1=romulus.nerc-essc.ac.uk

With glusterd stopped on all servers I changed the "UUID=" line in /var/lib/glusterd/glusterd.info back to the original value for each server. With glusterd running again on all the servers, everything seemed to go back to normal, except for a lot of self-heal activity on the servers that had been suffering from the excessive load problem. I presume a lot of xattr errors had been caused by those servers not talking to the others properly while the load was so high.

While looking back over what I did in order to write this message, I have just discovered another UUID-related problem. On some servers the files in /var/lib/glusterd/peers have the wrong UUID: the "UUID=" line in each of those files should match the file name, but on some servers they don't. I haven't noticed any adverse effects yet, except for not being able to do "gluster volume status" on any of the CentOS-6 servers that were messed up by the problematic downgrade to 3.3.0. I suppose I will have to stop glusterd on all the servers again and correct these errors by hand. I have 21 of them so it will take a while, but it could be worse I suppose. I would be interested to know if there is a quicker way to recover from a mess like this; any suggestions? In case it helps (or in case someone can spot a flaw in it), I have put a rough sketch of the script I have in mind at the very bottom of this message, below the quoted thread.

-Dan.

On 10/25/2012 04:34 PM, Dan Bretherton wrote:
> Dear All-
> I'm not sure this excessive server load has anything to do with the bricks having been full. I noticed the full bricks while I was investigating the excessive load, and assumed the two were related. However, despite there being plenty of room on all the bricks, the load on this particular pair of servers has been consistently between 60 and 80 all week, and this is causing serious problems for users who are getting repeated I/O errors. The servers are responding so slowly that GlusterFS isn't working properly, and CLI commands like "gluster volume stop" just time out when issued on any server. Restarting glusterd on all servers has no effect.
>
> Is there any way to limit the load imposed by GlusterFS on a server? I desperately need to reduce it to a level where GlusterFS can work properly and talk to the other servers without timing out.
>
> -Dan.
>
>
> On 10/22/2012 02:03 PM, Dan Bretherton wrote:
>> Dear All-
>> A replicated pair of servers in my GlusterFS 3.3.0 cluster have been experiencing extremely high load for the past few days after a replicated brick pair became 100% full.
>> The GlusterFS related load on one of the servers was fluctuating at around 60, and this high load would swap to the other server periodically. When I noticed the full bricks I quickly extended the volume by creating new bricks on another server, and manually moved some data off the full bricks to create space for write operations. The fix-layout operation seemed to start normally but the load then increased even further. The server with the high load (then up to about 80) became very slow to respond and I noticed a lot of errors in the VOLNAME-rebalance.log files like the following.
>>
>> [2012-10-22 00:35:52.070364] W [socket.c:1512:__socket_proto_state_machine] 0-atmos-client-10: reading from socket failed. Error (Transport endpoint is not connected), peer (192.171.166.92:24052)
>> [2012-10-22 00:35:52.070446] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xe7) [0x2b3fd905c547] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb2) [0x2b3fd905bf42] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b3fd905bbfe]))) 0-atmos-client-10: forced unwinding frame type(GlusterFS 3.1) op(INODELK(29)) called at 2012-10-22 00:35:45.454529 (xid=0x285951x)
>>
>> There have also been occasional errors like the following, referring to the pair of bricks that became 100% full.
>>
>> [2012-10-22 01:32:52.827044] W [client3_1-fops.c:5517:client3_1_readdir] 0-atmos-client-15: (00000000-0000-0000-0000-000000000000) remote_fd is -1. EBADFD
>> [2012-10-22 09:49:21.103066] W [client3_1-fops.c:5628:client3_1_readdirp] 0-atmos-client-14: (00000000-0000-0000-0000-000000000000) remote_fd is -1. EBADFD
>>
>> The log files from the bricks that were 100% full have a lot of these errors in, from the period after I freed up some space on them.
>>
>> [2012-10-22 00:40:56.246075] E [server.c:176:server_submit_reply] (-->/usr/lib64/libglusterfs.so.0(default_inodelk_cbk+0xa4) [0x361da23e84] (-->/usr/lib64/glusterfs/3.3.0/xlator/debug/io-stats.so(io_stats_inodelk_cbk+0xd8) [0x2aaaabd74d48] (-->/usr/lib64/glusterfs/3.3.0/xlator/protocol/server.so(server_inodelk_cbk+0x10b) [0x2aaaabf9742b]))) 0-: Reply submission failed
>> [2012-10-22 00:40:56.246117] I [server-helpers.c:629:server_connection_destroy] 0-atmos-server: destroyed connection of bdan10.nerc-essc.ac.uk-13609-2012/10/21-23:04:53:323865-atmos-client-15-0
>>
>> All these errors have only occurred on the replicated pair of servers that had suffered from 100% full bricks. I don't know if the errors are being caused by the high load (resulting in poor communication with other peers, for example) or if the high load is the result of replication and/or distribution errors. I have tried various things to bring the load down, including un-mounting the volume and stopping the fix-layout operation, but the only thing that works is stopping the volume. Obviously I can't do that for long because people need to use the data, but with the load as high as it is data access is very slow and users are experiencing a lot of temporary I/O errors. Bricks from several volumes are on those servers so everybody in the department is being affected by this problem. I thought at first that the load was being caused by self-heal operations fixing errors caused by write failures that occurred when the bricks were full, but it is glusterfs threads that are causing the high load, not glustershd.
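A note on that last point, for anyone finding this thread in the archives: one way to see whether the CPU time is going to the brick and client processes or to the self-heal daemon is to look at per-thread CPU usage, with something along these lines (the exact processes listed will obviously depend on your own bricks and volumes):

ps -eLo pcpu,pid,lwp,comm,args --sort=-pcpu | grep -i gluster | head -20

glustershd shows up as a "glusterfs" process with "glustershd" in its command-line arguments, so it is the args column rather than the process name that tells you whether the busy threads belong to the bricks (glusterfsd), the client mounts and NFS server (glusterfs), or the self-heal daemon.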
>>
>> Can anyone suggest a way to bring the load down so people can access the data properly again? Also, can I trust GlusterFS to eventually self-heal the errors causing the above error messages?
>>
>> Regards,
>> -Dan.
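P.S. As promised above, here is a rough sketch of the script I have in mind for checking the peer files, in case it is useful or in case someone can see a problem with it. It is untested; it assumes the key in the /var/lib/glusterd/peers files is "uuid=" or "UUID=" (please check your own files first), and the file name "check-peer-uuids.sh" and the server names further down are just placeholders. I would back up /var/lib/glusterd and stop glusterd everywhere before letting it change anything, which is why the sed line is commented out.

#!/bin/bash
# check-peer-uuids.sh - report (and optionally fix) peer files whose
# uuid line does not match the file name.  Run with glusterd stopped
# and /var/lib/glusterd backed up first.
PEERS_DIR=/var/lib/glusterd/peers

for f in "$PEERS_DIR"/*; do
    fname=$(basename "$f")
    current=$(grep -i '^uuid=' "$f" | cut -d= -f2)
    if [ "$current" != "$fname" ]; then
        echo "$HOSTNAME: $fname has uuid=$current"
        # Uncomment to rewrite the value, keeping whatever key the file already uses:
        # sed -i "/^[Uu][Uu][Ii][Dd]=/s/=.*/=$fname/" "$f"
    fi
done

To run it across all 21 servers in report-only mode first, something like this would do, assuming passwordless ssh between the servers (the host names are placeholders):

for h in server01 server02 server03; do ssh "$h" 'bash -s' < check-peer-uuids.sh; done

The UUIDs in glusterd.info I will carry on fixing by hand as before, since only a few servers are affected.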