Hello, I'm testing an environment on AWS right now and running into a strange issue.

In summary, my setup looks like this: 2 x c4.large instances (2 visible CPUs, 4 GB RAM) running a 500 GB (magnetic-backed) Replicate Gluster volume, so each instance has 100% of the data. The servers and the multiple clients are all running 3.7.2. Bricks are XFS, mounted with noatime. Server and client threads are set to 4 right now (they were 2 before, same issue).

Currently there are ~800,000 smallish files (JPEGs, 300 KB to 3 MB) on the volume, and one of the clients is constantly writing new files to it, on average about 2-3 per second. 100% of the time there is practically no load; I could run these on micro instances. But if I happen to reboot one of the servers, I run into some serious trouble.

Both boxes max out on CPU, the load average climbs into the 4-6 range, and my client can no longer write to the volume. About 18 minutes later, log entries finally show up in glustershd.log and a self-heal begins on the newly added files. The load calms down about 5-10 minutes after that, and other clients can read and write again. However, the original client that was trying to write the small files stays stuck; I can't even run an ls on a folder without it taking 30+ seconds. Ultimately, if I kill off everything that was trying to write and then unmount and remount the volume, I can get it functional again.

Do I just have too many small files? Would this not happen with gp2 (SSD) bricks? Is there a way to throttle whatever is eating all the CPU so that services can keep running off the fully functional brick?

I appreciate any insight. Thank you for your bandwidth.

Ray
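For reference, here is roughly how the setup above translates into commands. This is only a sketch: the volume name gv0, the hostnames, and the brick/device paths are placeholders rather than my real ones, and I'm assuming the "server and client threads" mentioned above are the client.event-threads / server.event-threads volume options introduced in 3.7.

    # Bricks are XFS mounted with noatime (device and mount point are placeholders)
    mount -t xfs -o noatime /dev/xvdf /data/brick1

    # Two-node Replicate volume, so each server holds a full copy of the data
    gluster volume create gv0 replica 2 server1:/data/brick1/gv0 server2:/data/brick1/gv0
    gluster volume start gv0

    # Event thread counts raised from the default of 2 to 4 on both sides
    gluster volume set gv0 client.event-threads 4
    gluster volume set gv0 server.event-threads 4

    # After rebooting one node, watch the self-heal backlog build up and drain
    gluster volume heal gv0 info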