On 4 August 2020 22:47:44 GMT+03:00, Computerisms Corporation <bob@xxxxxxxxxxxxxxx> wrote:

>Hi Strahil, thanks for your response.
>
>>> I have compiled gluster 7.6 from sources on both servers.
>>
>> There is a 7.7 version which is fixing some stuff. Why do you have to compile it from source?
>
>Because I have often found with other stuff in the past that compiling from source makes a bunch of problems go away. Software generally works the way the developers expect it to if you use the sources, so they are better able to help if required. So now I generally compile most of my center-piece software and use packages for all the supporting stuff.

Hm... OK. I guess you can try 7.7 whenever it's possible.

>>> Servers are 6-core/3.4 GHz with 32 GB RAM, no swap, and SSDs and gigabit network connections. They are running Debian, and are being used as redundant web servers. There are some 3 million files on the Gluster storage, averaging 130 KB/file.
>>
>> This type of workload is called 'metadata-intensive'.
>
>Does this mean the metadata-cache group file would be a good one to enable? Will try.
>
>Waited 10 minutes, no change that I can see.
>
>> There are some recommendations for this type of workload:
>> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/small_file_performance_enhancements
>>
>> Keep an eye on the section that mentions dirty-ratio = 5 & dirty-background-ratio = 2.
>
>I have actually read that whole manual, and specifically that page, several times. And also this one:
>
>https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.1/html/administration_guide/small_file_performance_enhancements
>
>Perhaps I am not understanding it correctly. I tried these suggestions before and it got worse, not better. So I have been operating under the assumption that maybe these guidelines are not appropriate for newer versions.

Actually, the settings have not changed much, so they should work for you.

>But will try again, adjusting the dirty ratios.
>
>Load average went from around 15 to 35 in about 2-3 minutes, but 20 minutes later it is back down to 20. It may be having a minimal positive impact on CPU, though; I haven't seen the main glusterfs go over 200% since I changed this, and the brick processes are hovering just below 50% where they were consistently above 50% before. Might just be time of day with the system not as busy.
>
>After watching for 30 minutes, load average is fluctuating between 10 and 30, but CPU idle appears marginally better on average than it was.
>
>>> Interestingly, mostly because it is not something I have ever experienced before, software interrupts sit between 1 and 5 on each core, but the last core is usually sitting around 20. Have never encountered a high load average where the si number was ever significant. I have googled the crap out of that (as well as gluster performance in general); there are nearly limitless posts about what it is, but I have yet to see one thing that explains what to do about it.

Is this happening on all nodes? I got a similar situation caused by a bad NIC (si in top was way high), but the chance of a bad NIC on all servers is very low. You can still patch OS + firmware on your next maintenance.

>> There is an explanation about that in the link I provided above:
>>
>> Configuring a higher event threads value than the available processing units could again cause context switches on these threads. As a result, reducing the number deduced from the previous step to a number that is less than the available processing units is recommended.
>
>Okay, again, I have played with these numbers before and it did not pan out as expected. If I understand it correctly, I have 3 brick processes (glusterfsd), so the "deduced" number should be 3, and I should set it lower than that, so 2. But it also says "If a specific thread consumes more CPU cycles than needed, increasing the event thread count would enhance the performance of the Red Hat Storage Server," which is why I had it at 8.

Yeah, but you have only 6 cores and they are not dedicated to gluster only. I think that you need to test with lower values.
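For reference, these are ordinary per-volume options, so something along these lines should do it ('webvol' is only a placeholder here, use your own volume name, and 2 is just a starting point to test):

  gluster volume set webvol client.event-threads 2
  gluster volume set webvol server.event-threads 2

client.event-threads affects the fuse mounts on the web servers and server.event-threads affects the brick processes, so it may be worth testing them separately rather than moving both at once.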
>But will set it to 2 now. Load average is at 17 to start, waiting a while to see what happens.
>
>So 15 minutes later, load average is currently 12, but is fluctuating between 10 and 20; have seen no significant change in CPU usage or anything else in top.
>
>Now try also changing server.outstanding-rpc-limit to 256 and wait.
>
>15 minutes later: load has been above 30 but is currently back down to 12. No significant change in CPU. Try increasing to 512 and wait.
>
>15 minutes later, load average is 50. No significant difference in CPU. Software interrupts remain around where they were. wa from top remains about where it was. Not sure why load average is climbing so high. Changing rpc-limit to 128.
>
>Ugh. 10 minutes later, load average just popped over 100. Resetting rpc-limit.
>
>Now trying cluster.lookup-optimize on, lazy rebalancing (probably a bad idea on the live system, but how much worse can it get?). Ya, bad idea: 80 hours estimated to complete, load is over 50 and the server is crawling. Disabling rebalance and turning lookup-optimize off, for now.
>
>Right now the only suggested parameter I haven't played with is performance.io-thread-count, which I currently have at 64.

I think that as you have SSDs only, you might have some results by changing this one.
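Again only a sketch, with 'webvol' standing in for your volume name; since you are already at 64, testing a lower value first is probably the more informative direction:

  gluster volume get webvol performance.io-thread-count
  gluster volume set webvol performance.io-thread-count 32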
>Sigh. An hour later load average is 80 and climbing. Apache processes are numbering in the hundreds and I am constantly having to restart it. This brings load average down to 5, but as apache processes climb and are held open, load average gets up to over 100 again within 3-4 minutes, and the system starts going non-responsive. Rinse and repeat.
>
>So I followed all the recommendations; maybe the dirty settings had a small positive impact, but overall the system is most definitely worse for having made the changes.
>
>I have returned the configs back to how they were except the dirty settings and the metadata-cache group. Increased performance.cache-size to 16GB for now, because that is the one thing that seems to help when I "tune" (aka make worse) the system. Have had to restart apache a couple dozen times or more, but after another 30 minutes or so the system has pretty much settled back to how it was before I started. CPU is like I originally stated, all 6 cores maxed out most of the time; software interrupts still have all CPUs running around 5 with the last one consistently sitting around 20-25. Disk is busy but not usually maxed out. RAM is about half used. Network load peaks at about 1/3 capacity. Load average is between 10 and 20. Sites are responding, but sluggish.
>
>So am I not reading these recommendations and following the instructions correctly? Am I not waiting long enough after each implementation; should I be making 1 change per day instead of thinking 15 minutes should be enough for the system to catch up? I have read the full Red Hat documentation and the significant majority of the gluster docs, so maybe I am missing something else there? Should these settings have had a different effect than they did?
>
>For what it's worth, I am running ext4 as my underlying fs and I have read a few times that XFS might have been a better choice. But that is not a trivial experiment to make at this time with the system in production. It's one thing (and still a bad thing to be sure) to semi-bork the system for an hour or two while I play with configurations, but it would take a day or so offline to reformat and restore the data.

XFS should bring better performance, but if the issue is not in the FS, it won't make a change... What I/O scheduler are you using for the SSDs (you can check via 'cat /sys/block/sdX/queue/scheduler')?

>> As 'storage.fips-mode-rchecksum' is using sha256, you can try to disable it - which should use the less CPU-intensive md5. Yet, I have never played with that option...
>
>Done. No significant difference that I can see.
>
>> Check the RH page about the tunings and try different values for the event threads.
>
>In the past I have tried 2, 4, 8, 16, and 32. Playing with just those, I never noticed that any of them made any difference. Though I might have some different options now than I did then, so might try these again throughout the day...

Are you talking about server or client event threads (or both)?

>Thanks again for your time Strahil, if you have any more thoughts would love to hear them.

Can you check if you use 'noatime' for the bricks? It won't bring any effect on the CPU side, but it might help with the I/O.

I see that your indicator for high load is loadavg, but have you actually checked how many processes are in 'R' or 'D' state? Some monitoring checks can raise loadavg artificially.

Also, are you using software mirroring (either mdadm or striped/mirrored LVs)?

>> Best Regards,
>> Strahil Nikolov

________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users