On 5 August 2020 at 4:53:34 GMT+03:00, Computerisms Corporation <bob@xxxxxxxxxxxxxxx> wrote:
>Hi Strahil,
>
>thanks again for sticking with me on this.
>
>> Hm... OK. I guess you can try 7.7 whenever it's possible.
>
>Acknowledged.
>
>>> Perhaps I am not understanding it correctly. I tried these suggestions
>>> before and it got worse, not better. So I have been operating under the
>>> assumption that maybe these guidelines are not appropriate for newer
>>> versions.
>>
>> Actually, the settings have not changed much, so they should work for you.
>
>Okay, then maybe I am doing something incorrectly, or not understanding
>some fundamental piece of things that I should be.

To be honest, the documentation seems pretty useless to me.

>>>>> Interestingly, mostly because it is not something I have ever
>>>>> experienced before, software interrupts sit between 1 and 5 on each
>>>>> core, but the last core is usually sitting around 20. I have never
>>>>> encountered a high load average where the si number was ever
>>>>> significant. I have googled the crap out of that (as well as gluster
>>>>> performance in general); there are nearly limitless posts about what
>>>>> it is, but I have yet to see one that explains what to do about it.
>>
>> Is this happening on all nodes?
>> I ran into a similar situation caused by a bad NIC (si in top was way
>> high), but the chance of a bad NIC on all servers is very low.
>> You can still patch the OS + firmware on your next maintenance.
>
>Yes, but it's not to the same extreme. The other node is currently not
>actually serving anything to the internet, so right now its only
>function is replicated gluster and databases. On the 2nd node there is
>also one core, the first one in this case as opposed to the last one on
>the main node, but it sits between 10 and 15 instead of 20 and 25, and
>the remaining cores will be between 0 and 2 instead of 1 and 5.
>I have no evidence of any bad hardware, and these servers were both
>commissioned only within the last couple of months. But I will still
>poke around on this path.

It could also be bad firmware. If you get the opportunity, flash the firmware and bring the OS fully up to date.

>>> more number of CPU cycles than needed, increasing the event thread
>>> count would enhance the performance of the Red Hat Storage Server."
>>> which is why I had it at 8.
>>
>> Yeah, but you have only 6 cores and they are not dedicated to gluster
>> alone. I think you need to test with lower values.
>
>Okay, I will change these values a few times over the next couple of
>hours and see what happens.
>
>>> right now the only suggested parameter I haven't played with is
>>> performance.io-thread-count, which I currently have at 64.
>>
>> I think that as you have SSDs only, you might see some results by
>> changing this one.
>
>Okay, will also modify this incrementally. Do you think it can go
>higher? I think I got this number from a thread on this list, but I am
>not really sure what would be a reasonable value for my system.

I guess you can try increasing it a little and see how it goes.
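If it helps, the relevant commands look something like this (MYVOL is just a placeholder for your volume name, and the event-thread values below are only a starting point for testing, not a recommendation):

# lower the epoll worker threads closer to the core count, on both the client and the brick side
gluster volume set MYVOL client.event-threads 4
gluster volume set MYVOL server.event-threads 4
# show what a volume option is currently set to before changing it
gluster volume get MYVOL performance.io-thread-count

As far as I remember these take effect without restarting the volume, but give each change some time under real load before judging it.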
>>> For what it's worth, I am running ext4 as my underlying fs and I have
>>> read a few times that XFS might have been a better choice. But that is
>>> not a trivial experiment to make at this time with the system in
>>> production. It's one thing (and still a bad thing to be sure) to
>>> semi-bork the system for an hour or two while I play with
>>> configurations, but it would take a day or so offline to reformat and
>>> restore the data.
>>
>> XFS should bring better performance, but if the issue is not in the FS
>> -> it won't make a difference...
>> What I/O scheduler are you using for the SSDs (you can check via
>> 'cat /sys/block/sdX/queue/scheduler')?
>
># cat /sys/block/vda/queue/scheduler
>[mq-deadline] none

Deadline prioritizes reads over writes in a 2:1 ratio (with the default tunings). You can consider testing 'none' if your SSDs are good.

I see vda, so please share details on the infrastructure, as this is very important. Virtual disks have their limitations, and if you are on a VM there might be a chance to increase the CPU count.

If you are on a VM, I would also recommend using more (in number) and smaller disks in stripe sets (either raid0 via mdadm, or a pure striped LV). And if you are on a VM, there is no reason to reorder your I/O requests inside the guest only to do it again on the hypervisor; in such a case 'none' can bring better performance, although this varies with the workload.

>>> in the past I have tried 2, 4, 8, 16, and 32. Playing with just those
>>> I never noticed that any of them made any difference. Though I might
>>> have some different options now than I did then, so might try these
>>> again throughout the day...
>>
>> Are you talking about server or client event threads (or both)?
>
>It never occurred to me to set them to different values. So far when I
>set one I set the other to the same value.

Yeah, that makes sense.

>>> Thanks again for your time Strahil, if you have any more thoughts I
>>> would love to hear them.
>>
>> Can you check if you use 'noatime' for the bricks? It won't have any
>> effect on the CPU side, but it might help with the I/O.
>
>I checked into this, and I have nodiratime set, but not noatime. From
>what I can gather, it should provide nearly the same performance benefit
>while leaving the atime attribute on the files. You never know, I may
>decide I want those at some point in the future.

All necessary data is in the file attributes on the brick; I doubt you will need access times on the brick itself. Another possibility is to use 'relatime'.

>> I see that your indicator for high load is loadavg, but have you
>> actually checked how many processes are in 'R' or 'D' state?
>> Some monitoring checks can raise loadavg artificially.
>
>Occasionally a batch of processes will be in R state, and I see the D
>state show up from time to time, but mostly everything is S.
>
>> Also, are you using software mirroring (either mdadm or
>> striped/mirrored LVs)?
>
>No, single disk. And I opted not to put gluster on a thin LVM, as I
>don't see myself using LVM snapshots in this scenario.
>
>So, we just moved into a quieter time of the day, but maybe I just
>stumbled onto something. I was trying to figure out if/how I could
>throw more RAM at the problem. The gluster docs say write-behind is not
>a cache unless flush-behind is on, so it seems that is a way to throw
>RAM at it? I put performance.write-behind-window-size: 512MB and
>performance.flush-behind: on, and the whole system calmed down pretty
>much immediately. Could be just timing, though; I will have to see
>tomorrow during business hours whether the system stays at a reasonable
>load.
>
>I will still test the other options you suggested tonight, though; this
>is probably too good to be true.
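As far as I understand it, that combination really does let write-behind act as a RAM buffer: with flush-behind on, flush/close returns to the application before the buffered writes reach the bricks, up to the window size, and the window is per file rather than a single 512MB pool. If you want to double-check what is actually in effect on the volume, something like this should show it (MYVOL is a placeholder for your volume name):

# confirm the write-behind settings currently applied to the volume
gluster volume get MYVOL performance.write-behind-window-size
gluster volume get MYVOL performance.flush-behind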
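Also, when you are watching it tomorrow, a quick way to see whether loadavg corresponds to real runnable or blocked work is to list the processes in 'R' or 'D' state, for example:

# list processes that are currently running (R) or stuck in uninterruptible I/O wait (D)
ps -eo state,pid,comm | awk '$1 ~ /^[RD]/'

If loadavg stays high while that list stays short, the load number itself is not telling you much.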
>Can't thank you enough for your input, Strahil, your help is truly
>appreciated!
>
>>>> Best Regards,
>>>> Strahil Nikolov
________



Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users