On 20 August 2020 3:46:41 GMT+03:00, Computerisms Corporation <bob@xxxxxxxxxxxxxxx> wrote:
>Hi Strahil,
>
>So over the last two weeks, the system has been relatively stable. I
>have powered off both servers at least once, for about 5 minutes each
>time. The server came up and auto-healed what it needed to, so all of
>that part is working as expected.
>
>I will answer things inline and follow with more questions:
>
>>>> Hm... OK. I guess you can try 7.7 whenever it's possible.
>>>
>>> Acknowledged.
>
>Still on my list.
>
>> It could be a bad firmware also. If you get the opportunity, flash
>> the firmware and bump the OS to the max.
>
>The datacenter says everything was up to date as of installation, and
>I'm not really wanting them to take the servers offline for long
>enough to redo all the hardware.
>
>>>>> more number of CPU cycles than needed, increasing the event thread
>>>>> count would enhance the performance of the Red Hat Storage
>>>>> Server." which is why I had it at 8.
>>>> Yeah, but you got only 6 cores and they are not dedicated for
>>>> gluster only. I think that you need to test with lower values.
>
>I figured out my magic number for client/server threads: it should be
>5. I set it to 5 and observed no change I could attribute to it, so I
>tried 4 and got the same thing; no visible effect.
>
>>>>> right now the only suggested parameter I haven't played with is
>>>>> the performance.io-thread-count, which I currently have at 64.
>>> not really sure what would be a reasonable value for my system.
>> I guess you can try to increase it a little bit and check how it is
>> going.
>
>It turns out that if you try to set this higher than 64, you get an
>error saying 64 is the max.
>
>>>> What I/O scheduler are you using for the SSDs (you can check via
>>>> 'cat /sys/block/sdX/queue/scheduler')?
>>>
>>> # cat /sys/block/vda/queue/scheduler
>>> [mq-deadline] none
>>
>> Deadline prioritizes reads in a 2:1 ratio /default tunings/. You
>> can consider testing 'none' if your SSDs are good.
>
>I did this. I would say it did have a positive effect, but it was a
>minimal one.
>
>> I see vda, please share details on the infra as this is very
>> important. Virtual disks have their limitations, and if you are on a
>> VM, then there might be a chance to increase the CPU count.
>> If you are on a VM, I would recommend you to use more (in number)
>> and smaller disks in stripe sets (either raid0 via mdadm, or a pure
>> striped LV).
>> Also, if you are on a VM -> there is no reason to reorder your I/O
>> requests in the VM, just to do it again on the hypervisor. In such
>> a case 'none' can bring better performance, but this varies with the
>> workload.
>
>Hm, this is a good question, one I have been asking the datacenter for
>a while, but they are a little bit slippery on what exactly it is they
>have going on there. They advertise the servers as metal with a
>virtual layer. The virtual layer is so you can log into a site and
>power the server down or up, mount an ISO to boot from, access a
>console, and some other nifty things. You can't any more, but when
>they first introduced the system, you could even access the BIOS of
>the server. But apparently, and they swear up and down by this, it is
>a physical server, with real dedicated SSDs and real sticks of RAM. I
>have found virtio and qemu as loaded kernel modules, so certainly
>there is something virtual involved, but other than that and their
>nifty little tools, it has always acted and worked like a metal server
>to me.
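Those virtio and qemu kernel modules are a strong hint that there is a
hypervisor underneath the "metal". Two quick checks you can run
yourself (just how I would look, assuming the box runs systemd):

systemd-detect-virt        # prints 'none' on bare metal, otherwise the hypervisor type (kvm, qemu, ...)
lscpu | grep -i hypervisor # lscpu shows a 'Hypervisor vendor:' line only when running virtualized
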
You can also use the 'virt-what' binary to find out if, and what type
of, virtualization is used. I have a suspicion you are on top of
OpenStack (which uses Ceph), so I guess you can try to get more info.
For example, an OpenStack instance can have '0x1af4' in
'/sys/block/vdX/device/vendor' (replace X with the actual device
letter).

Another check could be:

/usr/lib/udev/scsi_id -g -u -d /dev/vda

And you can also take a look with smartctl from the smartmontools
package:

smartctl -a /dev/vdX

>> All necessary data is in the file attributes on the brick. I doubt
>> you will need to have access times on the brick itself. Another
>> possibility is to use 'relatime'.
>
>I remounted all bricks with noatime, no significant difference.
>
>>> cache unless flush-behind is on. So it seems that is a way to throw
>>> RAM at it? I put performance.write-behind-window-size: 512MB and
>>> performance.flush-behind: on and the whole system calmed down pretty
>>> much immediately. could be just timing, though, will have to see
>>> tomorrow during business hours whether the system stays at a
>>> reasonable
>
>I tried increasing this to its max of 1GB; no noticeable change from
>512MB.
>
>The 2nd server is not acting in line with the first server. glusterfsd
>processes are running at 50-80% of a core each, with one brick often
>going over 200%, whereas they usually stick to 30-45% on the first
>server. Apache processes consume as much as 90% of a core, whereas
>they rarely go over 15% on the first server, and they frequently stack
>up to having more than 100 running at once, which drives the load
>average up to 40-60. It's very much like the first server was before I
>found the flush-behind setting, but not as bad; at least it isn't
>going completely non-responsive.
>
>Additionally, it is still taking an excessive time to load the first
>page of most sites. I am guessing I need to increase read speeds to
>fix this, so I have played with
>performance.io-cache/cache-max-file-size (slight positive change),
>read-ahead/read-ahead-page-count (negative change until the page count
>was set to its max of 16, then no noticeable difference), and
>rda-cache-limit/rda-request-size (minimal positive effect). I still
>have RAM to spare, so it would be nice if I could use it to improve
>things on the read side, but I have found no magic bullet like
>flush-behind was.
>
>I found a good number of additional options to try and have been going
>a little crazy with them; I will post them at the bottom. I found a
>post that suggested mount options are also important:
>
>https://lists.gluster.org/pipermail/gluster-users/2018-September/034937.html
>
>I confirmed these are in the man pages, so I tried unmounting and
>re-mounting with the -o option to include them, like so:
>
>mount -t glusterfs moogle:webisms /Computerisms/ -o
>negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5
>
>But I don't think they are working:
>
>/# mount | grep glus
>moogle:webisms on /Computerisms type fuse.glusterfs
>(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
>
>I would be grateful if there are any other suggestions anyone can
>think of.
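About the fuse mount options: 'mount' only shows the generic fuse
flags (allow_other, max_read, ...), so that output will not tell you
whether the glusterfs-specific options took effect. As far as I know
the mount.glusterfs helper turns the -o options into command-line
arguments of the glusterfs client process, so I would check there and
in the client log (the log file name is usually derived from the mount
point, but it may differ on your distro):

ps ax | grep '[g]lusterfs' | grep Computerisms   # look for --attribute-timeout=30, --negative-timeout=10, ...
less /var/log/glusterfs/Computerisms.log         # the client logs the arguments it was started with
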
>
>root@moogle:/# gluster v info
>
>Volume Name: webisms
>Type: Distributed-Replicate
>Volume ID: 261901e7-60b4-4760-897d-0163beed356e
>Status: Started
>Snapshot Count: 0
>Number of Bricks: 2 x (2 + 1) = 6
>Transport-type: tcp
>Bricks:
>Brick1: mooglian:/var/GlusterBrick/replset-0/webisms-replset-0
>Brick2: moogle:/var/GlusterBrick/replset-0/webisms-replset-0
>Brick3: moogle:/var/GlusterBrick/replset-0-arb/webisms-replset-0-arb (arbiter)
>Brick4: moogle:/var/GlusterBrick/replset-1/webisms-replset-1
>Brick5: mooglian:/var/GlusterBrick/replset-1/webisms-replset-1
>Brick6: mooglian:/var/GlusterBrick/replset-1-arb/webisms-replset-1-arb (arbiter)
>Options Reconfigured:
>performance.rda-cache-limit: 1GB
>performance.client-io-threads: off
>nfs.disable: on
>storage.fips-mode-rchecksum: off
>transport.address-family: inet
>performance.stat-prefetch: on
>network.inode-lru-limit: 200000
>performance.write-behind-window-size: 1073741824
>performance.readdir-ahead: on
>performance.io-thread-count: 64
>performance.cache-size: 12GB
>server.event-threads: 4
>client.event-threads: 4
>performance.nl-cache-timeout: 600
>auth.allow: xxxxxx
>performance.open-behind: off
>performance.quick-read: off
>cluster.lookup-optimize: off
>cluster.rebal-throttle: lazy
>features.cache-invalidation: on
>features.cache-invalidation-timeout: 600
>performance.cache-invalidation: on
>performance.md-cache-timeout: 600
>performance.flush-behind: on
>cluster.read-hash-mode: 0
>performance.strict-o-direct: on
>cluster.readdir-optimize: on
>cluster.lookup-unhashed: off
>performance.cache-refresh-timeout: 30
>performance.enable-least-priority: off
>cluster.choose-local: on
>performance.rda-request-size: 128KB
>performance.read-ahead: on
>performance.read-ahead-page-count: 16
>performance.cache-max-file-size: 5MB
>performance.io-cache: on
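That is a lot of non-default options. When tuning I try to change only
one of them at a time, so the effect (or lack of one) is attributable,
and reset whatever did not help. Roughly like this with the plain
gluster CLI, using performance.read-ahead-page-count as an example:

gluster volume get webisms performance.read-ahead-page-count    # show the current value
gluster volume set webisms performance.read-ahead-page-count 16
# measure during business hours, then put it back to the default if nothing improved:
gluster volume reset webisms performance.read-ahead-page-count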