Re: Gluster Performance - 12 Gbps SSDs and 10 Gbps NIC

Gilberto Ferreira <gilberto.nunes32@xxxxxxxxx> · Tue, 12 Dec 2023 16:59:39 -0300

FUSE adds some overhead. Take a look at libgfapi:
https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/libgfapi/

I know this doc is somewhat out of date, but it could be a hint.
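
One possible way to see how much of the gap is FUSE overhead is fio's gfapi ioengine, which reaches the volume through libgfapi rather than through the mount. The sketch below only prints the command instead of running it. It assumes fio was built with gfapi support, and it reuses the volume name "data" and the brick address 10.54.95.123 from the original post. The helper name gfapi_fio_cmd is made up for illustration.

```shell
#!/bin/sh
# Print (not run) a fio command that uses the gfapi ioengine, so the write
# test bypasses the FUSE mount. Assumes fio was built with gfapi support.
# gfapi_fio_cmd is a hypothetical helper name, not part of fio or Gluster.
gfapi_fio_cmd() {
    volume=$1   # Gluster volume name, e.g. "data"
    brick=$2    # hostname or IP of one brick server
    printf 'fio --name=test --ioengine=gfapi --volume=%s --brick=%s --size=1G --bs=1M --rw=write --end_fsync=1\n' \
        "$volume" "$brick"
}

gfapi_fio_cmd data 10.54.95.123
```

Comparing this result with the same job on the FUSE mount would give a rough idea of the FUSE share of the slowdown.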

---
Gilberto Nunes Ferreira
(47) 99676-7530 - Whatsapp / Telegram

On Tue, Dec 12, 2023 at 16:29, Danny <dbray925+gluster@xxxxxxxxx> wrote:
Nope, it's not a caching thing. I've tried multiple different types of fio tests, and they all produce the same results: Gbps when hitting the disks locally, slow MB/s when hitting the Gluster FUSE mount.

I've been reading up on gluster-ganesha, and will give that a try.

On Tue, Dec 12, 2023 at 1:58 PM Ramon Selga <ramon.selga@xxxxxxxxx> wrote:

    Dismiss my first question: you have SAS 12 Gbps SSDs. Sorry!

    On 12/12/23 at 19:52, Ramon Selga wrote:

      May I ask which kind of disks you have in this setup? Rotational, SAS/SATA SSD, NVMe?

        Is there a RAID controller with writeback caching?

        It seems to me that your fio test on the local brick gives an unclear result because of caching.

        Try something like this (consider increasing the test file size depending on how much caching memory you have):

        fio --size=16G --name=test --filename=/gluster/data/brick/wow \
            --bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 \
            --rw=write --refill_buffers --end_fsync=1 --iodepth=200 \
            --ioengine=libaio

      Also remember that a replica 3 arbiter 1 volume writes synchronously to two data bricks, halving the throughput of your network backend.
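
As a rough illustration of that halving, the sketch below computes an upper bound on client write throughput. It assumes both data copies leave through the same 10 Gbps NIC, which may not hold when the client runs on one of the brick nodes, and it ignores protocol overhead and the arbiter's metadata traffic. The helper name write_ceiling_mbs is hypothetical.

```shell
#!/bin/sh
# Rough upper bound on client write throughput (MB/s) when every write is
# sent to N data bricks over one link. Ignores protocol overhead and the
# arbiter's metadata-only traffic. write_ceiling_mbs is a hypothetical name.
write_ceiling_mbs() {
    link_gbps=$1     # NIC speed in Gbit/s
    data_copies=$2   # data bricks written per client write
    awk -v g="$link_gbps" -v n="$data_copies" \
        'BEGIN { printf "%.0f\n", g * 1000 / 8 / n }'
}

write_ceiling_mbs 10 2   # 10 Gbps link, two data bricks: prints 625
```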

      Try a similar fio run on the Gluster mount, but I rarely see more than 300 MB/s writing sequentially on a single FUSE mount, even with an NVMe backend. On the other hand, with 4 to 6 clients, you can easily reach 1.5 GB/s of aggregate throughput.
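
A small sketch of the "similar fio" idea: the helper below prints the same job for any target directory, so the local brick and the FUSE mount can be measured with identical parameters. The flags are the ones from the command above, and /data is the mount point from the original post. The helper name fio_seq_write_cmd is hypothetical, and whether --direct=1 behaves the same through FUSE is not something this thread confirms.

```shell
#!/bin/sh
# Print (not run) the same fio job pointed at a given directory, so the local
# brick and the Gluster mount are measured with identical parameters.
# fio_seq_write_cmd is a hypothetical helper name.
fio_seq_write_cmd() {
    dir=$1   # directory to write the test file into
    printf 'fio --size=16G --name=test --filename=%s/wow --bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio\n' "$dir"
}

fio_seq_write_cmd /gluster/data/brick   # local brick
fio_seq_write_cmd /data                 # Gluster FUSE mount
```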

      To start, I think it is better to try with the default parameters for your replica volume.
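
One way to get back to defaults, sketched under assumptions: the helper below prints a `gluster volume reset` command for each option the original post changed (the two explicit `volume set` calls plus the options listed after applying the metadata-cache group). It only prints the commands; the helper name reset_cmds is hypothetical, and the option list is copied from the `gluster volume info` output quoted further down.

```shell
#!/bin/sh
# Print (not run) one "gluster volume reset" command per option that the
# original post changed from its default. reset_cmds is a hypothetical name.
reset_cmds() {
    vol=$1
    for opt in \
        network.ping-timeout \
        performance.client-io-threads \
        network.inode-lru-limit \
        performance.md-cache-timeout \
        performance.cache-invalidation \
        performance.stat-prefetch \
        features.cache-invalidation-timeout \
        features.cache-invalidation
    do
        printf 'sudo gluster volume reset %s %s\n' "$vol" "$opt"
    done
}

reset_cmds data
```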

      Best regards!

      Ramon

      On 12/12/23 at 19:10, Danny wrote:

        Sorry, I noticed that too after I posted, so I instantly upgraded to 10. The issue remains.

          On Tue, Dec 12, 2023 at 1:09 PM Gilberto Ferreira <gilberto.nunes32@xxxxxxxxx> wrote:

            I strongly suggest you update to version 10 or higher.

              It comes with significant performance improvements.


              On Tue, Dec 12, 2023 at 13:03, Danny <dbray925+gluster@xxxxxxxxx> wrote:

                 MTU is already 9000, and as you can see from the iperf results, I've got a nice, fast connection between the nodes.

                  On Tue, Dec 12, 2023 at 9:49 AM Strahil Nikolov <hunter86_bg@xxxxxxxxx> wrote:

                     Hi,

                      Let’s try the simple things:

                      Check whether you can use MTU 9000 and, if possible, set it on the bond slaves and the bond devices:
                       ping GLUSTER_PEER -c 10 -M do -s 8972
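
For reference, the 8972 in that ping is the MTU minus the IPv4 and ICMP headers. A minimal sketch of the arithmetic, assuming an IPv4 header without options (20 bytes) and the 8-byte ICMP echo header; icmp_payload is a hypothetical helper name:

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header (no options) minus 8 bytes of ICMP header.
# icmp_payload is a hypothetical helper name.
icmp_payload() {
    mtu=$1
    echo $(( mtu - 20 - 8 ))
}

icmp_payload 9000   # prints 8972
icmp_payload 1500   # prints 1472
```

If the ping with `-M do -s 8972` goes through, every hop on the path accepts 9000-byte frames.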

                        Then try following the recommendations from https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/chap-configuring_red_hat_storage_for_enhancing_performance

                        Best Regards,
                        Strahil Nikolov

                        On Monday, December 11, 2023, 3:32 PM, Danny <dbray925+gluster@xxxxxxxxx> wrote:

                              Hello list, I'm hoping someone can let me know what setting I missed.

                              Hardware:
                              Dell R650 servers, Dual 24 Core Xeon 2.8 GHz, 1 TB RAM
                              8x SSDs, Negotiated Speed 12 Gbps
                              PERC H755 Controller - RAID 6

                              Created virtual "data" disk from the above 8 SSD drives, for a ~20 TB /dev/sdb

                              OS:
                              CentOS Stream
                              kernel-4.18.0-526.el8.x86_64
                              glusterfs-7.9-1.el8.x86_64

                              IPERF Test between nodes:

                                [ ID] Interval           Transfer     Bitrate         Retr
                                [  5]   0.00-10.00  sec  11.5 GBytes  9.90 Gbits/sec    0             sender
                                [  5]   0.00-10.04  sec  11.5 GBytes  9.86 Gbits/sec                  receiver

                              All good there. ~10 Gbps, as expected.

                              LVM Install:
                              export DISK="/dev/sdb"
                              sudo parted --script $DISK "mklabel gpt"
                              sudo parted --script $DISK "mkpart primary 0% 100%"
                              sudo parted --script $DISK "set 1 lvm on"
                              sudo pvcreate --dataalignment 128K /dev/sdb1
                              sudo vgcreate --physicalextentsize 128K gfs_vg /dev/sdb1
                              sudo lvcreate -L 16G -n gfs_pool_meta gfs_vg
                              sudo lvcreate -l 95%FREE -n gfs_pool gfs_vg
                              sudo lvconvert --chunksize 1280K --thinpool gfs_vg/gfs_pool --poolmetadata gfs_vg/gfs_pool_meta
                              sudo lvchange --zero n gfs_vg/gfs_pool
                              sudo lvcreate -V 19.5TiB --thinpool gfs_vg/gfs_pool -n gfs_lv
                              sudo mkfs.xfs -f -i size=512 -n size=8192 -d su=128k,sw=10 /dev/mapper/gfs_vg-gfs_lv

                              sudo vim /etc/fstab
                              /dev/mapper/gfs_vg-gfs_lv   /gluster/data/brick   xfs   rw,inode64,noatime,nouuid 0 0

                              sudo systemctl daemon-reload && sudo mount -a

                              fio --name=test --filename=/gluster/data/brick/wow --size=1G --readwrite=write

                              Run status group 0 (all jobs):
                                WRITE: bw=2081MiB/s (2182MB/s), 2081MiB/s-2081MiB/s (2182MB/s-2182MB/s), io=1024MiB (1074MB), run=492-492msec

                              All good there. 2182MB/s =~ 17.5 Gbps. Nice!

                              Gluster install:
                              export NODE1='10.54.95.123'
                              export NODE2='10.54.95.124'
                              export NODE3='10.54.95.125'
                              sudo gluster peer probe $NODE2
                              sudo gluster peer probe $NODE3
                              sudo gluster volume create data replica 3 arbiter 1 $NODE1:/gluster/data/brick $NODE2:/gluster/data/brick $NODE3:/gluster/data/brick force
                              sudo gluster volume set data network.ping-timeout 5
                              sudo gluster volume set data performance.client-io-threads on
                              sudo gluster volume set data group metadata-cache
                              sudo gluster volume start data
                              sudo gluster volume info all

                                Volume Name: data
                                Type: Replicate
                                Volume ID: b52b5212-82c8-4b1a-8db3-52468bc0226e
                                Status: Started
                                Snapshot Count: 0
                                Number of Bricks: 1 x (2 + 1) = 3
                                Transport-type: tcp
                                Bricks:
                                Brick1: 10.54.95.123:/gluster/data/brick
                                Brick2: 10.54.95.124:/gluster/data/brick
                                Brick3: 10.54.95.125:/gluster/data/brick (arbiter)
                                Options Reconfigured:
                                network.inode-lru-limit: 200000
                                performance.md-cache-timeout: 600
                                performance.cache-invalidation: on
                                performance.stat-prefetch: on
                                features.cache-invalidation-timeout: 600
                                features.cache-invalidation: on
                                network.ping-timeout: 5
                                transport.address-family: inet
                                storage.fips-mode-rchecksum: on
                                nfs.disable: on
                                performance.client-io-threads: on

                              sudo vim /etc/fstab
                              localhost:/data   /data   glusterfs   defaults,_netdev   0 0

                              sudo systemctl daemon-reload && sudo mount -a
                              fio --name=test --filename=/data/wow --size=1G --readwrite=write

                              Run status group 0 (all jobs):
                                WRITE: bw=109MiB/s (115MB/s), 109MiB/s-109MiB/s (115MB/s-115MB/s), io=1024MiB (1074MB), run=9366-9366msec

                              Oh no, what's wrong? From 2182MB/s down to only 115MB/s? What am I missing? I'm not expecting the above ~17 Gbps, but I'm thinking it should at least be close(r) to ~10 Gbps.

                              Any suggestions?
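
To put the two fio results in the same unit as the NIC speed, here is a small conversion sketch. It assumes fio's MB/s figure is decimal (10^6 bytes per MB); mbs_to_gbps is a hypothetical helper name.

```shell
#!/bin/sh
# Convert fio's decimal MB/s figure to Gbit/s (1 MB = 10^6 bytes, 8 bits/byte).
# mbs_to_gbps is a hypothetical helper name.
mbs_to_gbps() {
    awk -v m="$1" 'BEGIN { printf "%.2f\n", m * 8 / 1000 }'
}

mbs_to_gbps 2182   # local brick result: prints 17.46
mbs_to_gbps 115    # FUSE mount result: prints 0.92
```

So the FUSE mount result is a little under 1 Gbps, roughly a tenth of what iperf measured between the nodes.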

                          ________

                          Community Meeting Calendar:

                          Schedule -

                          Every 2nd and 4th Tuesday at 14:30 IST / 09:00
                          UTC

                          Bridge: https://meet.google.com/cpu-eiue-hvk

                          Gluster-users mailing list

                          Gluster-users@xxxxxxxxxxx

                          https://lists.gluster.org/mailman/listinfo/gluster-users
