Re: Two VMS as arbiter...

These are the options that worked best in my tests to avoid split-brain:

gluster vol set VMS cluster.heal-timeout 20
gluster volume heal VMS enable
gluster vol set VMS cluster.quorum-reads false
gluster vol set VMS cluster.quorum-count 1
gluster vol set VMS network.ping-timeout 2
gluster volume set VMS cluster.favorite-child-policy mtime
gluster volume heal VMS granular-entry-heal enable
gluster volume set VMS cluster.data-self-heal-algorithm full
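
To double-check what actually got applied, something like this should work (just a quick sketch; "gluster volume get" needs a reasonably recent Gluster release):

gluster volume get VMS cluster.favorite-child-policy
# or dump every option value for the volume:
gluster volume get VMS all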

For
gluster volume set VMS cluster.favorite-child-policy mtime
I had originally used "size", but I read in several places that mtime is better...

I ran several exhaustive tests: powering off hosts, migrating VMs, creating folders and files inside the VMs, activating HA, etc.
After the "crash", i.e. after the host that was restarted/shut down comes back, the heal info for the volume looks like this:
Brick pve02:/DATA/brick
/images/100/vm-100-disk-0.qcow2 - Possibly undergoing heal
Status: Connected
Number of entries: 1

This indicates that healing is taking place.
After a few minutes or hours, depending on hardware speed, the "Possibly undergoing heal" flag disappears.
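
To follow the heal progress I simply kept polling the heal info; something like this should do (a sketch, the exact output differs between Gluster versions):

watch -n 10 'gluster volume heal VMS info'
# per-brick count of entries still pending heal:
gluster volume heal VMS statistics heal-count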

At no time was there data loss.
While the heal was still in progress I migrated the VM from one side to the other, also without problems.
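
On the Proxmox side that was just a normal live migration, roughly like this (a sketch; 100 is the VM ID from the heal output above and pve02 the target node):

qm migrate 100 pve02 --online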

In the tests I performed, healing a 10 GB VM disk with 4 GB in use took about 30 minutes.
Keep in mind that I'm running this under VirtualBox, with 2 VMs of 2 GB RAM each, each VM being a Proxmox host.
In a real environment this time should be much shorter; it also depends on the size of the VM's disk!

Cheers

---
Gilberto Nunes Ferreira


On Thu, Aug 6, 2020 at 14:14, Strahil Nikolov <hunter86_bg@xxxxxxxxx> wrote:
The settings I have in my virt group are:
[root@ovirt1 ~]# cat /var/lib/glusterd/groups/virt
performance.quick-read=off
performance.read-ahead=off
performance.io-cache=off
performance.low-prio-threads=32
network.remote-dio=enable
cluster.eager-lock=enable
cluster.quorum-type=auto
cluster.server-quorum-type=server
cluster.data-self-heal-algorithm=full
cluster.locking-scheme=granular
cluster.shd-max-threads=8
cluster.shd-wait-qlength=10000
features.shard=on
user.cifs=off
cluster.choose-local=off
client.event-threads=4
server.event-threads=4
performance.client-io-threads=on
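
If you want to apply all of them in one shot, the whole group can be set on the volume; roughly like this, assuming your packages ship the group file above (VMS being the volume name used earlier in this thread):

gluster volume set VMS group virt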

I'm not sure whether sharded files are treated as big files or not. If your brick disks are faster than your network bandwidth, you can enable 'cluster.choose-local'.

Keep in mind that some users report issues with sparse qcow2 images during intensive writes (the suspicion is that the shard xlator cannot create the shards fast enough; the default shard size of 64MB is much smaller than the 512MB size supported by Red Hat). I would recommend using preallocated qcow2 disks as much as possible, or bumping the shard size.
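
Roughly like this (a sketch; the image name and sizes are made up, and 512MB matches the supported shard size mentioned above):

# create the qcow2 image preallocated ("falloc" is usually fast; "full" writes out zeroes)
qemu-img create -f qcow2 -o preallocation=falloc vm-100-disk-1.qcow2 32G

# bump the shard size; this only affects files created after the change
gluster volume set VMS features.shard-block-size 512MB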

Sharding was developed especially for virt usage.

Consider using another cluster.favorite-child-policy, as all shards have the same size.
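
For example (a sketch, using the volume name from this thread):

gluster volume set VMS cluster.favorite-child-policy ctime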

Best Regards,
Strahil Nikolov



On August 6, 2020 at 16:37:07 GMT+03:00, Gilberto Nunes <gilberto.nunes32@xxxxxxxxx> wrote:
>Oh, I see... I was confused by the terms... Now that I've read these,
>everything becomes clear...
>
>https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/shard/
>
>https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/configuring_red_hat_virtualization_with_red_hat_gluster_storage/chap-hosting_virtual_machine_images_on_red_hat_storage_volumes
>
>
>Should I use cluster.granular-entry-heal too, since I am working
>with big files?
>
>Thanks
>
>---
>Gilberto Nunes Ferreira
>
>(47) 3025-5907
>(47) 99676-7530 - Whatsapp / Telegram
>
>Skype: gilberto.nunes36
>
>
>
>
>
>On Thu, Aug 6, 2020 at 09:32, Gilberto Nunes <
>gilberto.nunes32@xxxxxxxxx> wrote:
>
>> What do you mean by "sharding"? Do you mean sharing folders between two
>> servers to host qcow2 or raw VM images?
>> Here I am using Proxmox, which uses QEMU but not virsh.
>>
>> Thanks
>> ---
>> Gilberto Nunes Ferreira
>>
>> (47) 3025-5907
>> (47) 99676-7530 - Whatsapp / Telegram
>>
>> Skype: gilberto.nunes36
>>
>>
>>
>>
>>
>> On Thu, Aug 6, 2020 at 01:09, Strahil Nikolov <
>> hunter86_bg@xxxxxxxxx> wrote:
>>
>>> As you mentioned qcow2 files, check the virt group
>>> (/var/lib/glusterd/groups/virt or something like that). It has optimal
>>> settings for VMs and is used by oVirt.
>>>
>>> WARNING: If you decide to enable the group, which will also enable
>>> sharding, NEVER EVER DISABLE SHARDING -> ONCE ENABLED, IT STAYS ENABLED!!!
>>> Sharding helps reduce locking during replica heals.
>>>
>>> WARNING2: As the virt group uses sharding (files are split into fixed
>>> shard-size pieces), you should consider cluster.favorite-child-policy
>>> with value ctime/mtime.
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>> On August 6, 2020 at 1:56:58 GMT+03:00, Gilberto Nunes <
>>> gilberto.nunes32@xxxxxxxxx> wrote:
>>> >Ok...Thanks a lot Strahil
>>> >
>>> >This "gluster volume set VMS cluster.favorite-child-policy size" did
>>> >the trick for me here!
>>> >
>>> >Cheers
>>> >---
>>> >Gilberto Nunes Ferreira
>>> >
>>> >(47) 3025-5907
>>> >(47) 99676-7530 - Whatsapp / Telegram
>>> >
>>> >Skype: gilberto.nunes36
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >On Wed, Aug 5, 2020 at 18:15, Strahil Nikolov
>>> ><hunter86_bg@xxxxxxxxx>
>>> >wrote:
>>> >
>>> >> This could happen if you have pending heals. Did you reboot that
>>> >> node recently?
>>> >> Did you set automatic unsplit-brain?
>>> >>
>>> >> Check for pending heals and files in splitbrain.
>>> >>
>>> >> If not, you can check
>>> >>
>>>
>>https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/
>>> >> (look at point 5).
>>> >>
>>> >> Best Regards,
>>> >> Strahil Nikolov
>>> >>
>>> >> On August 5, 2020 at 23:41:57 GMT+03:00, Gilberto Nunes <
>>> >> gilberto.nunes32@xxxxxxxxx> wrote:
>>> >> >I'm in trouble here.
>>> >> >When I shut down the pve01 server, the shared folder over glusterfs
>>> >> >is EMPTY!
>>> >> >There is supposed to be a qcow2 file inside it.
>>> >> >The content shows up again right after I power pve01 back up...
>>> >> >
>>> >> >Some advice?
>>> >> >
>>> >> >
>>> >> >Thanks
>>> >> >
>>> >> >---
>>> >> >Gilberto Nunes Ferreira
>>> >> >
>>> >> >(47) 3025-5907
>>> >> >(47) 99676-7530 - Whatsapp / Telegram
>>> >> >
>>> >> >Skype: gilberto.nunes36
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >On Wed, Aug 5, 2020 at 11:07, Gilberto Nunes <
>>> >> >gilberto.nunes32@xxxxxxxxx> wrote:
>>> >> >
>>> >> >> Well...
>>> >> >> I did the following:
>>> >> >>
>>> >> >> gluster vol create VMS replica 3 arbiter 1 pve01:/DATA/brick1
>>> >> >> pve02:/DATA/brick1.5 pve01:/DATA/arbiter1.5 pve02:/DATA/brick2
>>> >> >> pve01:/DATA/brick2.5 pve02:/DATA/arbiter2.5 force
>>> >> >>
>>> >> >> And now I have:
>>> >> >> gluster vol info
>>> >> >>
>>> >> >> Volume Name: VMS
>>> >> >> Type: Distributed-Replicate
>>> >> >> Volume ID: 1bd712f5-ccb9-4322-8275-abe363d1ffdd
>>> >> >> Status: Started
>>> >> >> Snapshot Count: 0
>>> >> >> Number of Bricks: 2 x (2 + 1) = 6
>>> >> >> Transport-type: tcp
>>> >> >> Bricks:
>>> >> >> Brick1: pve01:/DATA/brick1
>>> >> >> Brick2: pve02:/DATA/brick1.5
>>> >> >> Brick3: pve01:/DATA/arbiter1.5 (arbiter)
>>> >> >> Brick4: pve02:/DATA/brick2
>>> >> >> Brick5: pve01:/DATA/brick2.5
>>> >> >> Brick6: pve02:/DATA/arbiter2.5 (arbiter)
>>> >> >> Options Reconfigured:
>>> >> >> cluster.quorum-count: 1
>>> >> >> cluster.quorum-reads: false
>>> >> >> cluster.self-heal-daemon: enable
>>> >> >> cluster.heal-timeout: 10
>>> >> >> storage.fips-mode-rchecksum: on
>>> >> >> transport.address-family: inet
>>> >> >> nfs.disable: on
>>> >> >> performance.client-io-threads: off
>>> >> >>
>>> >> >> These values I set myself, in order to see if I could improve the
>>> >> >> time for the volume to become available when pve01 goes down via
>>> >> >> ifupdown:
>>> >> >> cluster.quorum-count: 1
>>> >> >> cluster.quorum-reads: false
>>> >> >> cluster.self-heal-daemon: enable
>>> >> >> cluster.heal-timeout: 10
>>> >> >>
>>> >> >> Nevertheless, it took more than 1 minute for the volume VMS to become
>>> >> >> available on the other host (pve02).
>>> >> >> Is there any trick to reduce this time?
>>> >> >>
>>> >> >> Thanks
>>> >> >>
>>> >> >> ---
>>> >> >> Gilberto Nunes Ferreira
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> On Wed, Aug 5, 2020 at 08:57, Gilberto Nunes <
>>> >> >> gilberto.nunes32@xxxxxxxxx> wrote:
>>> >> >>
>>> >> >>> Hmm, I see... like this:
>>> >> >>> [image: image.png]
>>> >> >>> ---
>>> >> >>> Gilberto Nunes Ferreira
>>> >> >>>
>>> >> >>> (47) 3025-5907
>>> >> >>> (47) 99676-7530 - Whatsapp / Telegram
>>> >> >>>
>>> >> >>> Skype: gilberto.nunes36
>>> >> >>>
>>> >> >>>
>>> >> >>>
>>> >> >>>
>>> >> >>>
>>> >> >>> On Wed, Aug 5, 2020 at 02:14, Computerisms Corporation <
>>> >> >>> bob@xxxxxxxxxxxxxxx> wrote:
>>> >> >>>
>>> >> >>>> check the example of the chained configuration on this page:
>>> >> >>>>
>>> >> >>>>
>>> >> >>>>
>>> >> >
>>> >>
>>> >
>>>
>https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/administration_guide/creating_arbitrated_replicated_volumes
>>> >> >>>>
>>> >> >>>> and apply it to two servers...
>>> >> >>>>
>>> >> >>>> On 2020-08-04 8:25 p.m., Gilberto Nunes wrote:
>>> >> >>>> > Hi Bob!
>>> >> >>>> >
>>> >> >>>> > Could you please send me more details about this
>>> >> >>>> > configuration?
>>> >> >>>> > I will appreciate that!
>>> >> >>>> >
>>> >> >>>> > Thank you
>>> >> >>>> > ---
>>> >> >>>> > Gilberto Nunes Ferreira
>>> >> >>>> >
>>> >> >>>> > (47) 3025-5907
>>> >> >>>> >
>>> >> >>>> > (47) 99676-7530 - Whatsapp / Telegram
>>> >> >>>> >
>>> >> >>>> > Skype: gilberto.nunes36
>>> >> >>>> >
>>> >> >>>> >
>>> >> >>>> >
>>> >> >>>> >
>>> >> >>>> >
>>> >> >>>> > On Tue, Aug 4, 2020 at 23:47, Computerisms Corporation
>>> >> >>>> > <bob@xxxxxxxxxxxxxxx>
>>> >> >>>> > wrote:
>>> >> >>>> >
>>> >> >>>> >     Hi Gilberto,
>>> >> >>>> >
>>> >> >>>> >     My understanding is there can only be one arbiter per
>>> >> >>>> >     replicated set. I don't have a lot of practice with gluster,
>>> >> >>>> >     so this could be bad advice, but the way I dealt with it on
>>> >> >>>> >     my two servers was to use 6 bricks as distributed-replicated
>>> >> >>>> >     (this is also relatively easy to migrate to 3 servers if
>>> >> >>>> >     that happens for you in the future):
>>> >> >>>> >
>>> >> >>>> >     Server1     Server2
>>> >> >>>> >     brick1      brick1.5
>>> >> >>>> >     arbiter1.5  brick2
>>> >> >>>> >     brick2.5    arbiter2.5
>>> >> >>>> >
>>> >> >>>> >     On 2020-08-04 7:00 p.m., Gilberto Nunes wrote:
>>> >> >>>> >      > Hi there.
>>> >> >>>> >      > I have two physical servers deployed as replica 2 and,
>>> >> >>>> >      > obviously, I got a split-brain.
>>> >> >>>> >      > So I am thinking of using two virtual machines, each one
>>> >> >>>> >      > on a physical server....
>>> >> >>>> >      > Then these two VMs would act as arbiters of the gluster
>>> >> >>>> >      > set....
>>> >> >>>> >      >
>>> >> >>>> >      > Is this doable?
>>> >> >>>> >      >
>>> >> >>>> >      > Thanks
>>> >> >>>> >      >
>>> >> >>>> >
>>> >> >>>>
>>> >> >>>
>>> >>
>>>
>>
________



Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
