Re: Brick Goes Offline After server reboot/Or Gluster Container is restarted, on which a gluster node is running

On February 28, 2020 4:49:45 PM GMT+02:00, Rifat Ucal <rucal@xxxxxxxx> wrote:
>Hi Gluster Team,
>
>
>I am trying to implement glusterfs in podman containers, and it is
>running apart from the problems described below.
>
>
>My observations:
>
>- The bricks on a server go offline when one of the podman
>containers is restarted or the corresponding server is rebooted.
>
>- Although the status of the bricks is offline, replication seems
>to be working, as data is still replicated.
>
>- I see that the data is also replicated on the arbiter
>node, where I was expecting to see only metadata.
>
> 
>
>My configuration:
>
>I created a GlusterFS replicated volume across 3 CentOS 7 nodes, but
>inside podman containers.
>
>The containers on the first and second nodes should hold the normal
>replicas, and the third node should be the arbiter.
>
>After creating the volume and enabling the heal processes, I can also
>see that the third node is marked as the arbiter.
>
>According to the description of the arbiter, the arbiter node should
>store only metadata, but in my configuration the replicated data is
>stored on all bricks, including the arbiter.
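>
>A quick way to check this, as a sketch (testfile is just a hypothetical
>file name; it stands for any file that exists on the mounted volume):
>
># On a data brick (avm1 or avm2) the file shows its real size.
>stat --format='%n %s bytes' /cbricks/brick1/data/testfile
>
># On the arbiter brick (dvm1) a true arbiter keeps only a zero-byte
># file plus its extended attributes (the metadata).
>stat --format='%n %s bytes' /cbricks/brick1/data/testfile
>getfattr -d -m . -e hex /cbricks/brick1/data/testfile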
>
> 
>
>Questions:
>
>- When rebooting one of the servers or restarting one of the glusterfs
>containers, the brick in the restarted container does not come online
>until the gluster volume is stopped and started again. Is there a
>workaround to resolve this problem? (See the force-start sketch after
>these questions.)
>
>- Why does the arbiter node store all the data, although it should
>only hold the metadata needed to restore the replicated data on the
>other nodes? I would not have a problem with replication happening on
>all three nodes; I just need to know whether this behaviour is
>expected.
>
>- Can you give me feedback on whether anyone has experienced the same
>or similar problems with glusterfs running in podman containers?
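>
>For reference, a sketch of the force-start workaround that comes up
>further down in the quoted thread, applied to my volume name (I have
>not verified that it avoids the full stop/start):
>
># Force-start launches any brick processes that are not running,
># without taking the volume down.
>gluster volume start cgvol1 force
>
># Confirm the brick shows Online = Y again.
>gluster volume status cgvol1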
>
>
>Here are my configurations:
>
>On all containers I have CentOS Linux release 7.7.1908 and glusterfs
>version 7.3, and the glusterd service is enabled via systemctl.
>
>
>My gluster volume creation:
>
>gluster volume create cgvol1 replica 2 arbiter 1 transport tcp
>avm1:/cbricks/brick1/data avm2:/cbricks/brick1/data
>dvm1:/cbricks/brick1/data force
>
>
>gluster peer status executed on avm2:
>Number of Peers: 2
>
>Hostname: avm1
>Uuid: 5d1dc6a7-8f34-45a3-a7c9-c69c442b66dc
>State: Peer in Cluster (Connected)
>
>Hostname: dvm1
>Uuid: 310ffd58-28ab-43f1-88d3-1e381bd46ab3
>State: Peer in Cluster (Connected)
>
>
>gluster volume info
>
>Volume Name: cgvol1
>Type: Replicate
>Volume ID: da975178-b68f-410c-884c-a7f635e4381a
>Status: Started
>Snapshot Count: 0
>Number of Bricks: 1 x (2 + 1) = 3
>Transport-type: tcp
>Bricks:
>Brick1: arvm1:/cbricks/brick1/data
>Brick2: avm2:/cbricks/brick1/data
>Brick3: devm1:/cbricks/brick1/data (arbiter)
>Options Reconfigured:
>cluster.self-heal-daemon: on
>cluster.entry-self-heal: on
>cluster.metadata-self-heal: on
>cluster.data-self-heal: on
>transport.address-family: inet
>storage.fips-mode-rchecksum: on
>nfs.disable: on
>performance.client-io-threads: off
>
>
>gluster volume status
>Status of volume: cgvol1
>Gluster process                             TCP Port  RDMA Port  Online  Pid
>------------------------------------------------------------------------------
>Brick avm1:/cbricks/brick1/data             49152     0          Y       516
>Brick avm2:/cbricks/brick1/data             49152     0          Y       353
>Brick dvm1:/cbricks/brick1/data             49152     0          Y       572
>Self-heal Daemon on localhost               N/A       N/A        Y       537
>Self-heal Daemon on dvm1                    N/A       N/A        Y       593
>Self-heal Daemon on avm2                    N/A       N/A        Y       374
>
>Task Status of Volume cgvol1
>------------------------------------------------------------------------------
>There are no active volume tasks
>
>gluster volume heal cgvol1 info
>Brick avm1:/cbricks/brick1/data
>Status: Connected
>Number of entries: 0
>
>Brick avm2:/cbricks/brick1/data
>Status: Connected
>Number of entries: 0
>
>Brick dvm1:/cbricks/brick1/data
>Status: Connected
>Number of entries: 0
>
>
>
>Best Regards,
>
>Rifat Ucal
>
>
>
>> Jorick Astrego <jorick@xxxxxxxxxxx> wrote on 14 February 2020 at 10:10:
>> 
>> 
>>     Hi,
>> 
>>     It looks like you have a two node setup?
>> 
>>     Then it's expected, as with two nodes you don't have quorum and
>>     this can lead to split-brain.
>> 
>>     To have HA, add another node or an arbiter node.
>> 
>>    
>https://docs.gluster.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>> 
>>     You can also modify the quorum settings, but then you shouldn't
>>     be too attached to the data you have on it.
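>> 
>>     A minimal sketch of both options (the host and brick path for the
>>     new arbiter are placeholders):
>> 
>>     # Convert an existing replica-2 volume into replica 2 + arbiter 1
>>     # by adding a third, arbiter brick.
>>     gluster volume add-brick gv01 replica 3 arbiter 1 server3:/bricks/0/gv0
>> 
>>     # Or tune the quorum-related options instead.
>>     gluster volume set gv01 cluster.quorum-type auto
>>     gluster volume set gv01 cluster.server-quorum-type server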
>> 
>>     Regards, Jorick
>> 
>>     On 2/14/20 9:27 AM, Cloud Udupi wrote:
>> 
>> >         Hi,
>> > 
>> >         I am new to glusterfs. I have used this guide on
>CentOS-7.6. 
>> >        
>https://microdevsys.com/wp/glusterfs-configuration-and-setup-w-nfs-ganesha-for-an-ha-nfs-cluster/
>> > 
>> >         glusterfs --version
>> >         glusterfs 7.2
>> > 
>> >         Firewall is disabled. Self heal is enabled.
>> >         Everything works fine until I reboot one of the servers.
>When the server reboots the brick doesn't come online.
>> > 
>> >         gluster volume status
>> > 
>> >         Status of volume: gv01
>> >         Gluster process                             TCP Port  RDMA Port  Online  Pid
>> >         ------------------------------------------------------------------------------
>> >         Brick server1:/bricks/0/gv0                 N/A       N/A        N       N/A
>> >         Brick server2:/bricks/0/gv0                 49152     0          Y       99870
>> >         Self-heal Daemon on localhost               N/A       N/A        Y       109802
>> >         Self-heal Daemon on server1                 N/A       N/A        Y       2142
>> >          
>> >         Task Status of Volume gv01
>> >         ------------------------------------------------------------------------------
>> >         There are no active volume tasks
>> > 
>> >         gluster volume heal gv01
>> > 
>> >         Launching heal operation to perform index self heal on
>volume gv01 has been unsuccessful:
>> >          
>> >         Glusterd Syncop Mgmt brick op 'Heal' failed. Please check
>glustershd log file for details.
>> > 
>> >         gluster volume heal gv01 info
>> > 
>> >         Brick server1:/bricks/0/gv0
>> >         Status: Transport endpoint is not connected
>> >          
>> >         Number of entries: -
>> > 
>> >          
>> >         When I do "gluster volume start gv01 force" brick starts.
>> > 
>> >         I want the brick to come online automatically after the
>reboot. I have attached log file.
>> >         Please help.
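>> > 
>> >         (For reference, I believe the logs mentioned above live at the
>> >         gluster defaults; the brick log name is just the brick path with
>> >         '/' replaced by '-':)
>> > 
>> >         # Self-heal daemon log referenced by the heal error.
>> >         tail -n 100 /var/log/glusterfs/glustershd.log
>> > 
>> >         # Brick log for /bricks/0/gv0.
>> >         tail -n 100 /var/log/glusterfs/bricks/bricks-0-gv0.log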
>> > 
>> >         Regards,
>> >         Mark.
>> > 
>> > 
>> 
>> 
>>     Met vriendelijke groet, With kind regards,
>> 
>>     Jorick Astrego
>> 
>>     Netbulae Virtualization Experts
>> 
>>     ---------------------------------------------
>>     Tel: 053 20 30 270 	info@xxxxxxxxxxx 	Staalsteden 4-3A 	KvK
>08198180
>>     Fax: 053 20 30 271 	www.netbulae.eu 	7547 TA Enschede 	BTW
>NL821234584B01
>> 
>> 
>>     ---------------------------------------------
>> 
>> 
>
>
> 
>
>> 
>
>
> 

Hi Rifat,

Can you reproduce the same behaviour on VMs or physical machines?
If yes, then it could be an issue in the gluster version you are using; otherwise it is likely related to the containerization of gluster.
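
If it turns out to be container specific, a quick look inside the affected container usually narrows it down. A minimal sketch, assuming a systemd-based container named gluster-avm1 (the name is a placeholder) and the brick path from your volume info:

# Is glusterd actually running inside the container after the restart?
podman exec gluster-avm1 systemctl status glusterd

# Brick log (the file name is the brick path with '/' replaced by '-').
podman exec gluster-avm1 tail -n 100 /var/log/glusterfs/bricks/cbricks-brick1-data.log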

Best Regards,
Strahil Nikolov
________



Community Meeting Calendar:

Schedule -
Every Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users



