Re: GFS2 filesystem consistency error

Hi,

I'm having a problem with a two-node cluster: the secondary node shows as offline after a reboot, and CMAN is not starting. Below are the logs from the offline node:

[root@EI51SPM1 cluster]# clustat
msg_open: Invalid argument
Member Status: Inquorate

Resource Group Manager not running; no service information available.

Membership information not available
[root@EI51SPM1 cluster]# tail -10 /var/log/messages
Feb 24 13:36:23 EI51SPM1 ccsd[25487]: Error while processing connect: Connection refused
Feb 24 13:36:23 EI51SPM1 kernel: CMAN: sending membership request
Feb 24 13:36:27 EI51SPM1 ccsd[25487]: Cluster is not quorate.  Refusing connection.
Feb 24 13:36:27 EI51SPM1 ccsd[25487]: Error while processing connect: Connection refused
Feb 24 13:36:28 EI51SPM1 kernel: CMAN: sending membership request
Feb 24 13:36:32 EI51SPM1 ccsd[25487]: Cluster is not quorate.  Refusing connection.
Feb 24 13:36:32 EI51SPM1 ccsd[25487]: Error while processing connect: Connection refused
Feb 24 13:36:32 EI51SPM1 ccsd[25487]: Cluster is not quorate.  Refusing connection.
Feb 24 13:36:32 EI51SPM1 ccsd[25487]: Error while processing connect: Connection refused
Feb 24 13:36:33 EI51SPM1 kernel: CMAN: sending membership request
[root@EI51SPM1 cluster]#
[root@EI51SPM1 cluster]# cman_tool status
Protocol version: 5.0.1
Config version: 166
Cluster name: IVRS_DB
Cluster ID: 9982
Cluster Member: No
Membership state: Joining
[root@EI51SPM1 cluster]# cman_tool nodes
Node  Votes Exp Sts  Name
[root@EI51SPM1 cluster]#
[root@EI51SPM1 cluster]#



Thanks & Regards,
Shreekanta Jena


On Tue, Feb 23, 2016 at 11:30 PM, Bob Peterson <rpeterso@xxxxxxxxxx> wrote:
----- Original Message -----
> Bob Peterson <rpeterso@xxxxxxxxxx> writes:
>
>
> [...]
>
> > Hi Daniel,
> >
> > I'm downloading the metadata now. I'll let you know what I find.
> > It may take a while because my storage is a bit in flux at the moment.
>
> Ok, thanks a lot for looking at our problems.
>
> Regards.
> --
> Daniel Dehennin
> Retrieve my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
> Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF

Hi Daniel,

I took a look at the metadata you sent me, but I didn't find any evidence
relating to the problem you posted. Either the corruption happened long
before you saved the metadata, or the metadata was saved after fsck.gfs2
had fixed (or attempted to fix) the problem.

One thing's for sure: I don't see any evidence of wild file system corruption;
certainly nothing that can account for those errors.

You said the problem seemed to revolve around a gfs2_grow operation, right?
Can you make sure the lvm2 volume group has the clustered bit set?
Please run the "vgs" command and check whether that volume group has "c"
in its attribute flags. If not, that could have caused problems for gfs2_grow.

I've seen problems like this very rarely. Once was a legitimate bug in
GFS2 that we fixed in RHEL5, but I assume your kernel is newer than that.
The other problem we weren't able to solve because there was no evidence
of what went wrong.

My only working theory is this:

This might be related to the transition between "unlinked" dinodes and
"free". After a file is deleted, it goes to "unlinked" and has to be
transitioned to "free". This sometimes goes wrong because of the way
it needs to check what other nodes in the cluster are doing.

For example: if you have three nodes and a file was unlinked on node 1, the
internode communication may have gotten confused so that nodes 2 and 3 both
tried to transition it from Unlinked to Free. That is only a theory, and
there is no proof. However, I have a set of experimental patches, not even
in the upstream kernel yet (hopefully soon!), that try to tighten up and fix
problems like this. The more common failure mode is that multiple nodes try
the Unlinked-to-Free transition, all of them fail, and the file is left
stuck in the "Unlinked" state.

Regards,

Bob Peterson
Red Hat File Systems

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster

