Thanks Bob, answers inline...

On 10/13/14, 12:16 PM, "Bob Peterson" <rpeterso@xxxxxxxxxx> wrote:
>----- Original Message -----
>> I would appreciate any debugging suggestions. I've straced
>> dlm_controld/corosync but not gained much clarity.
>>
>> Neale
>
>Hi Neale,
>
>1. What does it say if you try to mount the GFS2 file system manually
>   rather than from the configured service?

Permission denied (I also used "pcs resource debug-start" and that's the
message it gets as well). I disabled the resource and then tried mounting
it by hand (exact command at the bottom of this mail); that succeeded once
but not a second time. As I mentioned, on rare occasions both sides do
mount on cluster start-up, which is worse than it never mounting!

>2. After the failure, what does dmesg on all the nodes show?

Node 1 -

[256184.632116] dlm: vol1: dlm_recover 15
[256184.633300] dlm: vol1: add member 2
[256184.636944] dlm: vol1: dlm_recover_members 2 nodes
[256184.664495] dlm: vol1: generation 8 slots 2 1:1 2:2
[256184.664531] dlm: vol1: dlm_recover_directory
[256184.668865] dlm: vol1: dlm_recover_directory 0 in 0 new
[256184.703328] dlm: vol1: dlm_recover_directory 10 out 1 messages
[256184.784404] dlm: vol1: dlm_recover 15 generation 8 done: 120 ms
[256184.785050] GFS2: fsid=rh7cluster:vol1.0: recover generation 8 done
[256185.375091] dlm: vol1: dlm_recover 17
[256185.375655] dlm: vol1: dlm_clear_toss 1 done
[256185.376263] dlm: vol1: remove member 2
[256185.376339] dlm: vol1: dlm_recover_members 1 nodes
[256185.376403] dlm: vol1: generation 9 slots 1 1:1
[256185.376430] dlm: vol1: dlm_recover_directory
[256185.376458] dlm: vol1: dlm_recover_directory 0 in 0 new
[256185.376490] dlm: vol1: dlm_recover_directory 0 out 0 messages
[256185.376638] dlm: vol1: dlm_recover_purge 6 locks for 1 nodes
[256185.376664] dlm: vol1: dlm_recover_masters
[256185.376714] dlm: vol1: dlm_recover_masters 0 of 26
[256185.376746] dlm: vol1: dlm_recover_locks 0 out
[256185.376778] dlm: vol1: dlm_recover_locks 0 in
[256185.376831] dlm: vol1: dlm_recover_rsbs 26 done
[256185.377444] dlm: vol1: dlm_recover 17 generation 9 done: 0 ms
[256185.377833] GFS2: fsid=rh7cluster:vol1.0: recover generation 9 done

Node 2 (failing) -

[256206.973005] GFS2: fsid=rh7cluster:vol1: Trying to join cluster "lock_dlm", "rh7cluster:vol1"
[256206.973105] GFS2: fsid=rh7cluster:vol1: In gdlm_mount
[256207.019743] dlm: vol1: joining the lockspace group...
[256207.169061] dlm: vol1: group event done 0 0
[256207.169135] dlm: vol1: dlm_recover 1
[256207.170735] dlm: vol1: add member 2
[256207.170822] dlm: vol1: add member 1
[256207.174493] dlm: vol1: dlm_recover_members 2 nodes
[256207.174798] dlm: vol1: join complete
[256207.205167] dlm: vol1: dlm_recover_directory
[256207.208924] dlm: vol1: dlm_recover_directory 10 in 10 new
[256207.245335] dlm: vol1: dlm_recover_directory 0 out 1 messages
[256207.329101] dlm: vol1: dlm_recover 1 generation 8 done: 120 ms
[256207.851390] GFS2: fsid=rh7cluster:vol1: Joined cluster. Now mounting FS...
[256207.881216] dlm: vol1: leaving the lockspace group...
[256207.947479] dlm: vol1: group event done 0 0
[256207.949530] dlm: vol1: release_lockspace final free

>3. What kernel is this?
>
>I would:
>(1) Check to make sure the file system has enough journals for all nodes.
>    You can do gfs2_edit -p journals <device>. If your version of
>    gfs2-utils doesn't have that option, you can alternately do:
>    gfs2_edit -p jindex <device> and see how many journals are in the
>    index.
3/3 [fc7745eb] 4/21 (0x4/0x15): File    journal0
4/4 [8b70757d] 5/4127 (0x5/0x101f): File    journal1

It was made via:

mkfs.gfs2 -j 2 -J 16 -r 32 -t rh7cluster:vol1 /dev/mapper/vg_cluster-ha_lv

>(2) Check to make sure the locking protocol is lock_dlm in the file system
>    superblock. You can get that from gfs2_edit -p sb <device>

sb_lockproto    lock_dlm

>(3) Check to make sure the cluster name in the file system superblock
>    matches the configured cluster name. That's also in the superblock

sb_locktable    rh7cluster:vol1

Strangely, while /etc/corosync/corosync.conf has the cluster name
specified, pcs status reports it as blank:

# pcs status
Cluster name:
Last updated: Mon Oct 13 12:40:47 2014
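In case it helps, a quick cross-check of the name corosync itself is
running with (to compare against the rh7cluster half of sb_locktable)
would be something along these lines; I believe corosync.conf's
cluster_name ends up under the totem.cluster_name cmap key:

# corosync-cmapctl | grep cluster_name

If that comes back empty or with a different value, it might explain the
blank name in pcs status.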
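For reference, the by-hand mount mentioned above is essentially the
following (/mnt/vol1 is just my test mount point; the device is the same
LV the file system was made on):

# mount -t gfs2 /dev/mapper/vg_cluster-ha_lv /mnt/vol1

That is the command that returns "permission denied".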
Attachment: default.xml