Re: ceph same rbd on multiple client

gjprabu <gjprabu@xxxxxxxxxxxx> · Fri, 23 Oct 2015 14:07:04 +0530

Hi Henrik,

Thanks for your reply. We are still facing the same issue. The dmesg entries below are expected: we took node1 down ourselves and brought it back up, and that is what the logs show. Other than that, we did not find any error messages. We also have a problem while unmounting: the umount process goes into the "D" state, and fsck fails with "fsck.ocfs2: I/O error". If you need the output of any other command, please let me know.

ocfs2 version
debugfs.ocfs2 1.8.0

# cat /etc/sysconfig/o2cb
#
# This is a configuration file for automatic startup of the O2CB
# driver.  It is generated by running /etc/init.d/o2cb configure.
# On Debian based systems the preferred method is running
# 'dpkg-reconfigure ocfs2-tools'.
#

# O2CB_STACK: The name of the cluster stack backing O2CB.
O2CB_STACK=o2cb

# O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
O2CB_BOOTCLUSTER=ocfs2

# O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
O2CB_HEARTBEAT_THRESHOLD=31

# O2CB_IDLE_TIMEOUT_MS: Time in ms before a network connection is considered dead.
O2CB_IDLE_TIMEOUT_MS=30000

# O2CB_KEEPALIVE_DELAY_MS: Max time in ms before a keepalive packet is sent
O2CB_KEEPALIVE_DELAY_MS=2000

# O2CB_RECONNECT_DELAY_MS: Min time in ms between connection attempts
O2CB_RECONNECT_DELAY_MS=2000
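
For reference, the settings above can be turned into effective timeouts. This is only a minimal sketch: the helper names `parse_sysconfig` and `o2cb_timeouts` are made up for illustration, and the disk heartbeat formula, (threshold - 1) * 2 seconds, assumes the default 2-second heartbeat interval described in the OCFS2 documentation.

```python
# Sketch only: parse O2CB settings and derive the effective timeouts.
# Assumption: the disk heartbeat runs every 2 seconds, so a node is
# declared dead after (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds.


def parse_sysconfig(text):
    """Return KEY=VALUE pairs from a sysconfig-style file, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def o2cb_timeouts(settings, heartbeat_interval_s=2):
    """Hypothetical helper: convert O2CB settings into seconds."""
    threshold = int(settings["O2CB_HEARTBEAT_THRESHOLD"])
    return {
        "disk_heartbeat_dead_s": (threshold - 1) * heartbeat_interval_s,
        "network_idle_s": int(settings["O2CB_IDLE_TIMEOUT_MS"]) / 1000,
        "keepalive_s": int(settings["O2CB_KEEPALIVE_DELAY_MS"]) / 1000,
        "reconnect_s": int(settings["O2CB_RECONNECT_DELAY_MS"]) / 1000,
    }


# Values copied from the /etc/sysconfig/o2cb file shown above.
SAMPLE = """\
# O2CB_STACK: The name of the cluster stack backing O2CB.
O2CB_STACK=o2cb
O2CB_BOOTCLUSTER=ocfs2
O2CB_HEARTBEAT_THRESHOLD=31
O2CB_IDLE_TIMEOUT_MS=30000
O2CB_KEEPALIVE_DELAY_MS=2000
O2CB_RECONNECT_DELAY_MS=2000
"""

if __name__ == "__main__":
    for name, seconds in o2cb_timeouts(parse_sysconfig(SAMPLE)).items():
        print(f"{name}: {seconds}")
```

Under that assumption, a threshold of 31 means a node is considered dead after 60 seconds without a disk heartbeat, and a network connection is considered dead after 30 seconds of silence.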

# fsck.ocfs2 -fy /home/build/downloads/
fsck.ocfs2 1.8.0
fsck.ocfs2: I/O error on channel while opening "/zoho/build/downloads/"

dmesg logs

[ 4229.886284] o2dlm: Joining domain A895BC216BE641A8A7E20AA89D57E051 ( 5 ) 1 nodes
[ 4251.437451] o2dlm: Node 3 joins domain A895BC216BE641A8A7E20AA89D57E051 ( 3 5 ) 2 nodes
[ 4267.836392] o2dlm: Node 1 joins domain A895BC216BE641A8A7E20AA89D57E051 ( 1 3 5 ) 3 nodes
[ 4292.755589] o2dlm: Node 2 joins domain A895BC216BE641A8A7E20AA89D57E051 ( 1 2 3 5 ) 4 nodes
[ 4306.262165] o2dlm: Node 4 joins domain A895BC216BE641A8A7E20AA89D57E051 ( 1 2 3 4 5 ) 5 nodes
[316476.505401] (kworker/u192:0,95923,0):dlm_do_assert_master:1717 ERROR: Error -112 when sending message 502 (key 0xc3460ae7) to node 1
[316476.505470] o2cb: o2dlm has evicted node 1 from domain A895BC216BE641A8A7E20AA89D57E051
[316480.437231] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[316480.442389] o2cb: o2dlm has evicted node 1 from domain A895BC216BE641A8A7E20AA89D57E051
[316480.442412] (kworker/u192:0,95923,20):dlm_begin_reco_handler:2765 A895BC216BE641A8A7E20AA89D57E051: dead_node previously set to 1, node 3 changing it to 1
[316480.541237] o2dlm: Node 3 (he) is the Recovery Master for the dead node 1 in domain A895BC216BE641A8A7E20AA89D57E051
[316480.541241] o2dlm: End recovery on domain A895BC216BE641A8A7E20AA89D57E051
[316485.542733] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[316485.542740] o2dlm: Node 3 (he) is the Recovery Master for the dead node 1 in domain A895BC216BE641A8A7E20AA89D57E051
[316485.542742] o2dlm: End recovery on domain A895BC216BE641A8A7E20AA89D57E051
[316490.544535] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[316490.544538] o2dlm: Node 3 (he) is the Recovery Master for the dead node 1 in domain A895BC216BE641A8A7E20AA89D57E051
[316490.544539] o2dlm: End recovery on domain A895BC216BE641A8A7E20AA89D57E051
[316495.546356] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[316495.546362] o2dlm: Node 3 (he) is the Recovery Master for the dead node 1 in domain A895BC216BE641A8A7E20AA89D57E051
[316495.546364] o2dlm: End recovery on domain A895BC216BE641A8A7E20AA89D57E051
[316500.548135] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[316500.548139] o2dlm: Node 3 (he) is the Recovery Master for the dead node 1 in domain A895BC216BE641A8A7E20AA89D57E051
[316500.548140] o2dlm: End recovery on domain A895BC216BE641A8A7E20AA89D57E051
[316505.549947] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[316505.549951] o2dlm: Node 3 (he) is the Recovery Master for the dead node 1 in domain A895BC216BE641A8A7E20AA89D57E051
[316505.549952] o2dlm: End recovery on domain A895BC216BE641A8A7E20AA89D57E051
[316510.551734] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[316510.551739] o2dlm: Node 3 (he) is the Recovery Master for the dead node 1 in domain A895BC216BE641A8A7E20AA89D57E051
[316510.551740] o2dlm: End recovery on domain A895BC216BE641A8A7E20AA89D57E051
[316515.553543] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[316515.553547] o2dlm: Node 3 (he) is the Recovery Master for the dead node 1 in domain A895BC216BE641A8A7E20AA89D57E051
[316515.553548] o2dlm: End recovery on domain A895BC216BE641A8A7E20AA89D57E051
[316520.555337] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[316520.555341] o2dlm: Node 3 (he) is the Recovery Master for the dead node 1 in domain A895BC216BE641A8A7E20AA89D57E051
[316520.555343] o2dlm: End recovery on domain A895BC216BE641A8A7E20AA89D57E051
[316525.557131] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[316525.557136] o2dlm: Node 3 (he) is the Recovery Master for the dead node 1 in domain A895BC216BE641A8A7E20AA89D57E051
[316525.557153] o2dlm: End recovery on domain A895BC216BE641A8A7E20AA89D57E051
[316530.558952] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[316530.558955] o2dlm: Node 3 (he) is the Recovery Master for the dead node 1 in domain A895BC216BE641A8A7E20AA89D57E051
[316530.558957] o2dlm: End recovery on domain A895BC216BE641A8A7E20AA89D57E051
[316535.560781] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[316535.560789] o2dlm: Node 3 (he) is the Recovery Master for the dead node 1 in domain A895BC216BE641A8A7E20AA89D57E051
[316535.560792] o2dlm: End recovery on domain A895BC216BE641A8A7E20AA89D57E051
[319419.525609] o2dlm: Node 1 joins domain A895BC216BE641A8A7E20AA89D57E051 ( 1 2 3 4 5 ) 5 nodes
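
The log above is repetitive, so a short script can help summarize it. This is a minimal sketch, not an official tool: the function name `summarize` is made up, and the regular expressions only match the message wording seen in this log, which may differ on other kernel versions.

```python
import re

# Sketch only: summarize o2dlm eviction, recovery and rejoin events.
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")
EVICT_RE = re.compile(r"o2dlm has evicted node (\d+)")
RECOVERY_RE = re.compile(r"o2dlm: Begin recovery on domain \S+ for node (\d+)")
JOIN_RE = re.compile(r"o2dlm: Node (\d+) joins domain")


def summarize(dmesg_text):
    """Return {node: {"evicted_at", "recoveries", "rejoined_at"}}."""
    nodes = {}
    for raw in dmesg_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        ts, msg = float(match.group("ts")), match.group("msg")
        evicted = EVICT_RE.search(msg)
        recovery = RECOVERY_RE.search(msg)
        joined = JOIN_RE.search(msg)
        if evicted:
            # Keep the first eviction timestamp; repeated messages are ignored.
            nodes.setdefault(
                int(evicted.group(1)),
                {"evicted_at": ts, "recoveries": 0, "rejoined_at": None},
            )
        elif recovery and int(recovery.group(1)) in nodes:
            nodes[int(recovery.group(1))]["recoveries"] += 1
        elif joined:
            node = int(joined.group(1))
            # Only joins that follow an eviction count as a rejoin.
            if node in nodes and nodes[node]["rejoined_at"] is None:
                nodes[node]["rejoined_at"] = ts
    return nodes


# Shortened excerpt of the dmesg output shown above.
SAMPLE = """\
[ 4267.836392] o2dlm: Node 1 joins domain A895BC216BE641A8A7E20AA89D57E051 ( 1 3 5 ) 3 nodes
[316476.505470] o2cb: o2dlm has evicted node 1 from domain A895BC216BE641A8A7E20AA89D57E051
[316480.437231] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[316480.442389] o2cb: o2dlm has evicted node 1 from domain A895BC216BE641A8A7E20AA89D57E051
[316485.542733] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
[319419.525609] o2dlm: Node 1 joins domain A895BC216BE641A8A7E20AA89D57E051 ( 1 2 3 4 5 ) 5 nodes
"""

if __name__ == "__main__":
    for node, info in summarize(SAMPLE).items():
        downtime = info["rejoined_at"] - info["evicted_at"]
        print(f"node {node}: {info['recoveries']} recoveries, {downtime:.0f}s away")
```

Run against the full log above, the pasted lines contain 12 "Begin recovery" entries for node 1, and about 2943 seconds (roughly 49 minutes) pass between the first eviction and the rejoin.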

ps -auxxxxx | grep umount
root     32083 21.8  0.0 125620  2828 pts/14   D+   19:37   0:18 umount /home/build/repository
root     32196  0.0  0.0 112652  2264 pts/8    S+   19:38   0:00 grep --color=auto umount

cat /proc/32083/stack 
[<ffffffff8132ad7d>] o2net_send_message_vec+0x71d/0xb00
[<ffffffff81352148>] dlm_send_remote_unlock_request.isra.2+0x128/0x410
[<ffffffff813527db>] dlmunlock_common+0x3ab/0x9e0
[<ffffffff81353088>] dlmunlock+0x278/0x800
[<ffffffff8131f765>] o2cb_dlm_unlock+0x35/0x50
[<ffffffff8131ecfe>] ocfs2_dlm_unlock+0x1e/0x30
[<ffffffff812a8776>] ocfs2_drop_lock.isra.29.part.30+0x1f6/0x700
[<ffffffff812ae40d>] ocfs2_simple_drop_lockres+0x2d/0x40
[<ffffffff8129b43c>] ocfs2_dentry_lock_put+0x5c/0x80
[<ffffffff8129b4a2>] ocfs2_dentry_iput+0x42/0x1d0
[<ffffffff81204dc2>] __dentry_kill+0x102/0x1f0
[<ffffffff81205294>] shrink_dentry_list+0xe4/0x2a0
[<ffffffff81205aa8>] shrink_dcache_parent+0x38/0x90
[<ffffffff81205b16>] do_one_tree+0x16/0x50
[<ffffffff81206e9f>] shrink_dcache_for_umount+0x2f/0x90
[<ffffffff811efb15>] generic_shutdown_super+0x25/0x100
[<ffffffff811eff57>] kill_block_super+0x27/0x70
[<ffffffff811f02a9>] deactivate_locked_super+0x49/0x60
[<ffffffff811f089e>] deactivate_super+0x4e/0x70
[<ffffffff8120da83>] cleanup_mnt+0x43/0x90
[<ffffffff8120db22>] __cleanup_mnt+0x12/0x20
[<ffffffff81093ba4>] task_work_run+0xc4/0xe0
[<ffffffff81013c67>] do_notify_resume+0x97/0xb0
[<ffffffff817d2ee7>] int_signal+0x12/0x17
[<ffffffffffffffff>] 0xffffffffffffffff
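
The two checks above (finding the process stuck in the "D" state and reading its kernel stack) can be scripted. This is a minimal sketch assuming the usual `ps aux` column layout; the helper names `uninterruptible` and `read_kernel_stack` are made up for illustration, and reading /proc/<pid>/stack normally requires root.

```python
# Sketch only: find processes in uninterruptible sleep from `ps aux`
# output and, where permitted, read their kernel stack from /proc.


def uninterruptible(ps_aux_text):
    """Return (pid, command) for every process whose STAT starts with 'D'."""
    stuck = []
    for line in ps_aux_text.splitlines():
        # Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header or malformed line
        if fields[7].startswith("D"):
            stuck.append((int(fields[1]), fields[10]))
    return stuck


def read_kernel_stack(pid):
    """Return the text of /proc/<pid>/stack, or None if it cannot be read."""
    try:
        with open(f"/proc/{pid}/stack") as handle:
            return handle.read()
    except OSError:
        return None


# The two lines of ps output shown above.
SAMPLE = """\
root     32083 21.8  0.0 125620  2828 pts/14   D+   19:37   0:18 umount /home/build/repository
root     32196  0.0  0.0 112652  2264 pts/8    S+   19:38   0:00 grep --color=auto umount
"""

if __name__ == "__main__":
    for pid, command in uninterruptible(SAMPLE):
        print(pid, command)
        print(read_kernel_stack(pid) or "(stack not readable)")
```

In the stack above, the umount is blocked in o2net_send_message_vec while sending a DLM unlock request to another node.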

Regards
G.J

 ---- On Fri, 23 Oct 2015 13:41:19 +0530 Henrik Korkuc <lists@xxxxxxxxx> wrote ----

Can you paste dmesg and system logs? I am using a 3-node OCFS2 setup with RBD and have had no problems.

On 15-10-23 08:40, gjprabu wrote:

     _______________________________________________ 
ceph-users mailing list 
ceph-users@xxxxxxxxxxxxxx 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
Hi Frederic,

Can you suggest a solution? We are spending a lot of time trying to solve this issue.

Regards
Prabu

---- On Thu, 15 Oct 2015 17:14:13 +0530 Tyler Bishop <tyler.bishop@xxxxxxxxxxxxxxxxx> wrote ----

I don't know enough about OCFS to help. It sounds like you have uncoordinated concurrent writes, though.

On Oct 15, 2015, at 1:53 AM, gjprabu <gjprabu@xxxxxxxxxxxx> wrote:
Hi Tyler,

Can you please tell me the next step to take on this issue?

Regards
Prabu

---- On Wed, 14 Oct 2015 13:43:29 +0530 gjprabu <gjprabu@xxxxxxxxxxxx> wrote ----

Hi Tyler,

Thanks for your reply. We have disabled rbd_cache, but the issue still persists. Please find our configuration file below.

# cat /etc/ceph/ceph.conf
[global]
fsid = 944fa0af-b7be-45a9-93ff-b9907cfaee3f
mon_initial_members = integ-hm5, integ-hm6, integ-hm7
mon_host = 192.168.112.192,192.168.112.193,192.168.112.194
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_pool_default_size = 2

[mon]
mon_clock_drift_allowed = .500

[client]
rbd_cache = false
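
To double-check that the setting is actually picked up from the file, the configuration above can be parsed programmatically. This is a minimal sketch: the function name `rbd_cache_enabled` is made up, Python's `configparser` is only an approximation of Ceph's own parser, and the fallback `default=True` is my assumption about the librbd default.

```python
import configparser

# Sketch only: report whether rbd_cache is enabled in a ceph.conf text.
# Ceph accepts "rbd cache" and "rbd_cache" as the same option, so keys
# are normalized before comparison.


def rbd_cache_enabled(conf_text, default=True):
    """Return the rbd_cache value from [client], then [global], else default."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    for section in ("client", "global"):
        if not parser.has_section(section):
            continue
        for key, _value in parser.items(section):
            if key.replace(" ", "_") == "rbd_cache":
                return parser.getboolean(section, key)
    return default


# Relevant part of the ceph.conf shown above.
SAMPLE = """\
[global]
fsid = 944fa0af-b7be-45a9-93ff-b9907cfaee3f
mon_initial_members = integ-hm5, integ-hm6, integ-hm7
osd_pool_default_size = 2

[mon]
mon_clock_drift_allowed = .500

[client]
rbd_cache = false
"""

if __name__ == "__main__":
    print("rbd_cache enabled:", rbd_cache_enabled(SAMPLE))
```

As far as I know, rbd_cache is a librbd (user space) option; images mapped through the kernel client as /dev/rbdX go through the Linux page cache instead, so this setting may not change their behavior.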

--------------------------------------------------------------------------------------

    cluster 944fa0af-b7be-45a9-93ff-b9907cfaee3f
     health HEALTH_OK
     monmap e2: 3 mons at {integ-hm5=192.168.112.192:6789/0,integ-hm6=192.168.112.193:6789/0,integ-hm7=192.168.112.194:6789/0}
            election epoch 480, quorum 0,1,2 integ-hm5,integ-hm6,integ-hm7
     osdmap e49780: 2 osds: 2 up, 2 in
      pgmap v2256565: 190 pgs, 2 pools, 1364 GB data, 410 kobjects
            2559 GB used, 21106 GB / 24921 GB avail
                 190 active+clean
  client io 373 kB/s rd, 13910 B/s wr, 103 op/s

Regards
Prabu

---- On Tue, 13 Oct 2015 19:59:38 +0530 Tyler Bishop <tyler.bishop@xxxxxxxxxxxxxxxxx> wrote ----

You need to disable RBD caching.

Tyler Bishop
Chief Technical Officer
513-299-7108 x10
Tyler.Bishop@xxxxxxxxxxxxxxxxx

From: "gjprabu" <gjprabu@xxxxxxxxxxxx>
To: "Frédéric Nass" <frederic.nass@xxxxxxxxxxxxxxxx>
Cc: "<ceph-users@xxxxxxxxxxxxxx>" <ceph-users@xxxxxxxxxxxxxx>, "Siva Sokkumuthu" <sivakumar@xxxxxxxxxxxx>, "Kamal Kannan Subramani(kamalakannan)" <kamal@xxxxxxxxxxxxxxxx>
Sent: Tuesday, October 13, 2015 9:11:30 AM
Subject: Re: ceph same rbd on multiple client

Hi,

We have servers with a Ceph RBD image mounted through OCFS2. We are facing I/O errors: when we move a folder on one node, the other nodes sharing the same disk show the error below for the moved data (copying does not cause any problem). As a workaround, remounting the partition resolves the issue, but after some time the problem recurs. Please help with this issue.

Note: We have 5 nodes in total. Two nodes are working fine; the other nodes show the input/output error below on the moved data.

ls -althr
ls: cannot access LITE_3_0_M4_1_TEST: Input/output error
ls: cannot access LITE_3_0_M4_1_OLD: Input/output error
total 0
d????????? ? ? ? ? ? LITE_3_0_M4_1_TEST
d????????? ? ? ? ? ? LITE_3_0_M4_1_OLD
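
The listing above shows names that can be listed but not stat'ed. To find such entries on each node without reading `ls` output by eye, something like the following could be run per directory. It is a minimal sketch; the function name `unreadable_entries` is made up for illustration.

```python
import errno
import os

# Sketch only: list directory entries whose metadata cannot be read
# because lstat() fails with EIO ("Input/output error"), which is what
# `ls -l` reports as "d????????? ? ? ? ? ?".


def unreadable_entries(directory):
    """Return the sorted names in `directory` for which lstat() raises EIO."""
    broken = []
    for name in sorted(os.listdir(directory)):
        try:
            os.lstat(os.path.join(directory, name))
        except OSError as exc:
            if exc.errno != errno.EIO:
                raise  # a different failure; do not hide it
            broken.append(name)
    return broken


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in unreadable_entries(target):
        print(f"{name}: Input/output error")
```

Running it on every node against the same directory would show which nodes disagree about the moved folders.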

Regards
Prabu

---- On Fri, 22 May 2015 17:33:04 +0530 Frédéric Nass <frederic.nass@xxxxxxxxxxxxxxxx> wrote ----

Hi,

While waiting for CephFS, you can use a clustered filesystem such as OCFS2 or GFS2 on top of RBD mappings, so that each host can access the same device through the same clustered filesystem.

Regards,

Frédéric.

On 21/05/2015 16:10, gjprabu wrote:

-- 
Frédéric Nass

Sous direction des Infrastructures,
Direction du Numérique,
Université de Lorraine.

Tél : 03.83.68.53.83

Hi All,

We are using RBD and have mapped the same RBD image to an rbd device on two different clients, but I can't see data written by the other client until I umount the partition and run mount -a. Kindly share a solution for this issue.

Example
create rbd image named foo
map foo to /dev/rbd0 on server A, mount /dev/rbd0 to /mnt
map foo to /dev/rbd0 on server B, mount /dev/rbd0 to /mnt

Regards
Prabu
