Hi Greg,

thank you for your suggestions.

Just let me clarify one small thing:

It starts working as soon as I load the kernel rbd/ceph module.

I do not need to establish any connection to the Ceph cluster through
those modules. Just loading the kernel modules with modprobe (before
mounting with ceph-fuse) is enough to make it work.

A native CephFS mount works out of the box (which makes sense, since I
need the ceph/rbd kernel module to mount it).

That's why I don't think it's any kind of network/firewall/similar issue.

----

Anyway, I will try to find a way to get more information out of the
cluster.

Thank you so far for your time!

--
Best regards

Oliver Dzombic
IP-Interactive

mailto:info@xxxxxxxxxxxxxxxxx

Address:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402, Amtsgericht Hanau (district court)
Managing director: Oliver Dzombic

Tax no.: 35 236 3622 1
VAT ID: DE274086107


On 26.01.2016 at 01:10, Gregory Farnum wrote:
> On Mon, Jan 25, 2016 at 3:58 PM, Oliver Dzombic <info@xxxxxxxxxxxxxxxxx> wrote:
>> Hi,
>>
>> i switched now debugging to ms = 10
>>
>> when starting the dd i can see in the logs of osd:
>>
>> 2016-01-26 00:47:16.530046 7f086f404700 1 -- 10.0.0.1:6806/49658 >> :/0
>> pipe(0x1f830000 sd=292 :6806 s=0 pgs=0 cs=0 l=0 c=0x1dc2e9e0).accept
>> sd=292 10.0.0.91:56814/0
>> 2016-01-26 00:47:16.530591 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=0 pgs=0 cs=0 l=0
>> c=0x1dc2e9e0).accept peer addr is 10.0.0.91:0/1532
>> 2016-01-26 00:47:16.530709 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=0 pgs=0 cs=0 l=1
>> c=0x1dc2e9e0).accept of host_type 8, policy.lossy=1 policy.server=1
>> policy.standby=0 policy.resetcheck=0
>> 2016-01-26 00:47:16.530724 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=0 pgs=0 cs=0 l=1
>> c=0x1dc2e9e0).accept my proto 24, their
>> proto 24
>> 2016-01-26 00:47:16.530864 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=0 pgs=0 cs=0 l=1
>> c=0x1dc2e9e0).accept: setting up session_security.
>> 2016-01-26 00:47:16.530877 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=0 pgs=0 cs=0 l=1
>> c=0x1dc2e9e0).accept new session
>> 2016-01-26 00:47:16.530884 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=2 pgs=11 cs=1 l=1
>> c=0x1dc2e9e0).accept success, connect_seq = 1, sending READY
>> 2016-01-26 00:47:16.530889 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=2 pgs=11 cs=1 l=1
>> c=0x1dc2e9e0).accept features 37154696925806591
>> 2016-01-26 00:47:16.530912 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=2 pgs=11 cs=1 l=1
>> c=0x1dc2e9e0).register_pipe
>> 2016-01-26 00:47:16.530935 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=2 pgs=11 cs=1 l=1
>> c=0x1dc2e9e0).discard_requeued_up_to 0
>> 2016-01-26 00:47:16.530964 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=2 pgs=11 cs=1 l=1
>> c=0x1dc2e9e0).accept starting writer, state open
>> 2016-01-26 00:47:16.531245 7f086f303700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=2 pgs=11 cs=1 l=1
>> c=0x1dc2e9e0).writer: state = open policy.server=1
>
> Well, this is a normal session setup. If this is repeating or
> something that might be interesting, but with this snippet I can't
> tell. Similarly, the every-15-minutes socket replacement you've got
> below is not interesting; that's a normal timeout/reset on idle
> connections.
>
> You can look through the list archives for help on identifying why OSD
> operations aren't working.
> Apparently you've got enough security cap
> permissions, so it could be a network issue blocking messages of a
> certain size. It could be some odd configuration issue I've not run
> into. It could be some odd firewall behavior, especially if you're
> really seeing that ceph-fuse works properly once you've already
> established a mount on kernel rbd or something. There's lots of
> things that could have gone wrong, but it's not ceph-fuse itself and
> given what you're showing me here, most of them aren't in any part of
> Ceph. *shrug*
> -Greg
>
>
>> And thats it, nothing more moves. And the dd is stuck.
>>
>> [root@cn201 ~]# ceph daemon /var/run/ceph/ceph-client.admin.asok
>> objecter_requests
>> {
>>     "ops": [
>>         {
>>             "tid": 13,
>>             "pg": "6.ec7cd164",
>>             "osd": 1,
>>             "object_id": "10000000bdf.00000008",
>>             "object_locator": "@6",
>>             "target_object_id": "10000000bdf.00000008",
>>             "target_object_locator": "@6",
>>             "paused": 0,
>>             "used_replica": 0,
>>             "precalc_pgid": 0,
>>             "last_sent": "2016-01-26 00:47:28.783581",
>>             "attempts": 1,
>>             "snapid": "head",
>>             "snap_context": "1=[]",
>>             "mtime": "2016-01-26 00:47:28.769145",
>>             "osd_ops": [
>>                 "write 0~4194304"
>>             ]
>>         },
>>         {
>>             "tid": 23,
>>             "pg": "6.1d0c5182",
>>             "osd": 1,
>>             "object_id": "10000000bdf.00000012",
>>             "object_locator": "@6",
>>             "target_object_id": "10000000bdf.00000012",
>>             "target_object_locator": "@6",
>>             "paused": 0,
>>             "used_replica": 0,
>>             "precalc_pgid": 0,
>>             "last_sent": "2016-01-26 00:47:28.917678",
>>             "attempts": 1,
>>             "snapid": "head",
>>             "snap_context": "1=[]",
>>             "mtime": "2016-01-26 00:47:28.904582",
>>             "osd_ops": [
>>                 "write 0~4194304"
>>             ]
>>         },
>>         {
>>             "tid": 25,
>>             "pg": "6.97271370",
>>             "osd": 2,
>>             "object_id": "10000000bdf.00000014",
>>             "object_locator": "@6",
>>             "target_object_id": "10000000bdf.00000014",
>>             "target_object_locator": "@6",
>>             "paused": 0,
>>             "used_replica": 0,
>>             "precalc_pgid": 0,
>>             "last_sent": "2016-01-26 00:47:28.943763",
>>             "attempts": 1,
>>             "snapid": "head",
>>             "snap_context": "1=[]",
>>             "mtime": "2016-01-26 00:47:28.929857",
>>             "osd_ops": [
>>                 "write 0~4194304"
>>             ]
>>         },
>>
>> [....]
>>
>>
>> On another osd's i can see:
>>
>> 2016-01-26 00:11:36.224232 7f9c18596700 0 -- 10.0.0.1:6804/51935 >>
>> 10.0.0.91:0/1536 pipe(0x1f955800 sd=276 :6804 s=0 pgs=0 cs=0 l=1
>> c=0x20379080).accept replacing existing (lossy) channel (new one lossy=1)
>> 2016-01-26 00:26:36.312611 7f9c1a2b3700 0 -- 10.0.0.1:6804/51935 >>
>> 10.0.0.91:0/1536 pipe(0x1f792000 sd=246 :6804 s=0 pgs=0 cs=0 l=1
>> c=0x155d02c0).accept replacing existing (lossy) channel (new one lossy=1)
>> 2016-01-26 00:41:36.403063 7f9c1795c700 0 -- 10.0.0.1:6804/51935 >>
>> 10.0.0.91:0/1536 pipe(0x20efa000 sd=254 :6804 s=0 pgs=0 cs=0 l=1
>> c=0x21bc1760).accept replacing existing (lossy) channel (new one lossy=1)
>>
>>
>> So for me, this is some kind of bug.
>>
>> Without loaded rbd/ceph kernel module ceph-fuse will not work.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
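
A note on reading the objecter_requests output quoted in this thread: each
entry under "ops" is a client request that is still in flight, so grouping
the entries by "osd" shows which OSDs the stuck writes are waiting on. The
following is only a minimal Python sketch of that grouping, not a Ceph tool.
The function name summarize_pending_ops and the SAMPLE string are made up
for illustration; SAMPLE is a trimmed, hand-completed version of the three
ops shown in the thread (the listing there is cut off at "[....]", so it is
not valid JSON as quoted).

```python
import json
from collections import defaultdict


def summarize_pending_ops(objecter_json: str) -> dict:
    """Group in-flight client ops from objecter_requests JSON by target OSD."""
    data = json.loads(objecter_json)
    by_osd = defaultdict(list)
    for op in data.get("ops", []):
        by_osd[op["osd"]].append(
            {
                "tid": op["tid"],
                "object_id": op["object_id"],
                "last_sent": op["last_sent"],
                "attempts": op["attempts"],
                "osd_ops": op["osd_ops"],
            }
        )
    return dict(by_osd)


# Hypothetical sample: a subset of the fields from the three ops in the thread.
SAMPLE = """
{
    "ops": [
        {"tid": 13, "osd": 1, "object_id": "10000000bdf.00000008",
         "last_sent": "2016-01-26 00:47:28.783581", "attempts": 1,
         "osd_ops": ["write 0~4194304"]},
        {"tid": 23, "osd": 1, "object_id": "10000000bdf.00000012",
         "last_sent": "2016-01-26 00:47:28.917678", "attempts": 1,
         "osd_ops": ["write 0~4194304"]},
        {"tid": 25, "osd": 2, "object_id": "10000000bdf.00000014",
         "last_sent": "2016-01-26 00:47:28.943763", "attempts": 1,
         "osd_ops": ["write 0~4194304"]}
    ]
}
"""

if __name__ == "__main__":
    # Print one summary line per OSD, then one line per pending op.
    for osd, ops in sorted(summarize_pending_ops(SAMPLE).items()):
        print(f"osd.{osd}: {len(ops)} pending op(s)")
        for op in ops:
            print(f"  tid={op['tid']} object={op['object_id']} "
                  f"last_sent={op['last_sent']} ops={op['osd_ops']}")
```

In practice the complete output of the "ceph daemon ... objecter_requests"
command from the thread would be passed in instead of SAMPLE. Ops that keep
the same tid and last_sent across repeated runs are the ones that never got
a reply.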