Re: ceph fuse closing stale session while still operable

Hi Greg,

Thank you for your suggestions.

Just let me clarify one small thing:

It starts working as soon as I load the kernel rbd/ceph modules.

I do not need to establish any connection to the ceph cluster with
those modules.

Just loading the kernel modules with modprobe (before mounting with
ceph-fuse) is enough to make it work.
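
For reference, the sequence that makes it work looks roughly like this
(the mount point is just an example):

  modprobe ceph
  modprobe rbd
  ceph-fuse /mnt/cephfs   # without the modprobe steps, writes through this mount hang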

A native cephfs mount works out of the box (which is expected, since I
need the ceph/rbd kernel modules to mount it).

And that's why I don't think it's a network/firewall or similar issue.
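
To rule out a fragmentation/MTU problem with the large 4 MB writes, I
can still check with oversized pings between the client and the OSD
hosts, roughly like this (addresses taken from the logs below; the
payload sizes assume an MTU of 1500 and 9000 bytes respectively):

  ping -M do -s 1472 -c 3 10.0.0.1   # run from the client 10.0.0.91
  ping -M do -s 8972 -c 3 10.0.0.1   # only if jumbo frames are configured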

----

Anyway, I will try to find a way to get more information out of the
cluster.
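
If it helps, I can also raise the debug levels on the OSDs at runtime;
something like this should work (the exact levels are just a guess):

  ceph tell osd.* injectargs '--debug-ms 10 --debug-osd 20'

and rerun the client with more logging, e.g.:

  ceph-fuse --debug-ms=1 --debug-client=20 /mnt/cephfs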

Thank you for your time so far!

-- 
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
IP-Interactive

mailto:info@xxxxxxxxxxxxxxxxx

Address:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402 at the district court of Hanau
Managing director: Oliver Dzombic

Tax no.: 35 236 3622 1
VAT ID: DE274086107


On 26.01.2016 at 01:10, Gregory Farnum wrote:
> On Mon, Jan 25, 2016 at 3:58 PM, Oliver Dzombic <info@xxxxxxxxxxxxxxxxx> wrote:
>> Hi,
>>
>> I have now switched debugging to ms = 10.
>>
>> When starting the dd, I can see in the OSD logs:
>>
>> 2016-01-26 00:47:16.530046 7f086f404700  1 -- 10.0.0.1:6806/49658 >> :/0
>> pipe(0x1f830000 sd=292 :6806 s=0 pgs=0 cs=0 l=0 c=0x1dc2e9e0).accept
>> sd=292 10.0.0.91:56814/0
>> 2016-01-26 00:47:16.530591 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=0 pgs=0 cs=0 l=0
>> c=0x1dc2e9e0).accept peer addr is 10.0.0.91:0/1532
>> 2016-01-26 00:47:16.530709 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=0 pgs=0 cs=0 l=1
>> c=0x1dc2e9e0).accept of host_type 8, policy.lossy=1 policy.server=1
>> policy.standby=0 policy.resetcheck=0
>> 2016-01-26 00:47:16.530724 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=0 pgs=0 cs=0 l=1
>> c=0x1dc2e9e0).accept my proto 24, their proto 24
>> 2016-01-26 00:47:16.530864 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=0 pgs=0 cs=0 l=1
>> c=0x1dc2e9e0).accept:  setting up session_security.
>> 2016-01-26 00:47:16.530877 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=0 pgs=0 cs=0 l=1
>> c=0x1dc2e9e0).accept new session
>> 2016-01-26 00:47:16.530884 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=2 pgs=11 cs=1 l=1
>> c=0x1dc2e9e0).accept success, connect_seq = 1, sending READY
>> 2016-01-26 00:47:16.530889 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=2 pgs=11 cs=1 l=1
>> c=0x1dc2e9e0).accept features 37154696925806591
>> 2016-01-26 00:47:16.530912 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=2 pgs=11 cs=1 l=1
>> c=0x1dc2e9e0).register_pipe
>> 2016-01-26 00:47:16.530935 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=2 pgs=11 cs=1 l=1
>> c=0x1dc2e9e0).discard_requeued_up_to 0
>> 2016-01-26 00:47:16.530964 7f086f404700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=2 pgs=11 cs=1 l=1
>> c=0x1dc2e9e0).accept starting writer, state open
>> 2016-01-26 00:47:16.531245 7f086f303700 10 -- 10.0.0.1:6806/49658 >>
>> 10.0.0.91:0/1532 pipe(0x1f830000 sd=292 :6806 s=2 pgs=11 cs=1 l=1
>> c=0x1dc2e9e0).writer: state = open policy.server=1
> 
> Well, this is a normal session setup. If it is repeating, that might
> be interesting, but I can't tell from this snippet. Similarly, the
> every-15-minutes socket replacement you've got below is not
> interesting; that's a normal timeout/reset on idle connections.
> 
> You can look through the list archives for help on identifying why
> OSD operations aren't working. Apparently you've got enough security
> cap permissions, so it could be a network issue blocking messages of
> a certain size. It could be some odd configuration issue I've not run
> into. It could be some odd firewall behavior, especially if you're
> really seeing that ceph-fuse works properly once you've already
> established a mount via kernel rbd or something. There are lots of
> things that could have gone wrong, but it's not ceph-fuse itself, and
> given what you're showing me here, most of them aren't in any part of
> Ceph. *shrug*
> -Greg
> 
> 
>> And that's it, nothing more happens, and the dd is stuck.
>>
>> [root@cn201 ~]# ceph daemon /var/run/ceph/ceph-client.admin.asok
>> objecter_requests
>> {
>>     "ops": [
>>         {
>>             "tid": 13,
>>             "pg": "6.ec7cd164",
>>             "osd": 1,
>>             "object_id": "10000000bdf.00000008",
>>             "object_locator": "@6",
>>             "target_object_id": "10000000bdf.00000008",
>>             "target_object_locator": "@6",
>>             "paused": 0,
>>             "used_replica": 0,
>>             "precalc_pgid": 0,
>>             "last_sent": "2016-01-26 00:47:28.783581",
>>             "attempts": 1,
>>             "snapid": "head",
>>             "snap_context": "1=[]",
>>             "mtime": "2016-01-26 00:47:28.769145",
>>             "osd_ops": [
>>                 "write 0~4194304"
>>             ]
>>         },
>>         {
>>             "tid": 23,
>>             "pg": "6.1d0c5182",
>>             "osd": 1,
>>             "object_id": "10000000bdf.00000012",
>>             "object_locator": "@6",
>>             "target_object_id": "10000000bdf.00000012",
>>             "target_object_locator": "@6",
>>             "paused": 0,
>>             "used_replica": 0,
>>             "precalc_pgid": 0,
>>             "last_sent": "2016-01-26 00:47:28.917678",
>>             "attempts": 1,
>>             "snapid": "head",
>>             "snap_context": "1=[]",
>>             "mtime": "2016-01-26 00:47:28.904582",
>>             "osd_ops": [
>>                 "write 0~4194304"
>>             ]
>>         },
>>         {
>>             "tid": 25,
>>             "pg": "6.97271370",
>>             "osd": 2,
>>             "object_id": "10000000bdf.00000014",
>>             "object_locator": "@6",
>>             "target_object_id": "10000000bdf.00000014",
>>             "target_object_locator": "@6",
>>             "paused": 0,
>>             "used_replica": 0,
>>             "precalc_pgid": 0,
>>             "last_sent": "2016-01-26 00:47:28.943763",
>>             "attempts": 1,
>>             "snapid": "head",
>>             "snap_context": "1=[]",
>>             "mtime": "2016-01-26 00:47:28.929857",
>>             "osd_ops": [
>>                 "write 0~4194304"
>>             ]
>>         },
>>
>> [....]
>>
>>
>> On other OSDs I can see:
>>
>> 2016-01-26 00:11:36.224232 7f9c18596700  0 -- 10.0.0.1:6804/51935 >>
>> 10.0.0.91:0/1536 pipe(0x1f955800 sd=276 :6804 s=0 pgs=0 cs=0 l=1
>> c=0x20379080).accept replacing existing (lossy) channel (new one lossy=1)
>> 2016-01-26 00:26:36.312611 7f9c1a2b3700  0 -- 10.0.0.1:6804/51935 >>
>> 10.0.0.91:0/1536 pipe(0x1f792000 sd=246 :6804 s=0 pgs=0 cs=0 l=1
>> c=0x155d02c0).accept replacing existing (lossy) channel (new one lossy=1)
>> 2016-01-26 00:41:36.403063 7f9c1795c700  0 -- 10.0.0.1:6804/51935 >>
>> 10.0.0.91:0/1536 pipe(0x20efa000 sd=254 :6804 s=0 pgs=0 cs=0 l=1
>> c=0x21bc1760).accept replacing existing (lossy) channel (new one lossy=1)
>>
>>
>> So to me, this looks like some kind of bug.
>>
>> Without the rbd/ceph kernel modules loaded, ceph-fuse will not work.
>>



