Re: ceph fuse closing stale session while still operable

What is the output of the objecter_requests command? It really looks
to me like the writes aren't going out and you're backing up on
memory, but I can't tell without that. Actually, please grab a dump of
the perfcounters while you're at it; that will include info on dirty
memory and bytes written out.
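For reference, both should be available via the fuse client's admin socket
(same socket path as in your output below, adjust if yours differs), e.g.:

  ceph daemon /var/run/ceph/ceph-client.admin.asok objecter_requests
  ceph daemon /var/run/ceph/ceph-client.admin.asok perf dump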
-Greg

On Thu, Jan 21, 2016 at 3:01 PM, Oliver Dzombic <info@xxxxxxxxxxxxxxxxx> wrote:
> Hi Greg,
>
> while running the dd:
>
> server:
>
> [root@ceph2 ~]# ceph daemon /var/run/ceph/ceph-mds.ceph2.asok status
> {
>     "cluster_fsid": "<id>",
>     "whoami": 0,
>     "state": "up:active",
>     "mdsmap_epoch": 83,
>     "osdmap_epoch": 12592,
>     "osdmap_epoch_barrier": 12592
> }
>
>
> [root@ceph2 ~]# ceph daemon /var/run/ceph/ceph-mds.ceph2.asok
> objecter_requests
> {
>     "ops": [],
>     "linger_ops": [],
>     "pool_ops": [],
>     "pool_stat_ops": [],
>     "statfs_ops": [],
>     "command_ops": []
> }
>
> client:
>
> [root@cn201 ceph]# ceph --admin-daemon
> /var/run/ceph/ceph-client.admin.asok mds_requests
> {}
>
> [root@cn201 ceph]# ceph --admin-daemon
> /var/run/ceph/ceph-client.admin.asok mds_sessions
> {
>     "id": 21424165,
>     "sessions": [
>         {
>             "mds": 0,
>             "addr": "10.0.0.2:6806\/16568",
>             "seq": 13,
>             "cap_gen": 0,
>             "cap_ttl": "2016-01-21 23:31:39.091069",
>             "last_cap_renew_request": "2016-01-21 23:30:39.091069",
>             "cap_renew_seq": 24,
>             "num_caps": 3,
>             "state": "open"
>         }
>     ],
>     "mdsmap_epoch": 83
> }
>
>
> ~20 sec. later:
>
> [root@cn201 ceph]# ceph --admin-daemon ceph-client.admin.asok mds_sessions
> {
>     "id": 21424165,
>     "sessions": [
>         {
>             "mds": 0,
>             "addr": "10.0.0.2:6806\/16568",
>             "seq": 13,
>             "cap_gen": 0,
>             "cap_ttl": "2016-01-21 23:32:59.103092",
>             "last_cap_renew_request": "2016-01-21 23:31:59.103092",
>             "cap_renew_seq": 28,
>             "num_caps": 3,
>             "state": "open"
>         }
>     ],
>     "mdsmap_epoch": 83
> }
>
> In the meantime:
>
> [root@ceph2 ~]# tail -f /var/log/ceph/ceph-mds.ceph2.log
>
> 2016-01-21 23:24:19.230373 7f71b9459700  1 mds.0.11 asok_command: status
> (starting...)
> 2016-01-21 23:24:19.230432 7f71b9459700  1 mds.0.11 asok_command: status
> (complete)
> 2016-01-21 23:25:11.010865 7f71b9459700  1 mds.0.11 asok_command: status
> (starting...)
> 2016-01-21 23:25:11.010895 7f71b9459700  1 mds.0.11 asok_command: status
> (complete)
> 2016-01-21 23:31:26.954395 7f71b9459700  1 mds.0.11 asok_command: status
> (starting...)
> 2016-01-21 23:31:26.954425 7f71b9459700  1 mds.0.11 asok_command: status
> (complete)
>
>
> ---------------------
>
> The whole test looks like this:
>
> [root@cn201 ceph]# dd if=/dev/zero bs=64k count=10 of=/ceph-storage/dd1
> 10+0 records in
> 10+0 records out
> 655360 bytes (655 kB) copied, 1.97532 s, 332 kB/s
> [root@cn201 ceph]# dd if=/dev/zero bs=64k count=100 of=/ceph-storage/dd1
> 100+0 records in
> 100+0 records out
> 6553600 bytes (6.6 MB) copied, 1.15482 s, 5.7 MB/s
> [root@cn201 ceph]# dd if=/dev/zero bs=64k count=1000 of=/ceph-storage/dd1
> 1000+0 records in
> 1000+0 records out
> 65536000 bytes (66 MB) copied, 0.798252 s, 82.1 MB/s
> [root@cn201 ceph]# dd if=/dev/zero bs=64k count=1000 of=/ceph-storage/dd1
> 1000+0 records in
> 1000+0 records out
> 65536000 bytes (66 MB) copied, 5.15479 s, 12.7 MB/s
> [root@cn201 ceph]# dd if=/dev/zero bs=8k count=10000 of=/ceph-storage/dd1
> 10000+0 records in
> 10000+0 records out
> 81920000 bytes (82 MB) copied, 4.46012 s, 18.4 MB/s
>
> -> all fine.
>
> [root@cn201 ceph]# dd if=/dev/zero bs=8k count=20000 of=/ceph-storage/dd1
>
>
> -> > 100 MB == dead
>
> [root@cn201 ceph-storage]# ls -la dd1
> -rw-r--r-- 1 root root 104849408 Jan 21 23:42 dd1
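Worth noting: 104849408 bytes is just under 100 MiB, which I believe is the
default client_oc_max_dirty limit on the fuse client, so writes stalling right
around there would be consistent with dirty data piling up in the client's
object cacher and not getting flushed out. The effective limits can be checked
via the admin socket, e.g.:

  ceph daemon /var/run/ceph/ceph-client.admin.asok config show | grep client_oc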
>
> ----------
>
> While it's dead, a new small write, which usually works, does not work anymore:
>
> [root@cn201 ceph-storage]# dd if=/dev/zero bs=64k count=10
> of=/ceph-storage/dd3
>
>
> -> dead
>
> [root@cn201 ceph-storage]# ls -la dd3
> -rw-r--r-- 1 root root 0 Jan 21 23:44 dd3
>
>
> While both dd's are hanging around in nightmares of inconsistency:
>
> [root@cn201 ceph-storage]# ceph --admin-daemon
> /var/run/ceph/ceph-client.admin.asok mds_requests
> {}
>
>
>
> [root@cn201 ceph-storage]# ceph --admin-daemon
> /var/run/ceph/ceph-client.admin.asok mds_sessions
> {
>     "id": 21424169,
>     "sessions": [
>         {
>             "mds": 0,
>             "addr": "10.0.0.2:6806\/16568",
>             "seq": 98,
>             "cap_gen": 0,
>             "cap_ttl": "2016-01-21 23:48:12.879459",
>             "last_cap_renew_request": "2016-01-21 23:47:12.879459",
>             "cap_renew_seq": 25,
>             "num_caps": 4,
>             "state": "open"
>         }
>     ],
>     "mdsmap_epoch": 83
> }
>
>
> ------
>
> Further testing:
>
> [root@cn201 ceph-storage]# touch test
> [root@cn201 ceph-storage]# ls
> dd1  dd2  dd3  test
>
>
> [root@cn201 ceph-storage]# echo fu_ma_schu > test
>
> -> hangs for 2 seconds but then it works:
>
> [root@cn201 ceph-storage]# cat test
> fu_ma_schu
>
>
> [root@cn201 ceph-storage]# ls -la
> total 204669
> drwxr-xr-x   1 root root  67108864 Jan 21 23:48 .
> dr-xr-xr-x. 23 root root      4096 Jan 21 23:36 ..
> -rw-r--r--   1 root root 104849408 Jan 21 23:42 dd1
> -rw-r--r--   1 root root 104726528 Jan 21 23:23 dd2
> -rw-r--r--   1 root root         0 Jan 21 23:44 dd3
> -rw-r--r--   1 root root        11 Jan 21 23:49 test
>
> -----
>
> Running this script:
>
> #!/bin/bash
>
> while [ 1 > 0 ]
> do
> echo "fu_man_schuhuuu" >> test
> done
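Side note on the loop: in bash `[ 1 > 0 ]` is not a numeric comparison; it is
the test `[ 1 ]` (always true) with stdout redirected to a file named 0, which
is why an empty 0 file shows up in the listings below. An equivalent loop
without that side effect would be:

  while true
  do
      echo "fu_man_schuhuuu" >> test
  done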
>
> checking the status via ls -la:
>
> [...]
>
> [root@cn201 ceph-storage]# ls -la
> total 204938
> drwxr-xr-x   1 root root  67108864 Jan 21 23:56 .
> dr-xr-xr-x. 23 root root      4096 Jan 21 23:36 ..
> -rw-r--r--   1 root root         0 Jan 21 23:58 0
> -rw-r--r--   1 root root 104849408 Jan 21 23:42 dd1
> -rw-r--r--   1 root root 104726528 Jan 21 23:23 dd2
> -rw-r--r--   1 root root         0 Jan 21 23:44 dd3
> -rw-r--r--   1 root root    275851 Jan 21 23:58 test
>
> [root@cn201 ceph-storage]# ls -la
> total 204956
> drwxr-xr-x   1 root root  67108864 Jan 21 23:56 .
> dr-xr-xr-x. 23 root root      4096 Jan 21 23:36 ..
> -rw-r--r--   1 root root         0 Jan 21 23:58 0
> -rw-r--r--   1 root root 104849408 Jan 21 23:42 dd1
> -rw-r--r--   1 root root 104726528 Jan 21 23:23 dd2
> -rw-r--r--   1 root root         0 Jan 21 23:44 dd3
> -rw-r--r--   1 root root    294171 Jan 21 23:58 test
>
> -> it gets no further; at 294171 bytes it stops and hangs, like the dd's
>
>
> Now checking again:
>
> This time it's different on the client, while on the ceph server it's the same:
>
> [root@cn201 ceph-storage]# ceph --admin-daemon
> /var/run/ceph/ceph-client.admin.asok mds_requests
> {
>     "request": {
>         "tid": 18409,
>         "op": "open",
>         "path": "#100000007f5",
>         "path2": "",
>         "ino": "100000007f5",
>         "hint_ino": "0",
>         "sent_stamp": "2016-01-21 23:58:11.806196",
>         "mds": 0,
>         "resend_mds": -1,
>         "send_to_auth": 0,
>         "sent_on_mseq": 0,
>         "retry_attempt": 1,
>         "got_unsafe": 0,
>         "uid": 0,
>         "gid": 0,
>         "oldest_client_tid": 17462,
>         "mdsmap_epoch": 0,
>         "flags": 0,
>         "num_retry": 0,
>         "num_fwd": 0,
>         "num_releases": 0
>     }
> }
>
> [root@cn201 ceph-storage]# ceph --admin-daemon
> /var/run/ceph/ceph-client.admin.asok mds_sessions
> {
>     "id": 21424169,
>     "sessions": [
>         {
>             "mds": 0,
>             "addr": "10.0.0.2:6806\/16568",
>             "seq": 55263,
>             "cap_gen": 0,
>             "cap_ttl": "2016-01-21 23:58:52.978997",
>             "last_cap_renew_request": "2016-01-22 00:00:33.003126",
>             "cap_renew_seq": 66,
>             "num_caps": 6,
>             "state": "open"
>         }
>     ],
>     "mdsmap_epoch": 83
> }
>
>
>
> ----------------
>
> Any idea what is going on there?
>
> --
> Mit freundlichen Gruessen / Best regards
>
> Oliver Dzombic
> IP-Interactive
>
> mailto:info@xxxxxxxxxxxxxxxxx
>
> Anschrift:
>
> IP Interactive UG ( haftungsbeschraenkt )
> Zum Sonnenberg 1-3
> 63571 Gelnhausen
>
> HRB 93402 beim Amtsgericht Hanau
> Geschäftsführung: Oliver Dzombic
>
> Steuer Nr.: 35 236 3622 1
> UST ID: DE274086107
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



