Re: ceph fuse closing stale session while still operable

Hi Greg,

while running the dd:

server:

[root@ceph2 ~]# ceph daemon /var/run/ceph/ceph-mds.ceph2.asok status
{
    "cluster_fsid": "<id>",
    "whoami": 0,
    "state": "up:active",
    "mdsmap_epoch": 83,
    "osdmap_epoch": 12592,
    "osdmap_epoch_barrier": 12592
}


[root@ceph2 ~]# ceph daemon /var/run/ceph/ceph-mds.ceph2.asok objecter_requests
{
    "ops": [],
    "linger_ops": [],
    "pool_ops": [],
    "pool_stat_ops": [],
    "statfs_ops": [],
    "command_ops": []
}

client:

[root@cn201 ceph]# ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok mds_requests
{}

[root@cn201 ceph]# ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok mds_sessions
{
    "id": 21424165,
    "sessions": [
        {
            "mds": 0,
            "addr": "10.0.0.2:6806\/16568",
            "seq": 13,
            "cap_gen": 0,
            "cap_ttl": "2016-01-21 23:31:39.091069",
            "last_cap_renew_request": "2016-01-21 23:30:39.091069",
            "cap_renew_seq": 24,
            "num_caps": 3,
            "state": "open"
        }
    ],
    "mdsmap_epoch": 83
}


~20 sec. later:

[root@cn201 ceph]# ceph --admin-daemon ceph-client.admin.asok mds_sessions
{
    "id": 21424165,
    "sessions": [
        {
            "mds": 0,
            "addr": "10.0.0.2:6806\/16568",
            "seq": 13,
            "cap_gen": 0,
            "cap_ttl": "2016-01-21 23:32:59.103092",
            "last_cap_renew_request": "2016-01-21 23:31:59.103092",
            "cap_renew_seq": 28,
            "num_caps": 3,
            "state": "open"
        }
    ],
    "mdsmap_epoch": 83
}
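The session still reports "open" with a cap_ttl in the future. For repeated checks, staleness can be tested against exactly this JSON shape; a minimal sketch (python3 is used only for JSON parsing, and the socket path in the usage comment is the one from the outputs above):

```shell
#!/bin/sh
# Flag a stale ceph-fuse session: a cap_ttl in the past means the MDS
# may treat our caps as expired even though "state" still says "open".
# Reads mds_sessions JSON (shape as dumped above) on stdin and prints
# FRESH or STALE per session; $1 is the current time.
check_caps() {
  python3 -c '
import json, sys
from datetime import datetime
now = datetime.strptime(sys.argv[1], "%Y-%m-%d %H:%M:%S")
for s in json.load(sys.stdin)["sessions"]:
    ttl = datetime.strptime(s["cap_ttl"].split(".")[0], "%Y-%m-%d %H:%M:%S")
    print("STALE" if ttl < now else "FRESH")
' "$1"
}

# Usage against a live mount:
#   ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok mds_sessions \
#       | check_caps "$(date "+%Y-%m-%d %H:%M:%S")"
```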

Meanwhile, in the MDS log:

[root@ceph2 ~]# tail -f /var/log/ceph/ceph-mds.ceph2.log

2016-01-21 23:24:19.230373 7f71b9459700  1 mds.0.11 asok_command: status (starting...)
2016-01-21 23:24:19.230432 7f71b9459700  1 mds.0.11 asok_command: status (complete)
2016-01-21 23:25:11.010865 7f71b9459700  1 mds.0.11 asok_command: status (starting...)
2016-01-21 23:25:11.010895 7f71b9459700  1 mds.0.11 asok_command: status (complete)
2016-01-21 23:31:26.954395 7f71b9459700  1 mds.0.11 asok_command: status (starting...)
2016-01-21 23:31:26.954425 7f71b9459700  1 mds.0.11 asok_command: status (complete)


---------------------

The whole test sequence looks like this:

[root@cn201 ceph]# dd if=/dev/zero bs=64k count=10 of=/ceph-storage/dd1
10+0 records in
10+0 records out
655360 bytes (655 kB) copied, 1.97532 s, 332 kB/s
[root@cn201 ceph]# dd if=/dev/zero bs=64k count=100 of=/ceph-storage/dd1
100+0 records in
100+0 records out
6553600 bytes (6.6 MB) copied, 1.15482 s, 5.7 MB/s
[root@cn201 ceph]# dd if=/dev/zero bs=64k count=1000 of=/ceph-storage/dd1
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 0.798252 s, 82.1 MB/s
[root@cn201 ceph]# dd if=/dev/zero bs=64k count=1000 of=/ceph-storage/dd1
1000+0 records in
1000+0 records out
65536000 bytes (66 MB) copied, 5.15479 s, 12.7 MB/s
[root@cn201 ceph]# dd if=/dev/zero bs=8k count=10000 of=/ceph-storage/dd1
10000+0 records in
10000+0 records out
81920000 bytes (82 MB) copied, 4.46012 s, 18.4 MB/s

-> all fine.

[root@cn201 ceph]# dd if=/dev/zero bs=8k count=20000 of=/ceph-storage/dd1


-> > 100 MB == dead

[root@cn201 ceph-storage]# ls -la dd1
-rw-r--r-- 1 root root 104849408 Jan 21 23:42 dd1
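The dd series above can be automated so that a wedged write is killed instead of blocking the shell; a rough sketch, with the target directory as a parameter (the 30-second cap is an arbitrary choice):

```shell
#!/bin/sh
# Sweep 64k writes of increasing size to bracket the hang threshold.
# Each dd is capped by `timeout`, so a wedged write reports HANG
# instead of blocking forever. $1 = target directory.
sweep() {
  dir=$1
  for count in 10 100 1000; do
    if timeout 30 dd if=/dev/zero bs=64k count="$count" \
        of="$dir/ddsweep" 2>/dev/null; then
      echo "64k x $count: ok"
    else
      echo "64k x $count: HANG (killed after 30s)"
    fi
  done
  rm -f "$dir/ddsweep"
}

# Usage:
#   sweep /ceph-storage
```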

----------

While it's hung, a new small write, which usually works, does not complete anymore:

[root@cn201 ceph-storage]# dd if=/dev/zero bs=64k count=10 of=/ceph-storage/dd3


-> dead

[root@cn201 ceph-storage]# ls -la dd3
-rw-r--r-- 1 root root 0 Jan 21 23:44 dd3


While both dd's are hanging around in limbo:

[root@cn201 ceph-storage]# ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok mds_requests
{}



[root@cn201 ceph-storage]# ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok mds_sessions
{
    "id": 21424169,
    "sessions": [
        {
            "mds": 0,
            "addr": "10.0.0.2:6806\/16568",
            "seq": 98,
            "cap_gen": 0,
            "cap_ttl": "2016-01-21 23:48:12.879459",
            "last_cap_renew_request": "2016-01-21 23:47:12.879459",
            "cap_renew_seq": 25,
            "num_caps": 4,
            "state": "open"
        }
    ],
    "mdsmap_epoch": 83
}


------

Further testing:

[root@cn201 ceph-storage]# touch test
[root@cn201 ceph-storage]# ls
dd1  dd2  dd3  test


[root@cn201 ceph-storage]# echo fu_ma_schu > test

-> hangs for ~2 seconds, but then it works:

[root@cn201 ceph-storage]# cat test
fu_ma_schu


[root@cn201 ceph-storage]# ls -la
total 204669
drwxr-xr-x   1 root root  67108864 Jan 21 23:48 .
dr-xr-xr-x. 23 root root      4096 Jan 21 23:36 ..
-rw-r--r--   1 root root 104849408 Jan 21 23:42 dd1
-rw-r--r--   1 root root 104726528 Jan 21 23:23 dd2
-rw-r--r--   1 root root         0 Jan 21 23:44 dd3
-rw-r--r--   1 root root        11 Jan 21 23:49 test

-----

Running this script:

#!/bin/bash

# Note: in "[ 1 > 0 ]" the ">" is a redirection, not a numeric
# comparison -- the test is just "[ 1 ]" (always true), and the
# redirection creates the empty file "0" seen in the listings below.
while [ 1 > 0 ]
do
    echo "fu_man_schuhuuu" >> test
done

checking the status via ls -la:

[...]

[root@cn201 ceph-storage]# ls -la
total 204938
drwxr-xr-x   1 root root  67108864 Jan 21 23:56 .
dr-xr-xr-x. 23 root root      4096 Jan 21 23:36 ..
-rw-r--r--   1 root root         0 Jan 21 23:58 0
-rw-r--r--   1 root root 104849408 Jan 21 23:42 dd1
-rw-r--r--   1 root root 104726528 Jan 21 23:23 dd2
-rw-r--r--   1 root root         0 Jan 21 23:44 dd3
-rw-r--r--   1 root root    275851 Jan 21 23:58 test

[root@cn201 ceph-storage]# ls -la
total 204956
drwxr-xr-x   1 root root  67108864 Jan 21 23:56 .
dr-xr-xr-x. 23 root root      4096 Jan 21 23:36 ..
-rw-r--r--   1 root root         0 Jan 21 23:58 0
-rw-r--r--   1 root root 104849408 Jan 21 23:42 dd1
-rw-r--r--   1 root root 104726528 Jan 21 23:23 dd2
-rw-r--r--   1 root root         0 Jan 21 23:44 dd3
-rw-r--r--   1 root root    294171 Jan 21 23:58 test

-> no further progress: at 294171 bytes it stops and hangs, like the dd's
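Whether a writer is really stalled (as opposed to just slow) can be confirmed by sampling the file size twice; a tiny helper sketch (`stat -c %s` is GNU coreutils):

```shell
#!/bin/sh
# Return success (0) if the file did not grow during the sample
# window, i.e. the writer appears stalled. $1 = file, $2 = seconds
# to wait between the two size samples (default 5).
stalled() {
  before=$(stat -c %s "$1")
  sleep "${2:-5}"
  after=$(stat -c %s "$1")
  [ "$before" = "$after" ]
}

# Usage:
#   stalled /ceph-storage/test 5 && echo "writer is stuck"
```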


Checking again:

This time the output is different on the client, while on the ceph server it is the same:

[root@cn201 ceph-storage]# ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok mds_requests
{
    "request": {
        "tid": 18409,
        "op": "open",
        "path": "#100000007f5",
        "path2": "",
        "ino": "100000007f5",
        "hint_ino": "0",
        "sent_stamp": "2016-01-21 23:58:11.806196",
        "mds": 0,
        "resend_mds": -1,
        "send_to_auth": 0,
        "sent_on_mseq": 0,
        "retry_attempt": 1,
        "got_unsafe": 0,
        "uid": 0,
        "gid": 0,
        "oldest_client_tid": 17462,
        "mdsmap_epoch": 0,
        "flags": 0,
        "num_retry": 0,
        "num_fwd": 0,
        "num_releases": 0
    }
}

[root@cn201 ceph-storage]# ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok mds_sessions
{
    "id": 21424169,
    "sessions": [
        {
            "mds": 0,
            "addr": "10.0.0.2:6806\/16568",
            "seq": 55263,
            "cap_gen": 0,
            "cap_ttl": "2016-01-21 23:58:52.978997",
            "last_cap_renew_request": "2016-01-22 00:00:33.003126",
            "cap_renew_seq": 66,
            "num_caps": 6,
            "state": "open"
        }
    ],
    "mdsmap_epoch": 83
}
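The stuck "open" request above only shows the MDS side; the data path can be checked from the same client admin socket with objecter_requests. Ops that stay in-flight there while a dd hangs would implicate the OSDs rather than the MDS session. A small counter for that JSON (python3 only for parsing; socket path as in the outputs above):

```shell
#!/bin/sh
# Count in-flight RADOS ops in objecter_requests JSON read on stdin.
# A count that stays non-zero while a write hangs points at the OSD
# data path rather than the MDS.
count_ops() {
  python3 -c 'import json, sys; print(len(json.load(sys.stdin)["ops"]))'
}

# Usage:
#   ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok objecter_requests \
#       | count_ops
```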



----------------

Any idea what movie is running around there?

-- 
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
IP-Interactive

mailto:info@xxxxxxxxxxxxxxxxx

Anschrift:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402 beim Amtsgericht Hanau
Geschäftsführung: Oliver Dzombic

Steuer Nr.: 35 236 3622 1
UST ID: DE274086107
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



