Re: MDS closing stale session

谷枫 <feicheche@xxxxxxxxx> · Fri, 5 Jun 2015 23:02:20 +0800

Sorry, I sent the wrong apport log, because I hit the same problem twice today. This is the apport log from the right time:
ERROR: apport (pid 7601) Fri Jun  5 09:58:45 2015: called for pid 18748, signal 6, core limit 0
ERROR: apport (pid 7601) Fri Jun  5 09:58:45 2015: executable: /usr/bin/ceph-fuse (command line "ceph-fuse -k /etc/ceph/ceph.client.admin.keyring -m rain01,rain02,rain03 /grdata")
ERROR: apport (pid 7601) Fri Jun  5 09:58:45 2015: is_closing_session(): no DBUS_SESSION_BUS_ADDRESS in environment
ERROR: apport (pid 7601) Fri Jun  5 09:59:25 2015: wrote report /var/crash/_usr_bin_ceph-fuse.0.crash
ERROR: apport (pid 7733) Fri Jun  5 09:59:25 2015: pid 7578 crashed in a container

At that time, the OSD node logged errors too:

2015-06-05 09:58:44.809822 7fac0de07700  0 -- 10.3.1.5:68**/**** >> **.3.1.4:0/18748 pipe(0x10a8e000 sd=47 :6801 s=2 pgs=1253 cs=1 l=1 c=0x1112b860).reader bad tag 116
2015-06-05 09:58:44.812270 7fac0de07700  0 -- 10.3.1.5:68**/**** >> **.3.1.4:0/18748 pipe(0x146e3000 sd=47 :6801 s=2 pgs=1359 cs=1 l=1 c=0x15310ec0).reader bad tag 32

2015-06-05 22:41 GMT+08:00 谷枫 <feicheche@xxxxxxxxx>:
Sorry, I sent the previous mail carelessly before it was finished. To continue, the MDS error is:
2015-06-05 09:59:25.012130 7fa1ed118700  0 -- 10.3.1.5:6800/1365 >> 10.3.1.4:0/18748 pipe(0x5f81000 sd=22 :6800 s=2 pgs=1252 cs=1 l=0 c=0x4f935a0).fault with nothing to send, going to standby
2015-06-05 10:03:40.767822 7fa1f0a27700  0 log_channel(cluster) log [INF] : closing stale session client.24153 10.3.1.4:0/18748 after 300.071624
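
To correlate MDS messages like the one above with client crashes, the "closing stale session" line can be parsed mechanically. The following is a minimal sketch, not part of any Ceph tooling: the function name `parse_stale_session` and the regular expression are illustrative assumptions based only on the single log line quoted here, so other Ceph versions may format the message differently. In this log, the number after the slash in the client address (18748) appears to match the pid that apport reported for the 09:58:45 ceph-fuse crash.

```python
import re

# Pattern derived from the one MDS log line quoted in this thread
# (an assumption, not an official log format definition).
STALE_RE = re.compile(
    r"closing stale session client\.(?P<client_id>\d+) "
    r"(?P<ip>[\d.]+):(?P<port>\d+)/(?P<nonce>\d+) "
    r"after (?P<idle>[\d.]+)"
)


def parse_stale_session(line):
    """Return the fields of a 'closing stale session' line, or None."""
    match = STALE_RE.search(line)
    if match is None:
        return None
    return {
        "client_id": int(match.group("client_id")),
        "ip": match.group("ip"),
        # In this thread the nonce equals the crashed ceph-fuse pid.
        "nonce": int(match.group("nonce")),
        "idle_seconds": float(match.group("idle")),
    }


if __name__ == "__main__":
    sample = (
        "2015-06-05 10:03:40.767822 7fa1f0a27700  0 log_channel(cluster) "
        "log [INF] : closing stale session client.24153 "
        "10.3.1.4:0/18748 after 300.071624"
    )
    print(parse_stale_session(sample))
```

The idle time of about 300 seconds suggests the MDS dropped the session only after the client had already been silent for five minutes, which fits the client process having died earlier.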

The apport error is:

ERROR: apport (pid 30184) Fri Jun  5 12:13:03 2015: called for pid 6331, signal 11, core limit 0
ERROR: apport (pid 30184) Fri Jun  5 12:13:03 2015: executable: /usr/bin/ceph-fuse (command line "ceph-fuse -k /etc/ceph/ceph.client.admin.keyring -m node01,node02,node03 /grdata")
ERROR: apport (pid 30184) Fri Jun  5 12:13:03 2015: is_closing_session(): no DBUS_SESSION_BUS_ADDRESS in environment
ERROR: apport (pid 30184) Fri Jun  5 12:13:33 2015: wrote report /var/crash/_usr_bin_ceph-fuse.0.crash

But the output of ceph -s is OK:
    cluster add8fa43-9f84-4b5d-df32-095e3421a228
     health HEALTH_OK
     monmap e2: 3 mons at {node01=10.3.1.2:6789/0,node02=10.3.1.3:6789/0,node03=10.3.1.4:6789/0}
            election epoch 44, quorum 0,1,2 node01,node02,node03
     mdsmap e37: 1/1/1 up {0=osd01=up:active}
     osdmap e526: 5 osds: 5 up, 5 in
      pgmap v392315: 264 pgs, 3 pools, 26953 MB data, 106 kobjects
            81519 MB used, 1036 GB / 1115 GB avail
                 264 active+clean
  client io 10171 B/s wr, 1 op/s

When I remount the Ceph partition, it returns to normal.

I want to know: is this a bug in Ceph itself, or a bug in the ceph-fuse tool?
Should I switch to the kernel client and mount with mount -t ceph instead?

2015-06-05 22:34 GMT+08:00 谷枫 <feicheche@xxxxxxxxx>:
Hi everyone,
I have a five-node Ceph cluster with CephFS. The Ceph partition is mounted with the ceph-fuse tool.
I ran into a serious problem that appeared without any warning.
On one of the nodes the ceph-fuse process died, and the Ceph partition mounted through ceph-fuse became unavailable.
Running ls on the Ceph partition shows this:
d?????????   ? ?    ?       ?            ? ceph-data/
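
The question marks in that listing mean that stat() on the mount point failed. A check like the following could detect the condition from a monitoring script. It is only a sketch under assumptions: the function name `mount_state` is hypothetical, and it assumes the failure surfaces as ENOTCONN ("Transport endpoint is not connected"), which is the usual error for a FUSE mount whose daemon has exited but which the logs in this thread do not confirm.

```python
import errno
import os


def mount_state(path):
    """Classify a mount point as 'ok', 'disconnected', or 'error: <NAME>'."""
    try:
        os.stat(path)
    except OSError as exc:
        # Assumption: a dead FUSE daemon is reported as ENOTCONN.
        if exc.errno == errno.ENOTCONN:
            return "disconnected"
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return "error: " + name
    return "ok"


if __name__ == "__main__":
    # /grdata is the mount point from the ceph-fuse command line in this thread.
    print(mount_state("/grdata"))
```

A result of "disconnected" would indicate that the mount needs to be unmounted and remounted, which matches the observation that remounting restores access.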

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com