On Sat, Aug 27, 2016 at 3:01 AM, Francois Lafont <francois.lafont.1978@xxxxxxxxx> wrote:
> Hi,
>
> I had exactly the same error on my production ceph client node, with
> Jewel 10.2.1 in my case.
>
> On the client node:
> - Ubuntu 14.04
> - kernel 3.13.0-92-generic
> - ceph 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269)
> - cephfs via _ceph-fuse_
>
> On the cluster node:
> - Ubuntu 14.04
> - kernel 3.13.0-92-generic
> - ceph 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269)
>
> It happened during the execution of a very basic Python (2.7.6) script
> which makes some os.makedirs(...) and os.chown(...) calls.
>
> Just in case, the logs are below. I'm sorry they are not verbose at all
> and so probably useless for you...
>
> Which settings should I put in my client and cluster configuration to
> get relevant logs if the same error happens again?
>
> Regards.
> François Lafont
>
> Here are the logs:
>
> 1. On the client node: http://francois-lafont.ac-versailles.fr/misc/ceph-client.cephfs.log.1.gz

Ha, yep, that's one of the bugs Giancolo found:

 ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269)
 1: (()+0x299152) [0x7f91398dc152]
 2: (()+0x10330) [0x7f9138bbb330]
 3: (Client::get_root_ino()+0x10) [0x7f91397df6c0]
 4: (CephFuse::Handle::make_fake_ino(inodeno_t, snapid_t)+0x175) [0x7f91397dd3d5]
 5: (()+0x19ac09) [0x7f91397ddc09]
 6: (()+0x14b45) [0x7f91391f7b45]
 7: (()+0x1522b) [0x7f91391f822b]
 8: (()+0x11e49) [0x7f91391f4e49]
 9: (()+0x8184) [0x7f9138bb3184]
 10: (clone()+0x6d) [0x7f913752237d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

So the fix will be in the next Jewel release, if it's not already fixed in 10.2.2.
-Greg

> 2.
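(To François's question about logging settings: the thread itself doesn't answer it, but a minimal sketch of the usual Ceph debug knobs would look like the fragment below. The log file path and the exact debug levels are my assumptions, not anything stated in this thread; 20 is the most verbose level and is very chatty, so it's typically raised only while reproducing a problem.)

```
# ceph.conf on the client node -- verbose ceph-fuse/libcephfs logging
[client]
    debug client = 20
    debug ms = 1
    log file = /var/log/ceph/ceph-client.$name.log   # assumed path

# ceph.conf on the MDS node -- verbose MDS logging
[mds]
    debug mds = 20
    debug ms = 1
```

The same settings can also be raised at runtime without a restart, e.g. `ceph daemon mds.<id> config set debug_mds 20` on the MDS host, or the equivalent `ceph daemon` call against the ceph-fuse client's admin socket.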
> In the (active) mds node:
>
> ----%<----%<----%<----%<----%<----%<----%<----%<----
> ~$ sudo zcat /var/log/ceph/ceph-mds.ceph02.log.1.gz
> 2016-08-22 15:02:03.799037 7f3f9adc1700 0 -- 10.0.2.102:6800/2186 >> 192.168.23.11:0/431481110 pipe(0x7f3fb3a87400 sd=22 :6800 s=2 pgs=64 cs=1 l=0 c=0x7f3fb5f10900).fault with nothing to send, going to standby
> 2016-08-22 15:02:40.236001 7f3f9f7d3700 0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 34.503993 secs
> 2016-08-22 15:02:40.236026 7f3f9f7d3700 0 log_channel(cluster) log [WRN] : slow request 34.503993 seconds old, received at 2016-08-22 15:02:05.731897: client_request(client.1442720:650326 getattr pAsLsXsFs #1000001b6d0 2016-08-22 15:02:05.731515) currently failed to rdlock, waiting
> 2016-08-22 15:07:00.245269 7f3f9f7d3700 0 log_channel(cluster) log [INF] : closing stale session client.1433176 192.168.23.11:0/431481110 after 304.132797
> 2016-08-22 15:23:07.970215 7f3f9adc1700 0 -- 10.0.2.102:6800/2186 >> 192.168.23.11:0/2607326748 pipe(0x7f3fff365400 sd=22 :6800 s=2 pgs=8 cs=1 l=0 c=0x7f3fb5f10a80).fault, server, going to standby
> 2016-08-22 15:28:05.281489 7f3f9f7d3700 0 log_channel(cluster) log [INF] : closing stale session client.1537178 192.168.23.11:0/2607326748 after 300.588323
> ----%<----%<----%<----%<----%<----%<----%<----%<----
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com