Dear Cephers,

For some time now I have been running a small Ceph cluster made of 4 OSD + 1 MON servers, evaluating possible Ceph usages in our storage infrastructure. Until a few weeks ago I was running the Hammer release, mostly with RBD clients mounting images from replicated pools, and everything was stable. Recently I updated to Jewel v10.2.2 (actually did a clean install).

Last week I tested Ceph's tiering capabilities, backed by an erasure-coded pool. Everything was running just fine until one moment when I couldn't map my images anymore. I get the following error when mapping an RBD image:

    sudo rbd map test
    rbd: sysfs write failed
    In some cases useful info is found in syslog - try "dmesg | tail" or so.
    rbd: map failed: (5) Input/output error

The error is accompanied by a hex dump of several MB of osdmap in /var/log/syslog and /var/log/kern.log, filling several GB of logs in a short time (dump attached).

I thought I must have done something wrong, since I did a lot of testing: in the process I recreated lots of pools and shuffled a lot of data while trying various CRUSH ruleset combinations. So I started from scratch and reinstalled the cluster. Everything worked for some time, but then the error occurred again (also for a normal replicated pool).

Ceph servers and clients are running Ubuntu 14.04 (kernel 3.19).
Cluster state is HEALTH_OK.
The RBD images have the new Jewel features disabled:

    rbd -p ecpool feature disable test exclusive-lock object-map fast-diff deep-flatten

Does anyone have any tips here? Has something similar happened to anyone else? Should I just go ahead and do a v4.4 kernel update?

Thank you,
Ivan
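P.S. For reference, the cache tier over the erasure-coded pool was set up along these lines. This is only a sketch from memory; the pool names, PG counts and EC profile here are illustrative, not necessarily the exact values I used:

    # erasure-coded base pool with a simple 2+1 profile
    ceph osd erasure-code-profile set ecprofile k=2 m=1
    ceph osd pool create ecpool 64 64 erasure ecprofile

    # replicated cache pool layered on top of the base pool
    ceph osd pool create cachepool 64
    ceph osd tier add ecpool cachepool
    ceph osd tier cache-mode cachepool writeback
    ceph osd tier set-overlay ecpool cachepool
    ceph osd pool set cachepool hit_set_type bloom

    # image created in the base pool, accessed through the tier
    rbd create ecpool/test --size 10240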
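On the client side I check the images before mapping like this (again, names are illustrative). As far as I understand, the 3.19 krbd only supports the layering feature, so newer images get created with just that:

    # verify which features remain enabled on the image
    rbd info ecpool/test | grep features

    # create new images with only the layering feature,
    # so the old kernel client can map them
    rbd create ecpool/test2 --size 10240 --image-feature layering

    # after a failed map, the kernel's complaint shows up here
    dmesg | tail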
Attachment: ceph_osd_dump (binary data)