If you dig into the list archives I think somebody else went through this
when the issue was discovered and recovered successfully. But I don't know
the details. :)
-Greg

On Thu, Apr 9, 2015 at 3:38 PM, Dirk Grunwald <Dirk.Grunwald@xxxxxxxxxxxx> wrote:
> Aha. That would have been useful to see -- I saw the notice about 0.93,
> but not that.
>
> When I roll back to v0.92, I get a different error (see below).
>
> This doesn't seem very happy - any suggestions?
>
>
> root@zfs2:~/XYZZY/v92# ceph-osd -d -i 4 --flush-journal
> 2015-04-09 16:31:44.756113 7f987f822900 0 ceph version 0.92 (00a3ac3b67d93860e7f0b6e07319f11b14d0fec0), process ceph-osd, pid 12605
> 2015-04-09 16:31:44.758743 7f987f822900 0 filestore(/var/lib/ceph/osd/ceph-4) backend btrfs (magic 0x9123683e)
> 2015-04-09 16:31:44.807613 7f987f822900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: FIEMAP ioctl is supported and appears to work
> 2015-04-09 16:31:44.807673 7f987f822900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
> 2015-04-09 16:31:45.148028 7f987f822900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
> 2015-04-09 16:31:45.148163 7f987f822900 0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: CLONE_RANGE ioctl is supported
> 2015-04-09 16:31:45.923009 7f987f822900 0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: SNAP_CREATE is supported
> 2015-04-09 16:31:45.923673 7f987f822900 0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: SNAP_DESTROY is supported
> 2015-04-09 16:31:45.923979 7f987f822900 0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: START_SYNC is supported (transid 372081)
> 2015-04-09 16:31:46.381367 7f987f822900 0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: WAIT_SYNC is supported
> 2015-04-09 16:31:46.724449 7f987f822900 0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: SNAP_CREATE_V2 is supported
> 2015-04-09 16:31:47.473175 7f987f822900 0 filestore(/var/lib/ceph/osd/ceph-4) mount: enabling PARALLEL journal mode: fs, checkpoint is enabled
> HDIO_DRIVE_CMD(identify) failed: Invalid argument
> 2015-04-09 16:31:47.495711 7f987f822900 1 journal _open /var/lib/ceph/osd/ceph-4/journal fd 16: 1072693248 bytes, block size 4096 bytes, directio = 1, aio = 1
> terminate called after throwing an instance of 'ceph::buffer::malformed_input'
>   what():  buffer::malformed_input: __PRETTY_FUNCTION__ unknown encoding version > 8
> *** Caught signal (Aborted) **
>  in thread 7f987f822900
>  ceph version 0.92 (00a3ac3b67d93860e7f0b6e07319f11b14d0fec0)
>  1: ceph-osd() [0xac511a]
>  2: (()+0x10340) [0x7f987e4da340]
>  3: (gsignal()+0x39) [0x7f987c979cc9]
>  4: (abort()+0x148) [0x7f987c97d0d8]
>
>
> On Thu, Apr 9, 2015 at 3:22 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>
>> On Thu, Apr 9, 2015 at 2:05 PM, Dirk Grunwald
>> <Dirk.Grunwald@xxxxxxxxxxxx> wrote:
>> > Ceph cluster, U14.10 base system, OSDs using BTRFS, journal on a
>> > partition of the same disk (done using ceph-deploy)
>> >
>> > I had been running 0.92 without (significant) issue.
>> > I upgraded to Hammer (0.94) by modifying /etc/apt/sources.list,
>> > apt-get update, apt-get upgrade.
>> >
>> > Upgraded and restarted ceph-mon and then ceph-osd.
>> >
>> > Most of the 50 OSDs are in a failure cycle with the error
>> > "os/Transaction.cc: 504: FAILED assert(ops == data.ops)"
>> >
>> > Right now, the entire cluster is useless because of this.
>> >
>> > Any suggestions?
>>
>> It looks like maybe it's under the v0.80.x section instead of general
>> upgrading, but the release notes include:
>>
>> * If you are upgrading specifically from v0.92, you must stop all OSD
>>   daemons and flush their journals (``ceph-osd -i NNN --flush-journal``)
>>   before upgrading. There was a transaction encoding bug in v0.92 that
>>   broke compatibility. Upgrading from v0.93, v0.91, or anything earlier
>>   is safe.
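
For anyone finding this thread in the archives later: the pre-upgrade flush
that the release note describes looks roughly like this on each OSD host.
This is a sketch, assuming the Upstart-managed Ubuntu packages used in this
thread; OSD id 4 is just an example, so repeat for every OSD hosted locally:

    # Stop the OSD so nothing writes to its journal while it is flushed.
    sudo stop ceph-osd id=4

    # Drain the journal with the still-installed v0.92 binary, so no
    # incompatibly-encoded transactions remain for 0.94 to replay.
    sudo ceph-osd -i 4 --flush-journal

    # Only after every OSD on the host has been stopped and flushed,
    # upgrade the packages and restart the daemons.
    sudo apt-get update && sudo apt-get upgrade
    sudo start ceph-osd id=4

The ordering is the whole point: the flush has to happen before the Hammer
binaries ever touch the journal. In the situation above the upgrade came
first, which appears to be why a simple rollback-and-flush no longer works
cleanly and the recovery described in the list archives is the suggested path.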