Aha. That would have been useful to see -- I saw the notice about 0.93, but not that.
When I roll back to v0.92, I get a different error (see below).
This doesn't seem very happy - any suggestions?
root@zfs2:~/XYZZY/v92# ceph-osd -d -i 4 --flush-journal
2015-04-09 16:31:44.756113 7f987f822900 0 ceph version 0.92 (00a3ac3b67d93860e7f0b6e07319f11b14d0fec0), process ceph-osd, pid 12605
2015-04-09 16:31:44.758743 7f987f822900 0 filestore(/var/lib/ceph/osd/ceph-4) backend btrfs (magic 0x9123683e)
2015-04-09 16:31:44.807613 7f987f822900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: FIEMAP ioctl is supported and appears to work
2015-04-09 16:31:44.807673 7f987f822900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2015-04-09 16:31:45.148028 7f987f822900 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2015-04-09 16:31:45.148163 7f987f822900 0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: CLONE_RANGE ioctl is supported
2015-04-09 16:31:45.923009 7f987f822900 0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: SNAP_CREATE is supported
2015-04-09 16:31:45.923673 7f987f822900 0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: SNAP_DESTROY is supported
2015-04-09 16:31:45.923979 7f987f822900 0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: START_SYNC is supported (transid 372081)
2015-04-09 16:31:46.381367 7f987f822900 0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: WAIT_SYNC is supported
2015-04-09 16:31:46.724449 7f987f822900 0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: SNAP_CREATE_V2 is supported
2015-04-09 16:31:47.473175 7f987f822900 0 filestore(/var/lib/ceph/osd/ceph-4) mount: enabling PARALLEL journal mode: fs, checkpoint is enabled
HDIO_DRIVE_CMD(identify) failed: Invalid argument
2015-04-09 16:31:47.495711 7f987f822900 1 journal _open /var/lib/ceph/osd/ceph-4/journal fd 16: 1072693248 bytes, block size 4096 bytes, directio = 1, aio = 1
terminate called after throwing an instance of 'ceph::buffer::malformed_input'
what(): buffer::malformed_input: __PRETTY_FUNCTION__ unknown encoding version > 8
*** Caught signal (Aborted) **
in thread 7f987f822900
ceph version 0.92 (00a3ac3b67d93860e7f0b6e07319f11b14d0fec0)
1: ceph-osd() [0xac511a]
2: (()+0x10340) [0x7f987e4da340]
3: (gsignal()+0x39) [0x7f987c979cc9]
4: (abort()+0x148) [0x7f987c97d0d8]
On Thu, Apr 9, 2015 at 3:22 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
On Thu, Apr 9, 2015 at 2:05 PM, Dirk Grunwald
<Dirk.Grunwald@xxxxxxxxxxxx> wrote:
> Ceph cluster, U14.10 base system, OSD's using BTRFS, journal on same disk as
> partition
> (done using ceph-deploy)
>
> I had been running 0.92 without (significant) issue. I upgraded
> to Hammer (0.94) by modifying /etc/apt/sources.list, then running
> apt-get update and apt-get upgrade
>
> Upgraded and restarted ceph-mon and then ceph-osd
>
> Most of the 50 OSD's are in a failure cycle with the error
> "os/Transaction.cc: 504: FAILED assert(ops == data.ops)"
>
> Right now, the entire cluster is useless because of this.
>
> Any suggestions?
It looks like maybe it's under the v80.x section instead of general
upgrading, but the release notes include:
* If you are upgrading specifically from v0.92, you must stop all OSD
daemons and flush their journals (``ceph-osd -i NNN
--flush-journal``) before upgrading. There was a transaction
encoding bug in v0.92 that broke compatibility. Upgrading from v0.93,
v0.91, or anything earlier is safe.
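
For anyone still on v0.92 who has not upgraded yet, the stop-and-flush step in that release note could be scripted roughly as below. This is only a sketch under assumptions, not an official procedure: it assumes the OSD data directories live under /var/lib/ceph/osd as ceph-<id> (as in the log above) and that the daemons are managed by Upstart (``stop ceph-osd id=N``), which may not match your init system. OSD_ROOT, DRY_RUN, and the function names are made up for this example. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: stop each local OSD and flush its journal before upgrading
# from v0.92. OSD_ROOT and DRY_RUN are hypothetical knobs for this example.
OSD_ROOT="${OSD_ROOT:-/var/lib/ceph/osd}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands instead of running them

# Print the numeric id of every OSD data directory (ceph-<id>) under $1.
osd_ids() {
    for dir in "$1"/ceph-*; do
        [ -d "$dir" ] || continue
        id="${dir##*/ceph-}"
        case "$id" in
            ''|*[!0-9]*) continue ;;   # skip names that are not plain numbers
        esac
        echo "$id"
    done
}

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop, then flush, every OSD found under $OSD_ROOT.
flush_all() {
    for id in $(osd_ids "$OSD_ROOT"); do
        # Assumption: Upstart job name; substitute your init system's command.
        run stop ceph-osd id="$id"
        # This is the command quoted in the release note.
        run ceph-osd -i "$id" --flush-journal
    done
}

flush_all
```

Run it once as-is to check the printed commands, then again as root with DRY_RUN=1 changed to DRY_RUN=0 on each OSD host. Confirm that every OSD has really stopped before its journal is flushed; the sketch does not verify that.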
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com