OSD won't start after moving to a new node with Ceph 12.2.10

Hi, I am facing a problem where an OSD won't start after being moved to a new node running 12.2.10 (the old node ran 12.2.8).

One node of my cluster failed, and I tried to move 3 OSDs to a new node. 2 of the 3 OSDs have started and are running fine at the moment (backfilling is still in progress), but one of the OSDs just won't start, with the following error in the logs (I'm writing mostly to find out whether this is a bug or whether I have done something wrong):

2018-11-27 19:44:38.013454 7fba0d35fd80 -1 bluestore(/var/lib/ceph/osd/ceph-1) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0xb1a184d1, expected 0xb682fc52, device location [0x10000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#
2018-11-27 19:44:38.013501 7fba0d35fd80 -1 osd.1 0 OSD::init() : unable to read osd superblock
2018-11-27 19:44:38.013511 7fba0d35fd80  1 bluestore(/var/lib/ceph/osd/ceph-1) umount
2018-11-27 19:44:38.065478 7fba0d35fd80  1 stupidalloc 0x0x55ebb04c3f80 shutdown
2018-11-27 19:44:38.077261 7fba0d35fd80  1 freelist shutdown
2018-11-27 19:44:38.077316 7fba0d35fd80  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.10/rpm/el7/BUILD/ceph-12.2.10/src/rocksdb/db/db_impl.cc:217] Shutdown: canceling all background work
2018-11-27 19:44:38.077982 7fba0d35fd80  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.10/rpm/el7/BUILD/ceph-12.2.10/src/rocksdb/db/db_impl.cc:343] Shutdown complete
2018-11-27 19:44:38.107923 7fba0d35fd80  1 bluefs umount
2018-11-27 19:44:38.108248 7fba0d35fd80  1 stupidalloc 0x0x55ebb01cddc0 shutdown
2018-11-27 19:44:38.108302 7fba0d35fd80  1 bdev(0x55ebb01cf800 /var/lib/ceph/osd/ceph-1/block) close
2018-11-27 19:44:38.362984 7fba0d35fd80  1 bdev(0x55ebb01cf600 /var/lib/ceph/osd/ceph-1/block) close
2018-11-27 19:44:38.470791 7fba0d35fd80 -1  ** ERROR: osd init failed: (22) Invalid argument
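
For anyone comparing this failure against other reports, the fields of the checksum error can be pulled out of the log line mechanically. The following is only a rough sketch: the regular expression is written against the single line quoted in this post, not against any documented log format, and it assumes the length after the "~" in the device location is hexadecimal without a prefix. The names CSUM_RE and parse_csum_error are made up for this example.

```python
import re

# The checksum failure line from the log in this post.
LINE = (
    "2018-11-27 19:44:38.013454 7fba0d35fd80 -1 "
    "bluestore(/var/lib/ceph/osd/ceph-1) _verify_csum bad crc32c/0x1000 "
    "checksum at blob offset 0x0, got 0xb1a184d1, expected 0xb682fc52, "
    "device location [0x10000~1000], logical extent 0x0~1000, "
    "object #-1:7b3f43c4:::osd_superblock:0#"
)

# Pattern modelled on that one line only (an assumption, not a spec).
CSUM_RE = re.compile(
    r"_verify_csum bad (?P<algo>\w+)/(?P<chunk>0x[0-9a-f]+) checksum "
    r"at blob offset (?P<blob_offset>0x[0-9a-f]+), "
    r"got (?P<got>0x[0-9a-f]+), expected (?P<expected>0x[0-9a-f]+), "
    r"device location \[(?P<dev_offset>0x[0-9a-f]+)~(?P<dev_length>[0-9a-f]+)\]"
    r".*object (?P<object>\S+)"
)

NUMERIC_FIELDS = ("chunk", "blob_offset", "got", "expected",
                  "dev_offset", "dev_length")


def parse_csum_error(line):
    """Return the checksum error fields as a dict, or None if no match."""
    match = CSUM_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in NUMERIC_FIELDS:
        # Values are hexadecimal; int(..., 16) accepts an optional 0x prefix.
        fields[key] = int(fields[key], 16)
    return fields


if __name__ == "__main__":
    info = parse_csum_error(LINE)
    print(info["object"], hex(info["got"]), hex(info["expected"]))
```

In this case the failing read is the first 0x1000-byte checksum chunk of the osd_superblock object, which is why OSD::init() gives up immediately afterwards.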

My cluster is running too many mixed versions. I hadn't realized that the version changes when running a yum update, and right now I have the following situation:

ceph versions
{
    "mon": {
        "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 1,
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 2
    },
    "mgr": {
        "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 1
    },
    "osd": {
        "ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)": 2,
        "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 18,
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 27,
        "ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)": 1
    },
    "mds": {},
    "overall": {
        "ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)": 2,
        "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 20,
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 29,
        "ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)": 1
    }
}
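
To keep an eye on the version mix while upgrading, the JSON printed by ceph versions can be summarized with a few lines of Python. This is just a sketch that parses the output pasted above; the helper names summarize and mixed_daemons are invented for the example, and the commit hashes are dropped purely for readability.

```python
import json
import re

# Output of `ceph versions` as pasted in this post.
SAMPLE = """
{
    "mon": {
        "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 1,
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 2
    },
    "mgr": {
        "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 1
    },
    "osd": {
        "ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)": 2,
        "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 18,
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 27,
        "ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)": 1
    },
    "mds": {},
    "overall": {
        "ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)": 2,
        "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 20,
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 29,
        "ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)": 1
    }
}
"""


def summarize(versions_json):
    """Map each daemon type to {short version: daemon count}."""
    data = json.loads(versions_json)
    summary = {}
    for daemon, counts in data.items():
        short = {}
        for full_name, count in counts.items():
            # Keep only the version number, e.g. "12.2.10".
            match = re.match(r"ceph version (\S+)", full_name)
            key = match.group(1) if match else full_name
            short[key] = short.get(key, 0) + count
        summary[daemon] = short
    return summary


def mixed_daemons(summary):
    """Daemon types (excluding the 'overall' roll-up) with more than one version."""
    return sorted(d for d, v in summary.items()
                  if d != "overall" and len(v) > 1)


if __name__ == "__main__":
    result = summarize(SAMPLE)
    for daemon in mixed_daemons(result):
        print(daemon, result[daemon])
```

For the output above this reports the mon and osd daemons as running mixed versions; the mgr is on a single version and there are no mds daemons.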

Is there an easy way to get the OSD working again? I am thinking about waiting for the backfill/recovery to finish, then upgrading all nodes to 12.2.10 and, if the OSD still doesn't come up, recreating it.

Regards,
Cassiano Pilipavicius.