Hi together,

for those reading along: we had to turn off all OSDs backing our cephfs-data pool during the intervention, and luckily everything came back fine. We managed to keep the MDSs, the OSDs backing the cephfs-metadata pool, and the MONs online, though, and restarted those sequentially afterwards.

So this probably means we are not affected by the upgrade bug - still, I would sleep better if somebody could confirm how to detect this bug and, if you are affected, how to edit the pool to fix it.

Cheers,
	Oliver
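One read-only way to at least gather the relevant numbers before the next full shutdown is to compare the pool's stripe_width from "ceph osd dump" with k and stripe_unit from its erasure-code profile. The sketch below only queries the cluster; whether a mismatch there is really what trips the assert quoted further down the thread is an assumption on my part (not verified against the Ceph sources), so treat it as a heuristic, not as proof that a pool is unaffected. Pool and profile names are the ones used in this thread (cephfs_data); adjust as needed.
----------------------------------------------------
#!/bin/bash
# Read-only heuristic: compare the EC pool's stripe_width against
# k * stripe_unit from its erasure-code profile.
# Assumption (not verified in the sources): the OSD assert
# "stripe_width % stripe_size == 0" fires when these values do not line up.
POOL=cephfs_data        # data pool name as shown in "ceph osd dump"
PROFILE=cephfs_data     # erasure-code profile used by that pool

stripe_width=$(ceph osd dump | grep "'${POOL}'" | sed -n 's/.*stripe_width \([0-9]*\).*/\1/p')
k=$(ceph osd erasure-code-profile get "${PROFILE}" | awk -F= '$1=="k"{print $2}')
stripe_unit=$(ceph osd erasure-code-profile get "${PROFILE}" | awk -F= '$1=="stripe_unit"{print $2}')
# If the profile does not set stripe_unit, assume the default of 4096 bytes
# (osd_pool_erasure_code_stripe_unit). A profile value with a unit suffix
# such as "4K" would need to be converted by hand.
stripe_unit=${stripe_unit:-4096}

if [ -z "${stripe_width}" ] || [ -z "${k}" ]; then
    echo "could not determine stripe_width or k - check pool/profile names" >&2
    exit 1
fi

expected=$(( k * stripe_unit ))
echo "pool=${POOL}: stripe_width=${stripe_width}, k=${k}, stripe_unit=${stripe_unit}, k*stripe_unit=${expected}"
if [ "${stripe_width}" -eq "${expected}" ]; then
    echo "stripe_width matches k * stripe_unit - looks consistent"
else
    echo "WARNING: stripe_width differs from k * stripe_unit - investigate before a full shutdown"
fi
----------------------------------------------------
For the pool quoted below, 4 * 4096 = 16384 matches the stripe_width shown in "osd dump", which is at least consistent with the defaults.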
On 2019-09-17 21:23, Oliver Freyermuth wrote:

Hi together,

it seems the issue described by Ansgar was reported and closed here as being fixed for newly created pools in post-Luminous releases:
https://tracker.ceph.com/issues/41336

However, it is unclear to me:
- How to find out whether an EC CephFS created in Luminous is actually affected, before actually testing the "shutdown all" procedure and thus ending up with dying OSDs.
- If affected, how to fix it without purging the pool completely (which is not easily done with 0.5 PB inside that can't be restored without a long downtime).

If this is an acknowledged issue, it should probably also be mentioned in the upgrade notes from pre-Mimic to Mimic and newer before more people lose data.

In our case, we have such a CephFS on an EC pool created with Luminous, and are right now running Mimic 13.2.6, but we never tried a "full shutdown". We need to try that on Friday, though... (cooling system maintenance).

"osd dump" contains:
----------------------------------------------------
pool 1 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 40903 flags hashpspool stripe_width 0 compression_algorithm snappy compression_mode aggressive application cephfs
pool 2 'cephfs_data' erasure size 6 min_size 5 crush_rule 2 object_hash rjenkins pg_num 4096 pgp_num 4096 last_change 40953 flags hashpspool,ec_overwrites,selfmanaged_snaps stripe_width 16384 compression_algorithm snappy compression_mode aggressive application cephfs
----------------------------------------------------
and the EC profile is:
----------------------------------------------------
# ceph osd erasure-code-profile get cephfs_data
crush-device-class=hdd
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
k=4
m=2
plugin=jerasure
technique=reed_sol_van
w=8
----------------------------------------------------

Neither contains the stripe_unit explicitly, so I wonder how to find out whether it is (in)valid. Checking the xattr ceph.file.layout.stripe_unit of some "old" files on the FS reveals 4194304 in my case.

Any help appreciated.

Cheers and all the best,
	Oliver

On 2019-08-09 08:54, Ansgar Jazdzewski wrote:

We got our OSDs back. Since we removed the EC pool (cephfs.data), we had to figure out how to remove its PGs from the offline OSDs, and here is how we did it.

Remove cephfs, remove cache layer, remove pools:

# ceph mds fail 0
# ceph fs rm cephfs --yes-i-really-mean-it
# ceph osd tier remove-overlay cephfs.data
there is now (or already was) no overlay for 'cephfs.data'
# ceph osd tier remove cephfs.data cephfs.cache
pool 'cephfs.cache' is now (or already was) not a tier of 'cephfs.data'
# ceph tell mon.\* injectargs '--mon-allow-pool-delete=true'
# ceph osd pool delete cephfs.cache cephfs.cache --yes-i-really-really-mean-it
pool 'cephfs.cache' removed
# ceph osd pool delete cephfs.data cephfs.data --yes-i-really-really-mean-it
pool 'cephfs.data' removed
# ceph osd pool delete cephfs.metadata cephfs.metadata --yes-i-really-really-mean-it
pool 'cephfs.metadata' removed

Remove the placement groups of pool 23 (cephfs.data) from all offline OSDs:

DATAPATH=/var/lib/ceph/osd/ceph-${OSD}
a=$(ceph-objectstore-tool --data-path ${DATAPATH} --op list-pgs | grep "^23\.")
for i in $a; do
    echo "INFO: removing ${i} from OSD ${OSD}"
    ceph-objectstore-tool --data-path ${DATAPATH} --pgid ${i} --op remove --force
done

Since we have now removed our cephfs, we still do not know whether we could have solved it without data loss by upgrading to Nautilus.

Have a nice Weekend,
Ansgar
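A small addition to the removal loop above, in case anyone else ends up in this situation: "--op remove --force" is destructive, so it may be worth exporting each affected PG shard to a file first, to have something that could in principle be brought back with "--op import" if the removal turns out to be premature. A minimal sketch under the same assumptions as Ansgar's loop (OSD stopped, ${OSD} set, pool id 23); the destination directory /root/pg-backups is made up and only needs enough free space:
----------------------------------------------------
# Optional safety net before the destructive removal: export the shards first.
# ${OSD} and pool id 23 as in the loop above; /root/pg-backups is an
# arbitrary (hypothetical) destination directory with enough free space.
DATAPATH=/var/lib/ceph/osd/ceph-${OSD}
mkdir -p /root/pg-backups
for i in $(ceph-objectstore-tool --data-path ${DATAPATH} --op list-pgs | grep "^23\."); do
    echo "INFO: exporting ${i} from OSD ${OSD}"
    ceph-objectstore-tool --data-path ${DATAPATH} --pgid ${i} \
        --op export --file /root/pg-backups/osd-${OSD}-${i}.export
done
----------------------------------------------------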
On 2019-08-07 17:03, Ansgar Jazdzewski <a.jazdzewski@xxxxxxxxxxxxxx> wrote:

another update: we now took the more destructive route and removed the cephfs pools (luckily we had only test data in the filesystem).

Our hope was that during the startup process the OSDs would delete the no longer needed PGs, but this is NOT the case. So we still have the same issue; the only difference is that the PG does not belong to a pool anymore.

-360> 2019-08-07 14:52:32.655 7fb14db8de00  5 osd.44 pg_epoch: 196586 pg[23.f8s0(unlocked)] enter Initial
-360> 2019-08-07 14:52:32.659 7fb14db8de00 -1 /build/ceph-13.2.6/src/osd/ECUtil.h: In function 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)' thread 7fb14db8de00 time 2019-08-07 14:52:32.660169
/build/ceph-13.2.6/src/osd/ECUtil.h: 34: FAILED assert(stripe_width % stripe_size == 0)

We can now take one route and try to delete the PG by hand in the OSD (bluestore) - how can this be done? OR we try to upgrade to Nautilus and hope for the best.

Any help/hints are welcome, have a nice one,
Ansgar

On 2019-08-07 11:32, Ansgar Jazdzewski <a.jazdzewski@xxxxxxxxxxxxxx> wrote:

Hi,

as a follow-up:
* a full log of one OSD failing to start: https://pastebin.com/T8UQ2rZ6
* our EC pool creation in the first place: https://pastebin.com/20cC06Jn
* ceph osd dump and ceph osd erasure-code-profile get cephfs: https://pastebin.com/TRLPaWcH

As we try to dig more into it, it looks like a bug in the cephfs or erasure-coding part of ceph.

Ansgar
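On the question above of how to remove the PG by hand from a stopped BlueStore OSD: the ceph-objectstore-tool loop from the 9 August mail (quoted further up in this thread) is the destructive answer; a harmless first step is to just list which shards of the affected pool are still present on a given stopped OSD. A short sketch; OSD id 44 is only taken from the log line above as an example:
----------------------------------------------------
# Read-only: with the OSD stopped, list the PG shards of pool 23 that are
# still present in its object store. Nothing is modified here.
OSD=44   # example id taken from the "osd.44" log line above
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${OSD} --op list-pgs | grep "^23\."
----------------------------------------------------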
On 2019-08-06 14:50, Ansgar Jazdzewski <a.jazdzewski@xxxxxxxxxxxxxx> wrote:

hi folks,

we had to move one of our clusters, so we had to reboot all servers. Now we found an error on all OSDs with the EC pool. Did we miss some options, and will an upgrade to 13.2.6 help?

Thanks,
Ansgar

2019-08-06 12:10:16.265 7fb337b83200 -1 /build/ceph-13.2.4/src/osd/ECUtil.h: In function 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)' thread 7fb337b83200 time 2019-08-06 12:10:16.263025
/build/ceph-13.2.4/src/osd/ECUtil.h: 34: FAILED assert(stripe_width % stripe_size == 0)

 ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7fb32eeb83c2]
 2: (()+0x2e5587) [0x7fb32eeb8587]
 3: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*, std::shared_ptr<ceph::ErasureCodeInterface>, unsigned long)+0x4de) [0xa4cbbe]
 4: (PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*)+0x2f9) [0x9474e9]
 5: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, spg_t)+0x138) [0x8f96e8]
 6: (OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t)+0x11d3) [0x753553]
 7: (OSD::load_pgs()+0x4a9) [0x758339]
 8: (OSD::init()+0xcd3) [0x7619c3]
 9: (main()+0x3678) [0x64d6a8]
 10: (__libc_start_main()+0xf0) [0x7fb32ca68830]
 11: (_start()+0x29) [0x717389]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com