Looks like that drive is dying.

On Wed, Apr 22, 2020 at 12:25 AM Andras Pataki <apataki@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi Brad,
>
> Indeed - osd.694 kept crashing with a read error (medium error on the
> hard drive), and got restarted by systemd. So net net the system ended
> up in an infinite loop of deep scrub attempts on the PG for a week.
> Typically when a scrub encounters a read error, I get an inconsistent
> placement group, not an OSD crash and an infinite loop of scrub
> attempts. With the inconsistent placement group a pg repair fixes the
> read error (by reallocating the sector inside the drive).
>
> Here is the stack trace of the osd.694 crash on the scrub read error:
>
> Apr 19 03:39:17 popeye-oss-3-03 kernel: sd 14:0:27:0: [sdz] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Apr 19 03:39:17 popeye-oss-3-03 kernel: sd 14:0:27:0: [sdz] tag#1 Sense Key : Medium Error [current] [descriptor]
> Apr 19 03:39:17 popeye-oss-3-03 kernel: sd 14:0:27:0: [sdz] tag#1 Add. Sense: Unrecovered read error
> Apr 19 03:39:17 popeye-oss-3-03 kernel: sd 14:0:27:0: [sdz] tag#1 CDB: Read(10) 28 00 6e a9 7a 30 00 00 80 00
> Apr 19 03:39:17 popeye-oss-3-03 kernel: print_req_error: critical medium error, dev sdz, sector 14852804992
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 2020-04-19 03:39:17.095 7fffd2e2b700 -1 bluestore(/var/lib/ceph/osd/ceph-694) _do_read bdev-read failed: (61) No data available
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.8/rpm/el7/BUILD/ceph-14.2.8/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::_do_read(BlueStore::Collection*, BlueStore::OnodeRef, uint64_t, size_t, ceph::bufferlist&, uint32_t, uint64_t)' thread 7fffd2e2b700 time 2020-04-19 03:39:17.099677
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.8/rpm/el7/BUILD/ceph-14.2.8/src/os/bluestore/BlueStore.cc: 9214: FAILED ceph_assert(r == 0)
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb) nautilus (stable)
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x555555a1ea4d]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 2: (()+0x4cac15) [0x555555a1ec15]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 3: (BlueStore::_do_read(BlueStore::Collection*, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, ceph::buffer::v14_2_0::list&, unsigned int, unsigned long)+0x3512) [0x555555f64132]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 4: (BlueStore::read(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::v14_2_0::list&, unsigned int)+0x1b8) [0x555555f64778]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 5: (ReplicatedBackend::be_deep_scrub(hobject_t const&, ScrubMap&, ScrubMapBuilder&, ScrubMap::object&)+0x2c2) [0x555555deb832]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 6: (PGBackend::be_scan_list(ScrubMap&, ScrubMapBuilder&)+0x663) [0x555555d082c3]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 7: (PG::build_scrub_map_chunk(ScrubMap&, ScrubMapBuilder&, hobject_t, hobject_t, bool, ThreadPool::TPHandle&)+0x8b) [0x555555bbaacb]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 8: (PG::chunky_scrub(ThreadPool::TPHandle&)+0x181c) [0x555555be4fcc]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 9: (PG::scrub(unsigned int, ThreadPool::TPHandle&)+0x4bb) [0x555555be61db]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 10: (PGScrub::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x12) [0x555555d8c7b2]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x90f) [0x555555b1898f]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b6) [0x5555560bd056]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 13: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5555560bfb70]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 14: (()+0x7e65) [0x7ffff5025e65]
> Apr 19 03:39:17 popeye-oss-3-03 ceph-osd: 15: (clone()+0x6d) [0x7ffff3ee988d]
>
> I ended up recreating the OSD (and thus overwriting all data) to fix the
> issue.
>
> Andras
>
>
> On 4/20/20 9:28 PM, Brad Hubbard wrote:
> > On Mon, Apr 20, 2020 at 11:01 PM Andras Pataki
> > <apataki@xxxxxxxxxxxxxxxxxxxxx> wrote:
> >> On a cluster running Nautilus (14.2.8), we are getting a complaint about
> >> a PG not being deep-scrubbed on time. Looking at the primary OSD's
> >> logs, it looks like it tries to deep-scrub the PG every hour or so,
> >> emits some complaints that I don't understand, but the deep scrub does
> >> not finish (either with or without a scrub error).
> >>
> >> Here is the PG from pg dump:
> >>
> >> 1.43f 31794 0 0 0 0 66930087214 0 0 3004 3004 active+clean+scrubbing+deep 2020-04-20 04:48:13.055481 46286'483734 46286:563439 [354,694,851] 354 [354,694,851] 354 39594'348643 2020-04-10 12:39:16.261088 38482'314644 2020-04-04 10:37:03.121638 0
> >>
> >> Here is a section of the primary OSD logs (osd.354):
> >>
> >> 2020-04-18 08:21:08.322 7fffd2e2b700 0 log_channel(cluster) log [DBG] : 1.43f deep-scrub starts
> > Scrubbing starts.
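
Coming back to the pg repair path Andras describes above: when the read error
surfaces as an inconsistent PG rather than an OSD crash, the usual sequence is
roughly the following (a sketch; 1.43f is the PG from this thread, and
<pool-name> is a placeholder for your own pool):

    # find the inconsistent PGs and the objects/shards with read errors
    ceph health detail
    rados list-inconsistent-pg <pool-name>
    rados list-inconsistent-obj 1.43f --format=json-pretty

    # rewrite the bad replica from a good copy; the rewrite gives the drive
    # a chance to reallocate the failing sector
    ceph pg repair 1.43f

    # confirm the PG comes back clean afterwards
    ceph pg deep-scrub 1.43f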
> >
> >> 2020-04-18 08:37:53.362 7fffd2e2b700 1 osd.354 pg_epoch: 45910 pg[1.43f( v 45909'449615 (45801'446525,45909'449615] local-lis/les=45908/45909 n=30862 ec=874/871 lis/c 45908/45908 les/c/f 45909/45909/0 45910/45910/42988) [354,851] r=0 lpr=45910 pi=[45908,45910)/1 luod=0'0 crt=45909'449615 lcod 45909'449614 mlcod 0'0 active+scrubbing+deep mbc={}] start_peering_interval up [354,694,851] -> [354,851], acting [354,694,851] -> [354,851], acting_primary 354 -> 354, up_primary 354 -> 354, role 0 -> 0, features acting 4611087854031667199 upacting 4611087854031667199
> >> 2020-04-18 08:37:53.362 7fffd2e2b700 1 osd.354 pg_epoch: 45910 pg[1.43f( v 45909'449615 (45801'446525,45909'449615] local-lis/les=45908/45909 n=30862 ec=874/871 lis/c 45908/45908 les/c/f 45909/45909/0 45910/45910/42988) [354,851] r=0 lpr=45910 pi=[45908,45910)/1 crt=45909'449615 lcod 45909'449614 mlcod 0'0 unknown mbc={}] state<Start>: transitioning to Primary
> >> 2020-04-18 08:38:01.002 7fffd2e2b700 1 osd.354 pg_epoch: 45912 pg[1.43f( v 45909'449615 (45801'446525,45909'449615] local-lis/les=45910/45911 n=30862 ec=874/871 lis/c 45910/45908 les/c/f 45911/45909/0 45912/45912/42988) [354,694,851] r=0 lpr=45912 pi=[45908,45912)/1 luod=0'0 crt=45909'449615 lcod 45909'449614 mlcod 0'0 active mbc={}] start_peering_interval up [354,851] -> [354,694,851], acting [354,851] -> [354,694,851], acting_primary 354 -> 354, up_primary 354 -> 354, role 0 -> 0, features acting 4611087854031667199 upacting 4611087854031667199
> > Epoch (and therefore map) change from 45910 to 45912. You lost osd 694
> > from the acting set, so peering has to restart and the scrub is
> > aborted.
> >
> >> 2020-04-18 08:38:01.002 7fffd2e2b700 1 osd.354 pg_epoch: 45912 pg[1.43f( v 45909'449615 (45801'446525,45909'449615] local-lis/les=45910/45911 n=30862 ec=874/871 lis/c 45910/45908 les/c/f 45911/45909/0 45912/45912/42988) [354,694,851] r=0 lpr=45912 pi=[45908,45912)/1 crt=45909'449615 lcod 45909'449614 mlcod 0'0 unknown mbc={}] state<Start>: transitioning to Primary
> >> 2020-04-18 08:40:04.219 7fffd2e2b700 0 log_channel(cluster) log [DBG] : 1.43f deep-scrub starts
> > Scrubbing starts again.
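
One way to confirm that these acting-set changes line up with osd.694 going
down and coming back is to check the cluster log and the crash module (a
sketch; the log path assumes a default package install on a mon host, and
assumes the 'ceph crash' mgr module is enabled on this 14.2.x cluster):

    # failure/boot events for the OSD as seen by the monitors
    grep 'osd.694' /var/log/ceph/ceph.log | grep -E 'failed|boot|marked'

    # daemon crashes recorded by the crash module
    ceph crash ls
    ceph crash info <crash-id>

    # current up/acting set of the PG
    ceph pg 1.43f query | grep -E '"(up|acting)"' -A 4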
> >
> >> 2020-04-18 08:56:49.095 7fffd2e2b700 1 osd.354 pg_epoch: 45914 pg[1.43f( v 45913'449735 (45812'446725,45913'449735] local-lis/les=45912/45913 n=30868 ec=874/871 lis/c 45912/45912 les/c/f 45913/45913/0 45914/45914/42988) [354,851] r=0 lpr=45914 pi=[45912,45914)/1 luod=0'0 crt=45913'449735 lcod 45913'449734 mlcod 0'0 active+scrubbing+deep mbc={}] start_peering_interval up [354,694,851] -> [354,851], acting [354,694,851] -> [354,851], acting_primary 354 -> 354, up_primary 354 -> 354, role 0 -> 0, features acting 4611087854031667199 upacting 4611087854031667199
> >> 2020-04-18 08:56:49.095 7fffd2e2b700 1 osd.354 pg_epoch: 45914 pg[1.43f( v 45913'449735 (45812'446725,45913'449735] local-lis/les=45912/45913 n=30868 ec=874/871 lis/c 45912/45912 les/c/f 45913/45913/0 45914/45914/42988) [354,851] r=0 lpr=45914 pi=[45912,45914)/1 crt=45913'449735 lcod 45913'449734 mlcod 0'0 unknown mbc={}] state<Start>: transitioning to Primary
> >> 2020-04-18 08:56:55.627 7fffd2e2b700 1 osd.354 pg_epoch: 45916 pg[1.43f( v 45913'449735 (45812'446725,45913'449735] local-lis/les=45914/45915 n=30868 ec=874/871 lis/c 45914/45912 les/c/f 45915/45913/0 45916/45916/42988) [354,694,851] r=0 lpr=45916 pi=[45912,45916)/1 luod=0'0 crt=45913'449735 lcod 45913'449734 mlcod 0'0 active mbc={}] start_peering_interval up [354,851] -> [354,694,851], acting [354,851] -> [354,694,851], acting_primary 354 -> 354, up_primary 354 -> 354, role 0 -> 0, features acting 4611087854031667199 upacting 4611087854031667199
> > Same again: you lost osd 694 from the acting set, and the epoch and map
> > change force another re-peering, which aborts the scrub.
> >
> > You need to identify why osd 694 is flapping, or at least appears to
> > be flapping to the monitor.
> >
> > Start by having a careful look at the logs for osd 694. If they
> > provide no insight, look at the logs on the active monitor for any
> > reference to osd 694 around the time the epoch changes.
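
In practice that usually means something like the following (a sketch; the
systemd unit and log file names assume a standard non-containerized package
install, and /dev/sdz is the device from the kernel messages above):

    # locate the host carrying osd.694
    ceph osd find 694

    # on that host: the OSD's own log and its restarts
    journalctl -u ceph-osd@694 --since "2020-04-18" | grep -iE 'assert|error|abort'
    less /var/log/ceph/ceph-osd.694.log

    # on the active monitor: any mention of osd.694 around the epoch changes
    grep 'osd.694' /var/log/ceph/ceph-mon.*.log

    # and the drive itself, since medium errors also show up in SMART
    smartctl -a /dev/sdz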
> >
> >> 2020-04-18 08:56:55.627 7fffd2e2b700 1 osd.354 pg_epoch: 45916 pg[1.43f( v 45913'449735 (45812'446725,45913'449735] local-lis/les=45914/45915 n=30868 ec=874/871 lis/c 45914/45912 les/c/f 45915/45913/0 45916/45916/42988) [354,694,851] r=0 lpr=45916 pi=[45912,45916)/1 crt=45913'449735 lcod 45913'449734 mlcod 0'0 unknown mbc={}] state<Start>: transitioning to Primary
> >> 2020-04-18 08:56:56.867 7fffd2e2b700 0 log_channel(cluster) log [DBG] : 1.43f deep-scrub starts
> >> 2020-04-18 09:13:37.419 7fffd2e2b700 1 osd.354 pg_epoch: 45918 pg[1.43f( v 45917'449808 (45812'446725,45917'449808] local-lis/les=45916/45917 n=30876 ec=874/871 lis/c 45916/45916 les/c/f 45917/45917/0 45918/45918/42988) [354,851] r=0 lpr=45918 pi=[45916,45918)/1 luod=0'0 crt=45917'449808 lcod 45917'449807 mlcod 0'0 active+scrubbing+deep mbc={}] start_peering_interval up [354,694,851] -> [354,851], acting [354,694,851] -> [354,851], acting_primary 354 -> 354, up_primary 354 -> 354, role 0 -> 0, features acting 4611087854031667199 upacting 4611087854031667199
> >> 2020-04-18 09:13:37.419 7fffd2e2b700 1 osd.354 pg_epoch: 45918 pg[1.43f( v 45917'449808 (45812'446725,45917'449808] local-lis/les=45916/45917 n=30876 ec=874/871 lis/c 45916/45916 les/c/f 45917/45917/0 45918/45918/42988) [354,851] r=0 lpr=45918 pi=[45916,45918)/1 crt=45917'449808 lcod 45917'449807 mlcod 0'0 unknown mbc={}] state<Start>: transitioning to Primary
> >> 2020-04-18 09:13:44.959 7fffd2e2b700 1 osd.354 pg_epoch: 45920 pg[1.43f( v 45917'449808 (45812'446725,45917'449808] local-lis/les=45918/45919 n=30876 ec=874/871 lis/c 45918/45916 les/c/f 45919/45917/0 45920/45920/42988) [354,694,851] r=0 lpr=45920 pi=[45916,45920)/1 luod=0'0 crt=45917'449808 lcod 45917'449807 mlcod 0'0 active mbc={}] start_peering_interval up [354,851] -> [354,694,851], acting [354,851] -> [354,694,851], acting_primary 354 -> 354, up_primary 354 -> 354, role 0 -> 0, features acting 4611087854031667199 upacting 4611087854031667199
> >> 2020-04-18 09:13:44.959 7fffd2e2b700 1 osd.354 pg_epoch: 45920 pg[1.43f( v 45917'449808 (45812'446725,45917'449808] local-lis/les=45918/45919 n=30876 ec=874/871 lis/c 45918/45916 les/c/f 45919/45917/0 45920/45920/42988) [354,694,851] r=0 lpr=45920 pi=[45916,45920)/1 crt=45917'449808 lcod 45917'449807 mlcod 0'0 unknown mbc={}] state<Start>: transitioning to Primary
> >>
> >> I've tried restarting all 3 OSDs in question, it didn't help. I'm
> >> running the upmap balancer on this cluster (if that matters). Any ideas?
> >>
> >> Andras
> >> _______________________________________________
> >> ceph-users mailing list -- ceph-users@xxxxxxx
> >> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
> >
>

--
Cheers,
Brad
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx