Happy new year to all!

Over the holidays I suffered a disk failure, but I also hit an 'inconsistent pg' error, and I would like to understand what happened. Ceph 12.2.12, filestore.

Starting from 27/12 I got the classic disk errors:

  Dec 27 20:52:21 capitanmarvel kernel: [345907.286795] ata1.00: exception Emask 0x0 SAct 0xfe00000 SErr 0x0 action 0x0
  Dec 27 20:52:21 capitanmarvel kernel: [345907.286849] ata1.00: irq_stat 0x40000008
  Dec 27 20:52:21 capitanmarvel kernel: [345907.286880] ata1.00: failed command: READ FPDMA QUEUED
  Dec 27 20:52:21 capitanmarvel kernel: [345907.286920] ata1.00: cmd 60/00:a8:20:87:3b/04:00:00:00:00/40 tag 21 ncq dma 524288 in
  Dec 27 20:52:21 capitanmarvel kernel: [345907.286920]          res 41/40:00:46:8a:3b/00:00:00:00:00/40 Emask 0x409 (media error) <F>
  Dec 27 20:52:21 capitanmarvel kernel: [345907.287018] ata1.00: status: { DRDY ERR }
  Dec 27 20:52:21 capitanmarvel kernel: [345907.287046] ata1.00: error: { UNC }
  Dec 27 20:52:21 capitanmarvel kernel: [345907.288676] ata1.00: configured for UDMA/133
  Dec 27 20:52:21 capitanmarvel kernel: [345907.288698] sd 1:0:0:0: [sdc] tag#21 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
  Dec 27 20:52:21 capitanmarvel kernel: [345907.288702] sd 1:0:0:0: [sdc] tag#21 Sense Key : Medium Error [current]
  Dec 27 20:52:21 capitanmarvel kernel: [345907.288705] sd 1:0:0:0: [sdc] tag#21 Add. Sense: Unrecovered read error - auto reallocate failed
  Dec 27 20:52:21 capitanmarvel kernel: [345907.288708] sd 1:0:0:0: [sdc] tag#21 CDB: Read(10) 28 00 00 3b 87 20 00 04 00 00
  Dec 27 20:52:21 capitanmarvel kernel: [345907.288711] print_req_error: I/O error, dev sdc, sector 3902022

but also:

  Dec 27 20:52:24 capitanmarvel ceph-osd[3852]: 2019-12-27 20:52:24.714716 7f821fbfd700 -1 log_channel(cluster) log [ERR] : 4.9b missing primary copy of 4:d97871c4:::rbd_data.142b816b8b4567.0000000000012ae1:head, will try copies on 8,14

The OSD 'flip-flopped' a bit for some days. At the first scrub I got:

  cluster:
    id:     8794c124-c2ec-4e81-8631-742992159bd6
    health: HEALTH_ERR
            1 scrub errors
            Possible data damage: 1 pg inconsistent

  services:
    mon: 5 daemons, quorum blackpanther,capitanmarvel,4,2,3
    mgr: hulk(active), standbys: blackpanther, deadpool, thor, capitanmarvel
    osd: 12 osds: 12 up, 12 in

  data:
    pools:   3 pools, 768 pgs
    objects: 671.04k objects, 2.54TiB
    usage:   7.62TiB used, 9.66TiB / 17.3TiB avail
    pgs:     766 active+clean
             1   active+clean+inconsistent
             1   active+clean+scrubbing+deep

Finally the OSD died, and so (after the automatic remapping) I got:

  cluster:
    id:     8794c124-c2ec-4e81-8631-742992159bd6
    health: HEALTH_ERR
            1 scrub errors
            Possible data damage: 1 pg inconsistent

  services:
    mon: 5 daemons, quorum blackpanther,capitanmarvel,4,2,3
    mgr: hulk(active), standbys: blackpanther, deadpool, thor, capitanmarvel
    osd: 12 osds: 11 up, 11 in

  data:
    pools:   3 pools, 768 pgs
    objects: 674.26k objects, 2.55TiB
    usage:   7.65TiB used, 8.71TiB / 16.4TiB avail
    pgs:     767 active+clean
             1   active+clean+inconsistent

To fix the issue I tried to read the docs (looking for 'OSD_SCRUB_ERRORS') and found:

  https://docs.ceph.com/docs/doc-12.2.0-major-changes/rados/operations/health-checks/

but the link it contains points to a missing page:

  https://docs.ceph.com/docs/doc-12.2.0-major-changes/rados/operations/pg-repair/

After fiddling a bit with Google, I found:

  https://ceph.io/geen-categorie/ceph-manually-repair-object/

which let me fix the issue easily with 'ceph pg repair'.
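For the archives, the repair boils down to roughly the following (a rough sketch, not a verbatim record of my session; '<pgid>' is just a placeholder for whatever PG 'ceph health detail' reports as inconsistent):

  ceph health detail                                        # lists the inconsistent PG(s)
  rados list-inconsistent-obj <pgid> --format=json-pretty   # shows which object/shard failed the scrub
  ceph pg repair <pgid>                                     # instruct the PG's primary OSD to repair it
  ceph pg deep-scrub <pgid>                                 # optionally re-verify afterwards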
Two questions:

1) Is the missing page on 'pg-repair' a documentation bug? Is there something I can do about it?

2) What exactly happened?

 - Why, if the OSD was not able to write the data, were the objects not automatically relocated to another OSD? Doesn't this violate the crushmap?

 - Why, when the failing OSD went out, was the inconsistent PG not automatically fixed? I have a replica count of 3; are the other two copies not coherent? But if so, how was Ceph able to fix them?

Sorry... and thanks. ;)

-- 
  dott. Marco Gaiarin                        GNUPG Key ID: 240A3D66
  Associazione ``La Nostra Famiglia''        http://www.lanostrafamiglia.it/
  Polo FVG - Via della Bontà, 7 - 33078 - San Vito al Tagliamento (PN)
  marco.gaiarin(at)lanostrafamiglia.it   t +39-0434-842711   f +39-0434-842797

  Donate your 5 PER MILLE to LA NOSTRA FAMIGLIA!
  http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000
  (tax code 00307430132, category ONLUS or RICERCA SANITARIA)