Hi Dan,

Just checked again: arggghhh...

# grep AUTO_RESTART /etc/sysconfig/ceph
CEPH_AUTO_RESTART_ON_UPGRADE=no

So no :'(
The RPMs were upgraded, but the OSDs were not restarted as I thought. Or at least they were not restarted with the new 12.2.7 binaries (the skip digest option was present in the running 12.2.6 OSDs, but I guess the 12.2.6 OSDs did not understand that option).

I just restarted all of the OSDs: I will check the behavior again and report here. Thanks for pointing me in the right direction!

Fred

-----Original Message-----
From: Dan van der Ster [mailto:dan@xxxxxxxxxxxxxx]
Sent: Tuesday, July 24, 2018 16:50
To: SCHAER Frederic <frederic.schaer@xxxxxx>
Cc: ceph-users <ceph-users@xxxxxxxx>
Subject: Re: 12.2.7 + osd skip data digest + bluestore + I/O errors

`ceph versions` -- you're sure all the osds are running 12.2.7 ?

osd_skip_data_digest = true is supposed to skip any crc checks during reads.
But maybe the cache tiering IO path is different and checks the crc anyway?

-- dan

On Tue, Jul 24, 2018 at 3:01 PM SCHAER Frederic <frederic.schaer@xxxxxx> wrote:
>
> Hi,
>
> I read the 12.2.7 upgrade notes, and set “osd skip data digest = true” before I started upgrading from 12.2.6 on my Bluestore-only cluster.
>
> As far as I can tell, my OSDs all got restarted during the upgrade and all got the option enabled:
>
> This is what I see for a specific OSD taken at random:
>
> # ceph --admin-daemon /var/run/ceph/ceph-osd.68.asok config show|grep data_digest
>     "osd_skip_data_digest": "true",
>
> This is what I see when I try to set the skip data digest option with injectargs:
>
> # ceph tell osd.* injectargs '--osd_skip_data_digest=true' 2>&1|head
> osd.0: osd_skip_data_digest = 'true' (not observed, change may require restart)
> osd.1: osd_skip_data_digest = 'true' (not observed, change may require restart)
> osd.2: osd_skip_data_digest = 'true' (not observed, change may require restart)
> osd.3: osd_skip_data_digest = 'true' (not observed, change may require restart)
> (…)
>
> This has been like that since I upgraded to 12.2.7.
> I read in the release notes that the skip_data_digest option should be sufficient to ignore the 12.2.6 corruptions and that objects should auto-heal on rewrite…
>
> However…
>
> My config:
> - Using tiering with an SSD hot storage tier
> - HDDs for cold storage
>
> And… I get I/O errors on some VMs when running some commands as simple as “yum check-update”.
>
> The qemu/kvm/libvirt logs show me these (in /var/log/libvirt/qemu):
>
> block I/O error in device 'drive-virtio-disk0': Input/output error (5)
>
> In the ceph logs, I can see these errors:
>
> 2018-07-24 11:17:56.420391 osd.71 [ERR] 1.23 copy from 1:c590b9d7:::rbd_data.1920e2238e1f29.00000000000000e7:head to 1:c590b9d7:::rbd_data.1920e2238e1f29.00000000000000e7:head data digest 0x3bb26e16 != source 0xec476c54
> 2018-07-24 11:17:56.429936 osd.71 [ERR] 1.23 copy from 1:c590b9d7:::rbd_data.1920e2238e1f29.00000000000000e7:head to 1:c590b9d7:::rbd_data.1920e2238e1f29.00000000000000e7:head data digest 0x3bb26e16 != source 0xec476c54
>
> (yes, my cluster is seen as healthy)
>
> On the affected OSDs, I can see these errors:
>
> 2018-07-24 11:17:56.420349 7f034642a700 -1 osd.71 pg_epoch: 182367 pg[1.23( v 182367'46340724 (182367'46339152,182367'46340724] local-lis/les=182298/182299 n=344 ec=2726/2726 lis/c 182298/182298 les/c/f 182299/182299/0 182298/182298/43896) [71,101,74] r=0 lpr=182298 crt=182367'46340724 lcod 182367'46340723 mlcod 182367'46340723 active+clean] process_copy_chunk data digest 0x3bb26e16 != source 0xec476c54
> 2018-07-24 11:17:56.420388 7f034642a700 -1 log_channel(cluster) log [ERR] : 1.23 copy from 1:c590b9d7:::rbd_data.1920e2238e1f29.00000000000000e7:head to 1:c590b9d7:::rbd_data.1920e2238e1f29.00000000000000e7:head data digest 0x3bb26e16 != source 0xec476c54
> 2018-07-24 11:17:56.420395 7f034642a700 -1 osd.71 pg_epoch: 182367 pg[1.23( v 182367'46340724 (182367'46339152,182367'46340724] local-lis/les=182298/182299 n=344 ec=2726/2726 lis/c 182298/182298 les/c/f 182299/182299/0 182298/182298/43896) [71,101,74] r=0 lpr=182298 crt=182367'46340724 lcod 182367'46340723 mlcod 182367'46340723 active+clean] finish_promote unexpected promote error (5) Input/output error
> 2018-07-24 11:17:56.429900 7f034642a700 -1 osd.71 pg_epoch: 182367 pg[1.23( v 182367'46340724 (182367'46339152,182367'46340724] local-lis/les=182298/182299 n=344 ec=2726/2726 lis/c 182298/182298 les/c/f 182299/182299/0 182298/182298/43896) [71,101,74] r=0 lpr=182298 crt=182367'46340724 lcod 182367'46340723 mlcod 182367'46340723 active+clean] process_copy_chunk data digest 0x3bb26e16 != source 0xec476c54
> 2018-07-24 11:17:56.429934 7f034642a700 -1 log_channel(cluster) log [ERR] : 1.23 copy from 1:c590b9d7:::rbd_data.1920e2238e1f29.00000000000000e7:head to 1:c590b9d7:::rbd_data.1920e2238e1f29.00000000000000e7:head data digest 0x3bb26e16 != source 0xec476c54
> 2018-07-24 11:17:56.429939 7f034642a700 -1 osd.71 pg_epoch: 182367 pg[1.23( v 182367'46340724 (182367'46339152,182367'46340724] local-lis/les=182298/182299 n=344 ec=2726/2726 lis/c 182298/182298 les/c/f 182299/182299/0 182298/182298/43896) [71,101,74] r=0 lpr=182298 crt=182367'46340724 lcod 182367'46340723 mlcod 182367'46340723 active+clean] finish_promote unexpected promote error (5) Input/output error
>
> And… I don’t know how to recover from that.
> Pool #1 is my SSD cache tier, hence pg 1.23 is on the SSD side.
>
> I’ve tried setting the cache pool to “readforward” despite the “not well supported” warning and could immediately get back working VMs (no more I/O errors).
> But with no SSD tiering: not really useful.
>
> As soon as I tried setting the cache tier to writeback again, I got those I/O errors again… (not on the yum command, but in the meantime I’ve stopped osd.71, set it out, then unset out, to check it with badblocks just in case…)
> I still have to find how to reproduce the I/O error on an affected host to further try to debug/fix that issue…
>
> Any ideas?
>
> Thanks && regards
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
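
Dan's suggestion to confirm with `ceph versions` that every OSD really runs 12.2.7 could be scripted roughly as below. This is only a sketch: it assumes `ceph versions` prints JSON with an "osd" section mapping version strings to daemon counts, which the thread itself does not show. The function name `osds_not_on` and the sample data (including the build-id placeholders) are hypothetical and are not output from the cluster discussed here.

```python
import json


def osds_not_on(versions_json: str, expected: str) -> dict:
    """Return {version_string: count} for OSD entries not reporting `expected`.

    Assumes the JSON layout {"osd": {"ceph version X.Y.Z (...) ...": count}}.
    """
    report = json.loads(versions_json)
    osd_versions = report.get("osd", {})
    # The trailing space keeps "12.2.7" from also matching e.g. "12.2.70".
    marker = f"ceph version {expected} "
    return {v: n for v, n in osd_versions.items() if marker not in v}


# Illustrative sample only (hypothetical), shaped like the assumed output.
SAMPLE = json.dumps({
    "osd": {
        "ceph version 12.2.6 (hypothetical-build-id) luminous (stable)": 4,
        "ceph version 12.2.7 (hypothetical-build-id) luminous (stable)": 2,
    }
})

if __name__ == "__main__":
    for version, count in osds_not_on(SAMPLE, "12.2.7").items():
        print(f"{count} OSD(s) still running: {version}")
```

An empty result would suggest that all running OSD daemons report the expected version. A non-empty one would match the situation Fred found, where the packages were upgraded but the daemons were never restarted.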