fast luminous -> nautilus -> octopus upgrade could lead to assertion failure on OSD


 



hi folks,

if you are upgrading from luminous to octopus, or you plan to do so,
please read on.

in octopus, an OSD will crash with an assertion failure if it processes
an osdmap whose require_osd_release flag is still luminous.
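
a quick way to see which release the current osdmap requires is to grep
the flag out of the osdmap dump:

ceph osd dump | grep require_osd_release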

this only happens if a cluster upgrades very quickly from luminous to
nautilus and then to octopus. in that case, there is a good chance that
an octopus OSD will need to consume osdmaps which were created back in
luminous. because we assumed that a cluster would not be upgraded across
multiple major releases in quick succession, an octopus OSD panics when
it sees an osdmap from luminous. this is a known bug[0], already fixed
in master, and the next octopus release will include the fix. as a
workaround, you need to wait a while after running

ceph osd require-osd-release nautilus

and optionally inject lots of osdmaps into the cluster to make sure the
old luminous osdmaps are trimmed:

# every blacklist add/rm commits a new osdmap epoch; generating enough
# new epochs lets the monitors trim the old luminous-era maps
# (192.168.0.1 is just a throwaway address)
for i in $(seq 500); do
  ceph osd blacklist add 192.168.0.1
  ceph osd blacklist rm 192.168.0.1
done
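
one rough way to see how far trimming has progressed is to ask an OSD
for the oldest and newest osdmap epochs it still holds, e.g. on a node
hosting osd.0:

ceph daemon osd.0 status

the oldest_map epoch should keep advancing as new maps are injected and
the old ones get trimmed.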

once all PGs in the cluster are active+clean, go ahead and upgrade to
octopus.
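
before pulling the trigger on octopus, a quick pre-flight check could be
something like

ceph osd dump | grep require_osd_release
ceph -s

the first should report nautilus by now, and the second should show all
PGs active+clean.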

happy upgrading!


cheers,

--
[0] https://tracker.ceph.com/issues/44759

-- 
Regards
Kefu Chai


