On Fri, May 14, 2021 at 10:47 AM Andrius Jurkus <andrius.jurkus@xxxxxxxxxx> wrote:
>
> Hello, I will try to keep it sad and short :) :( PS: sorry if this is a
> duplicate; I also tried to post it from the web.
>
> Today I upgraded from 16.2.3 to 16.2.4 and added a few hosts and OSDs.
> After a few hours of data migration, one SSD failed, then another and
> another, one by one. Now I have the cluster in pause and 5 failed SSDs.
> The same host has SSDs and HDDs, but only the SSDs are failing, so I
> think this has to be a balancing/refilling bug or something similar, and
> probably not an upgrade bug.
>
> The cluster has been in pause for 4 hours and no more OSDs are failing.
>
> Full trace:
> https://pastebin.com/UxbfFYpb

This looks very similar to https://tracker.ceph.com/issues/50656.
Adding Igor for more ideas.

Neha

>
> Now I'm googling and learning, but is there a way to easily test, let's
> say, a 15.2.XX version on an OSD without losing anything?
>
> Any help would be appreciated.
>
> The error starts like this:
>
> May 14 16:58:52 dragon-ball-radar systemd[1]: Starting Ceph osd.2 for 4e01640b-951b-4f75-8dca-0bad4faf1b11...
> May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14 16:58:53.057836433 +0000 UTC m=+0.454352919 container create 3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949, name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate, GIT_BRANCH=HEAD, maintainer=D
> May 14 16:58:53 dragon-ball-radar systemd[1]: Started libcrun container.
> May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14 16:58:53.3394116 +0000 UTC m=+0.735928098 container init 3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949, name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate, maintainer=Dimitri Savineau <dsav
> May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14 16:58:53.446921192 +0000 UTC m=+0.843437626 container start 3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949, name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate, GIT_BRANCH=HEAD, org.label-sch
> May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14 16:58:53.447050119 +0000 UTC m=+0.843566553 container attach 3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949, name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate, org.label-schema.name=CentOS
> May 14 16:58:53 dragon-ball-radar bash[113558]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-2
> May 14 16:58:53 dragon-ball-radar bash[113558]: Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-45e6ef2e-fbdc-4289-a900-3d1ffc81ee14/osd-block-973cfe73-06c8-4ea0-9aea-1361d063eb25 --path /var/lib/ceph/osd/ceph-2 --no-mon-config
> May 14 16:58:53 dragon-ball-radar bash[113558]: Running command: /usr/bin/ln -snf /dev/ceph-45e6ef2e-fbdc-4289-a900-3d1ffc81ee14/osd-block-973cfe73-06c8-4ea0-9aea-1361d063eb25 /var/lib/ceph/osd/ceph-2/block
> May 14 16:58:53 dragon-ball-radar bash[113558]: Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-2/block
> May 14 16:58:53 dragon-ball-radar bash[113558]: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-1
> May 14 16:58:53 dragon-ball-radar bash[113558]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-2
> May 14 16:58:53 dragon-ball-radar bash[113558]: --> ceph-volume lvm activate successful for osd ID: 2
> May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14 16:58:53.8147653 +0000 UTC m=+1.211281741 container died 3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949, name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate)
> May 14 16:58:55 dragon-ball-radar podman[113650]: 2021-05-14 16:58:55.044964534 +0000 UTC m=+2.441480996 container remove 3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949, name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate, CEPH_POINT_RELEASE=-16.2.4, R
> May 14 16:58:55 dragon-ball-radar podman[113909]: 2021-05-14 16:58:55.594265612 +0000 UTC m=+0.369978347 container create 31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949, name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2, RELEASE=HEAD, org.label-schema.build-d
> May 14 16:58:55 dragon-ball-radar podman[113909]: 2021-05-14 16:58:55.864589286 +0000 UTC m=+0.640302021 container init 31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949, name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2, org.label-schema.schema-version=1.0, GIT
> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:55.896+0000 7fcf16aa2080 0 set uid:gid to 167:167 (ceph:ceph)
> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:55.896+0000 7fcf16aa2080 0 ceph version 16.2.4 (3cbe25cde3cfa028984618ad32de9edc4c1eaed0) pacific (stable), process ceph-osd, pid 2
> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:55.896+0000 7fcf16aa2080 0 pidfile_write: ignore empty --pid-file
> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:55.896+0000 7fcf16aa2080 1 bdev(0x564ad3a8c800 /var/lib/ceph/osd/ceph-2/block) open path /var/lib/ceph/osd/ceph-2/block
> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bdev(0x564ad3a8c800 /var/lib/ceph/osd/ceph-2/block) open size 500103643136 (0x7470800000, 466 GiB) block_size 4096 (4 KiB) non-rotational discard supported
> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bluestore(/var/lib/ceph/osd/ceph-2) _set_cache_sizes cache_size 3221225472 meta 0.45 kv 0.45 data 0.06
> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bdev(0x564ad3a8cc00 /var/lib/ceph/osd/ceph-2/block) open path /var/lib/ceph/osd/ceph-2/block
> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bdev(0x564ad3a8cc00 /var/lib/ceph/osd/ceph-2/block) open size 500103643136 (0x7470800000, 466 GiB) block_size 4096 (4 KiB) non-rotational discard supported
> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-2/block size 466 GiB
> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bdev(0x564ad3a8cc00 /var/lib/ceph/osd/ceph-2/block) close
> May 14 16:58:55 dragon-ball-radar podman[113909]: 2021-05-14 16:58:55.972267166 +0000 UTC m=+0.747979911 container start 31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949, name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2, ceph=True, GIT_REPO=https://github.com/
> May 14 16:58:55 dragon-ball-radar bash[113558]: 31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da
> May 14 16:58:55 dragon-ball-radar systemd[1]: Started Ceph osd.2 for 4e01640b-951b-4f75-8dca-0bad4faf1b11.
> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:56.184+0000 7fcf16aa2080 1 bdev(0x564ad3a8c800 /var/lib/ceph/osd/ceph-2/block) close
> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:56.444+0000 7fcf16aa2080 1 objectstore numa_node 0
> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:56.444+0000 7fcf16aa2080 0 starting osd.2 osd_data /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:56.444+0000 7fcf16aa2080 -1 unable to find any IPv4 address in networks '10.0.199.0/24' interfaces ''
> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:56.444+0000 7fcf16aa2080 -1 unable to find any IPv4 address in networks '172.16.199.0/24' interfaces ''
> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:56.452+0000 7fcf16aa2080 0 load: jerasure load: lrc load: isa
> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:56.456+0000 7fcf16aa2080 1 bdev(0x564ad476e400 /var/lib/ceph/osd/ceph-2/block) open path /var/lib/ceph/osd/ceph-2/block
> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:56.456+0000 7fcf16aa2080 1 bdev(0x564ad476e400 /var/lib/ceph/osd/ceph-2/block) open size 500103643136 (0x7470800000, 466 GiB) block_size 4096 (4 KiB) non-rotational discard supported
> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:56.456+0000 7fcf16aa2080 1 bluestore(/var/lib/ceph/osd/ceph-2) _set_cache_sizes cache_size 3221225472 meta 0.45 kv 0.45 data 0.06
> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug 2021-05-14T16:58:56.456+0000 7fcf16aa2080 1 bdev(0x564ad476e400 /var/lib/ceph/osd/ceph-2/block) close
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx