Re: After upgrade from 16.2.3 to 16.2.4 and adding a few HDDs, OSDs started to fail one by one

You are welcome! We still need to get to the bottom of this, I will
update the tracker to make a note of this occurrence.

Thanks,
Neha

On Fri, May 14, 2021 at 12:25 PM Andrius Jurkus
<andrius.jurkus@xxxxxxxxxx> wrote:
>
> Many thanks, the help is much appreciated.
>
> It is probably the same bug.
>
> bluestore_allocator = bitmap
>
> After setting this parameter, all of the failed OSDs started.
>
> Thanks again!
>
> On 2021-05-14 21:09, Neha Ojha wrote:
> > On Fri, May 14, 2021 at 10:47 AM Andrius Jurkus
> > <andrius.jurkus@xxxxxxxxxx> wrote:
> >>
> >> Hello, I will try to keep it sad and short :) :(    PS: sorry if this
> >> is a duplicate; I also tried to post it from the web.
> >>
> >> Today I upgraded from 16.2.3 to 16.2.4 and added a few hosts and OSDs.
> >> After a few hours of data migration, one SSD failed, then another and
> >> another, one by one. Now I have the cluster in pause and 5 failed SSDs.
> >> The same host has both SSDs and HDDs, but only the SSDs are failing, so
> >> I think this has to be a balancing or refilling bug, or something
> >> similar, and probably not an upgrade bug.
> >>
> >> The cluster has been in pause for 4 hours and no more OSDs are failing.
> >>
> >> Full trace:
> >> https://pastebin.com/UxbfFYpb
> >
> > This looks very similar to https://tracker.ceph.com/issues/50656.
> > Adding Igor for more ideas.
> >
> > Neha
> >
> >>
> >> I am now googling and learning, but is there an easy way to test,
> >> let's say, a 15.2.XX version on an OSD without losing anything?
> >>
> >> Any help would be appreciated.
> >>
> >> The errors start like this:
> >>
> >> May 14 16:58:52 dragon-ball-radar systemd[1]: Starting Ceph osd.2 for
> >> 4e01640b-951b-4f75-8dca-0bad4faf1b11...
> >> May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14
> >> 16:58:53.057836433 +0000 UTC m=+0.454352919 container create
> >> 3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
> >> (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
> >> name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate,
> >> GIT_BRANCH=HEAD, maintainer=D
> >> May 14 16:58:53 dragon-ball-radar systemd[1]: Started libcrun
> >> container.
> >> May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14
> >> 16:58:53.3394116 +0000 UTC m=+0.735928098 container init
> >> 3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
> >> (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
> >> name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate,
> >> maintainer=Dimitri Savineau <dsav
> >> May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14
> >> 16:58:53.446921192 +0000 UTC m=+0.843437626 container start
> >> 3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
> >> (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
> >> name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate,
> >> GIT_BRANCH=HEAD, org.label-sch
> >> May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14
> >> 16:58:53.447050119 +0000 UTC m=+0.843566553 container attach
> >> 3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
> >> (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
> >> name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate,
> >> org.label-schema.name=CentOS
> >> May 14 16:58:53 dragon-ball-radar bash[113558]: Running command:
> >> /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-2
> >> May 14 16:58:53 dragon-ball-radar bash[113558]: Running command:
> >> /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev
> >> /dev/ceph-45e6ef2e-fbdc-4289-a900-3d1ffc81ee14/osd-block-973cfe73-06c8-4ea0-9aea-1361d063eb25
> >> --path /var/lib/ceph/osd/ceph-2 --no-mon-config
> >> May 14 16:58:53 dragon-ball-radar bash[113558]: Running command:
> >> /usr/bin/ln -snf
> >> /dev/ceph-45e6ef2e-fbdc-4289-a900-3d1ffc81ee14/osd-block-973cfe73-06c8-4ea0-9aea-1361d063eb25
> >> /var/lib/ceph/osd/ceph-2/block
> >> May 14 16:58:53 dragon-ball-radar bash[113558]: Running command:
> >> /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-2/block
> >> May 14 16:58:53 dragon-ball-radar bash[113558]: Running command:
> >> /usr/bin/chown -R ceph:ceph /dev/dm-1
> >> May 14 16:58:53 dragon-ball-radar bash[113558]: Running command:
> >> /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-2
> >> May 14 16:58:53 dragon-ball-radar bash[113558]: --> ceph-volume lvm
> >> activate successful for osd ID: 2
> >> May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14
> >> 16:58:53.8147653 +0000 UTC m=+1.211281741 container died
> >> 3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
> >> (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
> >> name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate)
> >> May 14 16:58:55 dragon-ball-radar podman[113650]: 2021-05-14
> >> 16:58:55.044964534 +0000 UTC m=+2.441480996 container remove
> >> 3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
> >> (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
> >> name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate,
> >> CEPH_POINT_RELEASE=-16.2.4, R
> >> May 14 16:58:55 dragon-ball-radar podman[113909]: 2021-05-14
> >> 16:58:55.594265612 +0000 UTC m=+0.369978347 container create
> >> 31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da
> >> (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
> >> name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2, RELEASE=HEAD,
> >> org.label-schema.build-d
> >> May 14 16:58:55 dragon-ball-radar podman[113909]: 2021-05-14
> >> 16:58:55.864589286 +0000 UTC m=+0.640302021 container init
> >> 31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da
> >> (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
> >> name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2,
> >> org.label-schema.schema-version=1.0, GIT
> >> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:55.896+0000 7fcf16aa2080 0 set uid:gid to 167:167
> >> (ceph:ceph)
> >> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:55.896+0000 7fcf16aa2080 0 ceph version 16.2.4
> >> (3cbe25cde3cfa028984618ad32de9edc4c1eaed0) pacific (stable), process
> >> ceph-osd, pid 2
> >> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:55.896+0000 7fcf16aa2080 0 pidfile_write: ignore
> >> empty
> >> --pid-file
> >> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:55.896+0000 7fcf16aa2080 1 bdev(0x564ad3a8c800
> >> /var/lib/ceph/osd/ceph-2/block) open path
> >> /var/lib/ceph/osd/ceph-2/block
> >> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bdev(0x564ad3a8c800
> >> /var/lib/ceph/osd/ceph-2/block) open size 500103643136 (0x7470800000,
> >> 466 GiB) block_size 4096 (4 KiB) non-rotational discard supported
> >> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:55.900+0000 7fcf16aa2080 1
> >> bluestore(/var/lib/ceph/osd/ceph-2) _set_cache_sizes cache_size
> >> 3221225472 meta 0.45 kv 0.45 data 0.06
> >> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bdev(0x564ad3a8cc00
> >> /var/lib/ceph/osd/ceph-2/block) open path
> >> /var/lib/ceph/osd/ceph-2/block
> >> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bdev(0x564ad3a8cc00
> >> /var/lib/ceph/osd/ceph-2/block) open size 500103643136 (0x7470800000,
> >> 466 GiB) block_size 4096 (4 KiB) non-rotational discard supported
> >> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bluefs add_block_device
> >> bdev
> >> 1 path /var/lib/ceph/osd/ceph-2/block size 466 GiB
> >> May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bdev(0x564ad3a8cc00
> >> /var/lib/ceph/osd/ceph-2/block) close
> >> May 14 16:58:55 dragon-ball-radar podman[113909]: 2021-05-14
> >> 16:58:55.972267166 +0000 UTC m=+0.747979911 container start
> >> 31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da
> >> (image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
> >> name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2, ceph=True,
> >> GIT_REPO=https://github.com/
> >> May 14 16:58:55 dragon-ball-radar bash[113558]:
> >> 31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da
> >> May 14 16:58:55 dragon-ball-radar systemd[1]: Started Ceph osd.2 for
> >> 4e01640b-951b-4f75-8dca-0bad4faf1b11.
> >> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:56.184+0000 7fcf16aa2080 1 bdev(0x564ad3a8c800
> >> /var/lib/ceph/osd/ceph-2/block) close
> >> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:56.444+0000 7fcf16aa2080 1 objectstore numa_node 0
> >> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:56.444+0000 7fcf16aa2080 0 starting osd.2 osd_data
> >> /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
> >> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:56.444+0000 7fcf16aa2080 -1 unable to find any IPv4
> >> address in networks '10.0.199.0/24' interfaces ''
> >> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:56.444+0000 7fcf16aa2080 -1 unable to find any IPv4
> >> address in networks '172.16.199.0/24' interfaces ''
> >> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:56.452+0000 7fcf16aa2080 0 load: jerasure load: lrc
> >> load: isa
> >> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:56.456+0000 7fcf16aa2080 1 bdev(0x564ad476e400
> >> /var/lib/ceph/osd/ceph-2/block) open path
> >> /var/lib/ceph/osd/ceph-2/block
> >> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:56.456+0000 7fcf16aa2080 1 bdev(0x564ad476e400
> >> /var/lib/ceph/osd/ceph-2/block) open size 500103643136 (0x7470800000,
> >> 466 GiB) block_size 4096 (4 KiB) non-rotational discard supported
> >> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:56.456+0000 7fcf16aa2080 1
> >> bluestore(/var/lib/ceph/osd/ceph-2) _set_cache_sizes cache_size
> >> 3221225472 meta 0.45 kv 0.45 data 0.06
> >> May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
> >> 2021-05-14T16:58:56.456+0000 7fcf16aa2080 1 bdev(0x564ad476e400
> >> /var/lib/ceph/osd/ceph-2/block) close
> >> _______________________________________________
> >> ceph-users mailing list -- ceph-users@xxxxxxx
> >> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >>
>
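
For anyone applying the workaround quoted above, here is a minimal sketch of how the setting might look in ceph.conf. Placing it under the [osd] section is an assumption (the thread only gives the option name and value). The allocator is chosen when an OSD starts, so the affected OSDs would presumably need a restart to pick it up.

```ini
[osd]
# Workaround reported in this thread for the failing OSDs.
# The [osd] section placement is an assumption, not something the posters stated.
bluestore_allocator = bitmap
```

On a containerized deployment like the one in the logs, the same option can probably be set through the central config store instead, for example with `ceph config set osd bluestore_allocator bitmap`.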