Re: after upgrade to 16.2.3 16.2.4 and after adding few hdd's OSD's started to fail 1 by 1.

This looks similar to #50656 indeed.

Hopefully will fix that next week.


Thanks,

Igor

On 5/14/2021 9:09 PM, Neha Ojha wrote:
On Fri, May 14, 2021 at 10:47 AM Andrius Jurkus
<andrius.jurkus@xxxxxxxxxx> wrote:
Hello, I will try to keep it sad and short :) :(  PS: sorry if this is a
duplicate; I also tried to post it from the web.

Today I upgraded from 16.2.3 to 16.2.4 and added a few hosts and OSDs.
After a few hours of data migration, one SSD failed, then another and
another, one by one. Now I have the cluster in pause with 5 failed SSDs.
The same host has both SSDs and HDDs, but only the SSDs are failing, so I
think this must be a bug in balancing, refilling or something similar,
and probably not an upgrade bug.

The cluster has been in pause for 4 hours and no more OSDs are failing.

Full trace:
https://pastebin.com/UxbfFYpb

This looks very similar to https://tracker.ceph.com/issues/50656.
Adding Igor for more ideas.

Neha

Now I'm googling and learning, but is there an easy way to test, let's
say, a 15.2.XX version on an OSD without losing anything?

Any help would be appreciated.

The errors start like this:

May 14 16:58:52 dragon-ball-radar systemd[1]: Starting Ceph osd.2 for
4e01640b-951b-4f75-8dca-0bad4faf1b11...
May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14
16:58:53.057836433 +0000 UTC m=+0.454352919 container create
3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate,
GIT_BRANCH=HEAD, maintainer=D
May 14 16:58:53 dragon-ball-radar systemd[1]: Started libcrun container.
May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14
16:58:53.3394116 +0000 UTC m=+0.735928098 container init
3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate,
maintainer=Dimitri Savineau <dsav
May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14
16:58:53.446921192 +0000 UTC m=+0.843437626 container start
3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate,
GIT_BRANCH=HEAD, org.label-sch
May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14
16:58:53.447050119 +0000 UTC m=+0.843566553 container attach
3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate,
org.label-schema.name=CentOS
May 14 16:58:53 dragon-ball-radar bash[113558]: Running command:
/usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-2
May 14 16:58:53 dragon-ball-radar bash[113558]: Running command:
/usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev
/dev/ceph-45e6ef2e-fbdc-4289-a900-3d1ffc81ee14/osd-block-973cfe73-06c8-4ea0-9aea-1361d063eb25
--path /var/lib/ceph/osd/ceph-2 --no-mon-config
May 14 16:58:53 dragon-ball-radar bash[113558]: Running command:
/usr/bin/ln -snf
/dev/ceph-45e6ef2e-fbdc-4289-a900-3d1ffc81ee14/osd-block-973cfe73-06c8-4ea0-9aea-1361d063eb25
/var/lib/ceph/osd/ceph-2/block
May 14 16:58:53 dragon-ball-radar bash[113558]: Running command:
/usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-2/block
May 14 16:58:53 dragon-ball-radar bash[113558]: Running command:
/usr/bin/chown -R ceph:ceph /dev/dm-1
May 14 16:58:53 dragon-ball-radar bash[113558]: Running command:
/usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-2
May 14 16:58:53 dragon-ball-radar bash[113558]: --> ceph-volume lvm
activate successful for osd ID: 2
May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14
16:58:53.8147653 +0000 UTC m=+1.211281741 container died
3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate)
May 14 16:58:55 dragon-ball-radar podman[113650]: 2021-05-14
16:58:55.044964534 +0000 UTC m=+2.441480996 container remove
3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate,
CEPH_POINT_RELEASE=-16.2.4, R
May 14 16:58:55 dragon-ball-radar podman[113909]: 2021-05-14
16:58:55.594265612 +0000 UTC m=+0.369978347 container create
31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2, RELEASE=HEAD,
org.label-schema.build-d
May 14 16:58:55 dragon-ball-radar podman[113909]: 2021-05-14
16:58:55.864589286 +0000 UTC m=+0.640302021 container init
31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2,
org.label-schema.schema-version=1.0, GIT
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.896+0000 7fcf16aa2080 0 set uid:gid to 167:167
(ceph:ceph)
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.896+0000 7fcf16aa2080 0 ceph version 16.2.4
(3cbe25cde3cfa028984618ad32de9edc4c1eaed0) pacific (stable), process
ceph-osd, pid 2
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.896+0000 7fcf16aa2080 0 pidfile_write: ignore empty
--pid-file
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.896+0000 7fcf16aa2080 1 bdev(0x564ad3a8c800
/var/lib/ceph/osd/ceph-2/block) open path /var/lib/ceph/osd/ceph-2/block
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bdev(0x564ad3a8c800
/var/lib/ceph/osd/ceph-2/block) open size 500103643136 (0x7470800000,
466 GiB) block_size 4096 (4 KiB) non-rotational discard supported
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.900+0000 7fcf16aa2080 1
bluestore(/var/lib/ceph/osd/ceph-2) _set_cache_sizes cache_size
3221225472 meta 0.45 kv 0.45 data 0.06
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bdev(0x564ad3a8cc00
/var/lib/ceph/osd/ceph-2/block) open path /var/lib/ceph/osd/ceph-2/block
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bdev(0x564ad3a8cc00
/var/lib/ceph/osd/ceph-2/block) open size 500103643136 (0x7470800000,
466 GiB) block_size 4096 (4 KiB) non-rotational discard supported
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bluefs add_block_device bdev
1 path /var/lib/ceph/osd/ceph-2/block size 466 GiB
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.900+0000 7fcf16aa2080 1 bdev(0x564ad3a8cc00
/var/lib/ceph/osd/ceph-2/block) close
May 14 16:58:55 dragon-ball-radar podman[113909]: 2021-05-14
16:58:55.972267166 +0000 UTC m=+0.747979911 container start
31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2, ceph=True,
GIT_REPO=https://github.com/
May 14 16:58:55 dragon-ball-radar bash[113558]:
31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da
May 14 16:58:55 dragon-ball-radar systemd[1]: Started Ceph osd.2 for
4e01640b-951b-4f75-8dca-0bad4faf1b11.
May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:56.184+0000 7fcf16aa2080 1 bdev(0x564ad3a8c800
/var/lib/ceph/osd/ceph-2/block) close
May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:56.444+0000 7fcf16aa2080 1 objectstore numa_node 0
May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:56.444+0000 7fcf16aa2080 0 starting osd.2 osd_data
/var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:56.444+0000 7fcf16aa2080 -1 unable to find any IPv4
address in networks '10.0.199.0/24' interfaces ''
May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:56.444+0000 7fcf16aa2080 -1 unable to find any IPv4
address in networks '172.16.199.0/24' interfaces ''
May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:56.452+0000 7fcf16aa2080 0 load: jerasure load: lrc
load: isa
May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:56.456+0000 7fcf16aa2080 1 bdev(0x564ad476e400
/var/lib/ceph/osd/ceph-2/block) open path /var/lib/ceph/osd/ceph-2/block
May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:56.456+0000 7fcf16aa2080 1 bdev(0x564ad476e400
/var/lib/ceph/osd/ceph-2/block) open size 500103643136 (0x7470800000,
466 GiB) block_size 4096 (4 KiB) non-rotational discard supported
May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:56.456+0000 7fcf16aa2080 1
bluestore(/var/lib/ceph/osd/ceph-2) _set_cache_sizes cache_size
3221225472 meta 0.45 kv 0.45 data 0.06
May 14 16:58:56 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:56.456+0000 7fcf16aa2080 1 bdev(0x564ad476e400
/var/lib/ceph/osd/ceph-2/block) close
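
For anyone reading a log like the one above, a minimal Python sketch for pulling out the entries logged at a negative level may help. It assumes the wrapped e-mail lines have first been joined back into one line per entry, and that each ceph-osd entry has the layout "debug <timestamp> <thread id> <level> <message>" seen in this excerpt. As far as I know, -1 is the level Ceph normally uses for error messages. The names CEPH_LINE and error_entries are made up for this illustration. In this excerpt the only such entries are the two "unable to find any IPv4 address" messages, and the sketch makes no claim about whether they relate to the OSD failures.

```python
import re

# Assumed layout of a ceph-osd entry once the e-mail line wrapping is undone:
#   "<journal prefix>: debug <ISO timestamp> <thread id> <level> <message>"
CEPH_LINE = re.compile(
    r"debug\s+(?P<ts>\S+)\s+(?P<thread>[0-9a-f]+)\s+(?P<level>-?\d+)\s+(?P<msg>.*)"
)


def error_entries(lines):
    """Return (timestamp, message) for every entry logged at a negative level."""
    found = []
    for line in lines:
        m = CEPH_LINE.search(line)
        if m and int(m.group("level")) < 0:
            found.append((m.group("ts"), m.group("msg")))
    return found


# Two entries copied from the log above, unwrapped to one line each.
sample = [
    "May 14 16:58:56 dragon-ball-radar conmon[113957]: debug "
    "2021-05-14T16:58:56.444+0000 7fcf16aa2080 0 starting osd.2 osd_data "
    "/var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal",
    "May 14 16:58:56 dragon-ball-radar conmon[113957]: debug "
    "2021-05-14T16:58:56.444+0000 7fcf16aa2080 -1 unable to find any IPv4 "
    "address in networks '10.0.199.0/24' interfaces ''",
]

for ts, msg in error_entries(sample):
    print(ts, msg)
```

Lines that do not match the assumed layout (the podman and systemd entries, for example) are simply skipped.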
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
