Hello Mara,

Thank you so much, you are a lifesaver!

I'm not very skilled with Docker; normally I just use containers with the provided docker run commands. So it took some time before I was able to run the command inside the container and give the container access to the Ceph OSD disk. But after some trial and error I managed to fix everything, and now my cluster is healthy again!

Again, thank you! I also want to take the opportunity to thank everyone else in the Ceph community for a great project!

Best regards,
Stefan Lissmats

------- Original Message -------
On Monday, June 13th, 2022 at 4:42 PM, Mara Sophie Grosch <littlefox@xxxxxxxxxx> wrote:

> Hi,
>
> as someone who went through that just last week, that sounds a lot
> like the symptoms of my cluster. In case you are comfortable with docker
> (or any other container runtime), I have pushed an image [1] with quincy
> from a few days ago, which includes the fix for pglog dups, and I
> was able to successfully clean my OSD with the ceph-objectstore-tool in
> it.
>
> Something like `CEPH_ARGS="--osd_pg_log_trim_max=50000 --osd_max_pg_log_entries=2000" ceph-objectstore-tool --data-path $osd_path --op trim-pg-log`
> should help (command mostly from memory, check it before executing it - as always).
>
> Best of luck, Mara
>
> [1] littlefox/ceph-daemon-base:2, based on commit 5d47b8e21e77a57e51781f00021f77c7967ebbe2
>
> On Mon, Jun 13, 2022 at 02:10:42PM +0000, Stefan wrote:
>
> > Hello,
> >
> > I have been running Ceph for several years and everything has been rock solid until this weekend.
> > Due to some unfortunate events my cluster at home is down.
> >
> > I have two OSDs that don't boot and the reason seems to be this issue: https://tracker.ceph.com/issues/53729
> >
> > I'm currently running version 17.2.0, but when I hit the issue I was on 16.2.7. In an attempt to fix the issue I upgraded first to 16.2.9 and then to 17.2.0, but it didn't help.
> > I also tried giving it a huge swap, but it ended up crashing anyway.
> >
> > 1. There seems to be a fix for the issue in a GitHub branch: https://github.com/NitzanMordhai/ceph/tree/wip-nitzan-pglog-dups-not-trimmed/
> > I don't have very advanced Ceph/Linux skills and I'm not 100% sure that I understand exactly how I should use it.
> > Do I need to compile a complete Ceph installation and run that, or can I single out ceph-objectstore-tool in some way to only compile and run that?
> > 2. The fix seems to be targeted for release in 17.2.1; is there any information on when that will be released?
> >
> > Any advice would be very welcome, since I was running a lot of different VMs and didn't have all of them backed up.
> > _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx