Re: OSD crash with assertion


 



Ouch, ok.

-----Original Message-----
From: Michael Fladischer <michael@xxxxxxxx>
Sent: June 22, 2020 15:57
To: St-Germain, Sylvain (SSC/SPC) <sylvain.st-germain@xxxxxxxxx>; ceph-users@xxxxxxx
Subject: Re: OSD crash with assertion

Hi Sylvain,

Yeah, that's the best and safest way to do it. The pool I wrecked was fortunately a dummy pool.

The pool for which I want to change the EC profile is ~4PiB large, so moving all files on it (the pool is used by CephFS) to a new pool might take some time, and I was hoping for an in-place configuration change. But as demonstrated by my own recklessness, that does not work and takes most of the OSDs down with it.
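(A rough sketch of that file-based migration for a CephFS data pool, assuming the usual approach of attaching a new EC pool to the filesystem and steering files to it via directory layouts; the pool, profile, filesystem and mount-point names below are placeholders, not from this thread:)

# Create the new EC data pool and allow overwrites (required for CephFS data on EC)
ceph osd pool create cephfs_data_ec 64 64 erasure ecprofile-new
ceph osd pool set cephfs_data_ec allow_ec_overwrites true

# Attach it to the filesystem as an additional data pool
ceph fs add_data_pool cephfs cephfs_data_ec

# Point a directory at the new pool; only newly written files land there,
# so existing files still have to be copied/rewritten to actually move them
setfattr -n ceph.dir.layout.pool -v cephfs_data_ec /mnt/cephfs/somedir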

Regards,
Michael

On 22.06.2020 21:39, St-Germain, Sylvain (SSC/SPC) wrote:
> The way I did it was to create a new pool, copy the data onto it, and put
> the new pool in place of the old one after deleting the former pool:
> 
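(Note: the quoted script assumes that $pool is already set to the pool being migrated and that the erasure-code profile "ecprofile-5-3" already exists; something like the following, with placeholder values and a guessed k=5/m=3 split, would be needed first:)

pool=mypool
sudo ceph osd erasure-code-profile set ecprofile-5-3 k=5 m=3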
> echo "--------------------------------------------------------------------"
> echo " Create a new pool with erasure coding"
> echo "--------------------------------------------------------------------"
> sudo ceph osd pool create $pool.new 64 64 erasure ecprofile-5-3
> 
> echo "--------------------------------------------------------------------"
> echo " Copy the original pool to the new pool"
> echo "--------------------------------------------------------------------"
> sudo rados cppool $pool $pool.new
> 
> echo "--------------------------------------------------------------------"
> echo " Rename the original pool to .old"
> echo "--------------------------------------------------------------------"
> sudo ceph osd pool rename $pool $pool.old
> 
> echo "--------------------------------------------------------------------"
> echo " Rename the new erasure coding pool to $pool"
> echo "--------------------------------------------------------------------"
> sudo ceph osd pool rename $pool.new $pool
> 
> echo "--------------------------------------------------------------------"
> echo " Set the pool: $pool  to autoscaling"
> echo "--------------------------------------------------------------------"
> sudo ceph osd pool set $pool pg_autoscale_mode on
> 
> echo "--------------------------------------------------------------------"
> echo " Show detail off the new create pool"
> echo "--------------------------------------------------------------------"
> sudo ceph osd pool get $pool all
> 
> Sylvain
> 
> -----Original Message-----
> From: Michael Fladischer <michael@xxxxxxxx>
> Sent: June 22, 2020 15:23
> To: ceph-users@xxxxxxx
> Subject: Re: OSD crash with assertion
> 
> Turns out I really messed up when changing the EC profile. Removing the pool did not get rid of its PGs on the OSDs that crashed.
> 
> To get my OSDs back up I used ceph-objectstore-tool like this:
> 
> for PG in $(ceph-objectstore-tool --data-path $DIR --type=bluestore --op=list-pgs | grep "^${POOL_ID}\."); do
> 	ceph-objectstore-tool --data-path $DIR --type=bluestore --op=remove --force --pgid=$PG
> done
> 
> $DIR is the data path of the crashed OSD (typically /var/lib/ceph/osd/ceph-<id>).
> $POOL_ID is the ID of the pool with the messed-up EC profile.
> 
> I'm now curious whether there is an easier way to do this.
> 
> After getting rid of all those PGs, the OSDs were able to start again. Hope this helps someone.
> 
> Regards,
> Michael
> 
> 
> On 22.06.2020 19:46, Michael Fladischer wrote:
>> Hi,
>>
>> a lot of our OSDs crashed a few hours ago because of a failed
>> assertion:
>>
>> /build/ceph-15.2.3/src/osd/ECUtil.h: 34: FAILED ceph_assert(stripe_width % stripe_size == 0)
>>
>> Full output here:
>> https://pastebin.com/D1SXzKsK
>>
>> All OSDs are on bluestore and run 15.2.3.
>>
>> I think I messed up when I tried to change an existing EC profile
>> (using --force) for an active EC pool.
>>
>> I already tried deleting the pool and the EC profile and starting the
>> OSDs, but they keep crashing with the same assertion.
>>
>> Is there a way to at least find out what the values are for 
>> stripe_width and stripe_size?
>>
>> Regards,
>> Michael
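(On the question of inspecting those values: the per-pool stripe_width is printed in the pool listing/OSD dump, and k, m and stripe_unit can be read back from the erasure-code profile; "mypool" and "myprofile" below are placeholder names:)

# stripe_width is listed per pool
ceph osd pool ls detail
ceph osd dump | grep mypool

# k, m and stripe_unit of the erasure-code profile
ceph osd erasure-code-profile ls
ceph osd erasure-code-profile get myprofile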
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



