Re: Probable bug when replacing osd disk with smaller one


 



+dev@xxxxxxx  -ceph-devel@xxxxxxxxxxxxxxx

On Thu, Sep 5, 2019 at 8:56 PM Ugis <ugis22@xxxxxxxxx> wrote:
>
> Hi,
>
> ceph version 14.2.1 (d555a9489eb35f84f2e1ef49b77e19da9d113972) nautilus (stable)
>
> Yesterday I noticed unexpected behavior, probably a bug. It seems ceph
> wrongly calculates the osd size when the disk is replaced with a smaller one.
>
> In detail:
> Starting point: one osd disk had failed, ceph had rebalanced, and the osd
> was marked down.
>
> I removed the failed disk (10 TB) and replaced it with a smaller 6 TB one.
> I followed the disk replacement instructions here:
> https://docs.ceph.com/docs/mimic/rados/operations/add-or-rm-osds/
>
> Destroy the OSD first:
>   ceph osd destroy {id} --yes-i-really-mean-it
> Zap a disk for the new OSD, if the disk was used before for other
> purposes. It’s not necessary for a new disk:
>   ceph-volume lvm zap /dev/sdX
> Prepare the disk for replacement by using the previously destroyed OSD id:
>  ceph-volume lvm prepare --osd-id {id} --data /dev/sdX
> And activate the OSD:
>  ceph-volume lvm activate {id} {fsid}
>  I skipped this step, as it was not clear which fsid was needed (probably the
> ceph cluster fsid), and just started the osd:
>  systemctl start ceph-osd@29
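
(Side note: if I read the ceph-volume docs right, the fsid that
"ceph-volume lvm activate" expects is the OSD fsid, not the cluster
fsid, and it can be read from "ceph-volume lvm list". A sketch of how
that step could have been done, assuming osd.29:

  ceph-volume lvm list                    # prints an "osd fsid" per OSD
  ceph-volume lvm activate 29 <osd-fsid>  # fsid taken from the listing
  ceph-volume lvm activate --all          # alternative: no fsid needed
)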
>
> The OSD came up and rebalancing started.
>
> After some time ceph started to complain as follows:
> # ceph health detail
> HEALTH_WARN 1 nearfull osd(s); 19 pool(s) nearfull; 10 pgs not
> deep-scrubbed in time
> OSD_NEARFULL 1 nearfull osd(s)
>     osd.29 is near full
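
(For context: the nearfull warning trips once an OSD crosses the
osdmap's nearfull_ratio, 0.85 by default. A quick way to check the
configured ratios on a running cluster:

  ceph osd dump | grep ratio
  # e.g. full_ratio 0.95 / backfillfull_ratio 0.9 / nearfull_ratio 0.85
)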
>
> # ceph osd df tree
> --------------------
> ID  CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE  VAR  PGS STATUS TYPE NAME
> ...
> 29  hdd   9.09569 1.00000  5.5 TiB 3.3 TiB 3.3 TiB 981 KiB 4.9 GiB 2.2 TiB 59.75 0.99 590 up     osd.29
>
> Later I noticed that the CRUSH weight of osd.29 was still 9.09569, as for
> the replaced 10 TB disk.
> I ran: ceph osd crush reweight osd.29 5.45789
> Things got back to normal after the rebalance.
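
(For reference, CRUSH weight conventionally equals the device capacity
in TiB, so the numbers above roughly line up:

  10 TB disk: 10 * 10^12 / 2^40 = ~9.095 TiB  -> old weight 9.09569
   6 TB disk:  6 * 10^12 / 2^40 = ~5.457 TiB  -> new weight 5.45789
)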
>
> I got the impression that ceph did not realize the osd had been replaced
> with a smaller disk. Could that be because I skipped the activation step?
> Or is this a bug?
>
> Best regards,
> Ugis



-- 
Cheers,
Brad



