Re: OSD fails to start after 17.2.6 to 17.2.7 update

On Tue, 7 Nov 2023 at 16:26, Matthew Booth <mbooth@xxxxxxxxxx> wrote:

> FYI I left rook as is and reverted to ceph 17.2.6 and the issue is
> resolved.
>
> The code change was added by
> commit 2e52c029bc2b052bb96f4731c6bb00e30ed209be:
>     ceph-volume: fix broken workaround for atari partitions
>
>     broken by bea9f4b643ce32268ad79c0fc257b25ff2f8333c
>     This commits fixes that regression.
>
>     Fixes: https://tracker.ceph.com/issues/62001
>
>     Signed-off-by: Guillaume Abrioux <gabrioux@xxxxxxx>
>     (cherry picked from commit b3fd5b513176fb9ba1e6e0595ded4b41d401c68e)
>
> It feels like a regression to me.
>

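It looks like the issue is that the argument passed to List.generate is
'/dev/sdc', but lsblk's NAME field contains 'sdc'. The comparison in
generate therefore never matches, and indexing the resulting empty list
raises the IndexError. The NAME field was not used this way in v17.2.6.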
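
To make the mismatch concrete, here is a minimal sketch of the failing
comparison next to a more tolerant lookup. It is not the upstream fix:
find_device_info and the sample info_devices list are hypothetical names
and data I made up for illustration, and comparing by basename is only an
assumption about what a workaround might look like (it would not be
correct for every device path, e.g. device-mapper names).

```python
import os


def find_device_info(info_devices, dev):
    """Hypothetical helper: look up dev in lsblk-style entries.

    Compares basenames so '/dev/sdc' and 'sdc' are treated as the same
    device, and returns None instead of raising when nothing matches.
    """
    wanted = os.path.basename(dev)
    matches = [info for info in info_devices
               if os.path.basename(info['NAME']) == wanted]
    return matches[0] if matches else None


# Made-up sample data in the shape the traceback implies (NAME without /dev/).
info_devices = [{'NAME': 'sdb'}, {'NAME': 'sdc'}]

# The comparison from the traceback: no entry has NAME == '/dev/sdc',
# so the list is empty and indexing it with [0] would raise IndexError.
strict = [info for info in info_devices if info['NAME'] == '/dev/sdc']

# The tolerant lookup finds the entry for either spelling of the device.
found = find_device_info(info_devices, '/dev/sdc')
```

With this sample data, strict is empty while found is the 'sdc' entry,
which matches the behaviour seen in the traceback.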

I haven't checked, but I assume that ceph-bluestore-tool can accept either
'sdc' or '/dev/sdc'.

Matt


>
> Matt
>
> On Tue, 7 Nov 2023 at 16:13, Matthew Booth <mbooth@xxxxxxxxxx> wrote:
>
>> Firstly I'm rolling out a rook update from v1.12.2 to v1.12.7 (latest
>> stable) and ceph from 17.2.6 to 17.2.7 at the same time. I mention this in
>> case the problem is actually caused by rook rather than ceph. It looks like
>> ceph to my uninitiated eyes, though.
>>
>> The update just started bumping my OSDs and the first one fails in the
>> 'activate' init container. The complete logs for this container are:
>>
>> + OSD_ID=5
>> + CEPH_FSID=<redacted>
>> + OSD_UUID=<redacted>
>> + OSD_STORE_FLAG=--bluestore
>> + OSD_DATA_DIR=/var/lib/ceph/osd/ceph-5
>> + CV_MODE=raw
>> + DEVICE=/dev/sdc
>> + cp --no-preserve=mode /etc/temp-ceph/ceph.conf /etc/ceph/ceph.conf
>> + python3 -c '
>> import configparser
>>
>> config = configparser.ConfigParser()
>> config.read('\''/etc/ceph/ceph.conf'\'')
>>
>> if not config.has_section('\''global'\''):
>>     config['\''global'\''] = {}
>>
>> if not config.has_option('\''global'\'','\''fsid'\''):
>>     config['\''global'\'']['\''fsid'\''] = '\''<redacted>'\''
>>
>> with open('\''/etc/ceph/ceph.conf'\'', '\''w'\'') as configfile:
>>     config.write(configfile)
>> '
>> + ceph -n client.admin auth get-or-create osd.5 mon 'allow profile osd'
>> mgr 'allow profile osd' osd 'allow *' -k
>> /etc/ceph/admin-keyring-store/keyring
>> [osd.5]
>>         key = <redacted>
>> + [[ raw == \l\v\m ]]
>> ++ mktemp
>> + OSD_LIST=/tmp/tmp.CekJVsr9gr
>> + ceph-volume raw list /dev/sdc
>> Traceback (most recent call last):
>>   File "/usr/sbin/ceph-volume", line 11, in <module>
>>     load_entry_point('ceph-volume==1.0.0', 'console_scripts',
>> 'ceph-volume')()
>>   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41,
>> in __init__
>>     self.main(self.argv)
>>   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line
>> 59, in newfunc
>>     return f(*a, **kw)
>>   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153,
>> in main
>>     terminal.dispatch(self.mapper, subcommand_args)
>>   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line
>> 194, in dispatch
>>     instance.main()
>>   File
>> "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/main.py", line
>> 32, in main
>>     terminal.dispatch(self.mapper, self.argv)
>>   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line
>> 194, in dispatch
>>     instance.main()
>>   File
>> "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/list.py", line
>> 166, in main
>>     self.list(args)
>>   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line
>> 16, in is_root
>>     return func(*a, **kw)
>>   File
>> "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/list.py", line
>> 122, in list
>>     report = self.generate(args.device)
>>   File
>> "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/list.py", line
>> 91, in generate
>>     info_device = [info for info in info_devices if info['NAME'] ==
>> dev][0]
>> IndexError: list index out of range
>>
>> So it has failed executing `ceph-volume raw list /dev/sdc`.
>>
>> It looks like this code is new in 17.2.7. Is this a regression? What
>> would be the simplest way to back out of it?
>>
>> Thanks,
>> Matt
>> --
>> Matthew Booth
>>
>
>
> --
> Matthew Booth
>


-- 
Matthew Booth