Re: OSD fails to start after 17.2.6 to 17.2.7 update

I just discovered that rook is tracking this here:
https://github.com/rook/rook/issues/13136

On Tue, 7 Nov 2023 at 18:09, Matthew Booth <mbooth@xxxxxxxxxx> wrote:

> On Tue, 7 Nov 2023 at 16:26, Matthew Booth <mbooth@xxxxxxxxxx> wrote:
>
>> FYI I left rook as is and reverted to ceph 17.2.6 and the issue is
>> resolved.
>>
>> The code change was added by
>> commit 2e52c029bc2b052bb96f4731c6bb00e30ed209be:
>>     ceph-volume: fix broken workaround for atari partitions
>>
>>     broken by bea9f4b643ce32268ad79c0fc257b25ff2f8333c
>>     This commits fixes that regression.
>>
>>     Fixes: https://tracker.ceph.com/issues/62001
>>
>>     Signed-off-by: Guillaume Abrioux <gabrioux@xxxxxxx>
>>     (cherry picked from commit b3fd5b513176fb9ba1e6e0595ded4b41d401c68e)
>>
>> It feels like a regression to me.
>>
>
> It looks like the issue is that the argument passed to List.generate is
> '/dev/sdc', but lsblk's NAME field contains 'sdc'. The NAME field was not
> used this way in v17.2.6.
>
> I haven't checked, but I assume that ceph-bluestore-tool can accept either
> 'sdc' or '/dev/sdc'.
>
> Matt
>
>
>>
>> Matt
>>
>> On Tue, 7 Nov 2023 at 16:13, Matthew Booth <mbooth@xxxxxxxxxx> wrote:
>>
>>> Firstly I'm rolling out a rook update from v1.12.2 to v1.12.7 (latest
>>> stable) and ceph from 17.2.6 to 17.2.7 at the same time. I mention this in
>>> case the problem is actually caused by rook rather than ceph. It looks like
>>> ceph to my uninitiated eyes, though.
>>>
>>> The update just started bumping my OSDs and the first one fails in the
>>> 'activate' init container. The complete logs for this container are:
>>>
>>> + OSD_ID=5
>>> + CEPH_FSID=<redacted>
>>> + OSD_UUID=<redacted>
>>> + OSD_STORE_FLAG=--bluestore
>>> + OSD_DATA_DIR=/var/lib/ceph/osd/ceph-5
>>> + CV_MODE=raw
>>> + DEVICE=/dev/sdc
>>> + cp --no-preserve=mode /etc/temp-ceph/ceph.conf /etc/ceph/ceph.conf
>>> + python3 -c '
>>> import configparser
>>>
>>> config = configparser.ConfigParser()
>>> config.read('\''/etc/ceph/ceph.conf'\'')
>>>
>>> if not config.has_section('\''global'\''):
>>>     config['\''global'\''] = {}
>>>
>>> if not config.has_option('\''global'\'','\''fsid'\''):
>>>     config['\''global'\'']['\''fsid'\''] = '\''<redacted>'\''
>>>
>>> with open('\''/etc/ceph/ceph.conf'\'', '\''w'\'') as configfile:
>>>     config.write(configfile)
>>> '
>>> + ceph -n client.admin auth get-or-create osd.5 mon 'allow profile osd' mgr 'allow profile osd' osd 'allow *' -k /etc/ceph/admin-keyring-store/keyring
>>> [osd.5]
>>>         key = <redacted>
>>> + [[ raw == \l\v\m ]]
>>> ++ mktemp
>>> + OSD_LIST=/tmp/tmp.CekJVsr9gr
>>> + ceph-volume raw list /dev/sdc
>>> Traceback (most recent call last):
>>>   File "/usr/sbin/ceph-volume", line 11, in <module>
>>>     load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
>>>   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
>>>     self.main(self.argv)
>>>   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
>>>     return f(*a, **kw)
>>>   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
>>>     terminal.dispatch(self.mapper, subcommand_args)
>>>   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
>>>     instance.main()
>>>   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/main.py", line 32, in main
>>>     terminal.dispatch(self.mapper, self.argv)
>>>   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
>>>     instance.main()
>>>   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/list.py", line 166, in main
>>>     self.list(args)
>>>   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
>>>     return func(*a, **kw)
>>>   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/list.py", line 122, in list
>>>     report = self.generate(args.device)
>>>   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/list.py", line 91, in generate
>>>     info_device = [info for info in info_devices if info['NAME'] == dev][0]
>>> IndexError: list index out of range
>>>
>>> So it has failed executing `ceph-volume raw list /dev/sdc`.
>>>
>>> It looks like this code is new in 17.2.7. Is this a regression? What
>>> would be the simplest way to back out of it?
>>>
>>> Thanks,
>>> Matt
>>> --
>>> Matthew Booth
>>>
>>
>>
>> --
>> Matthew Booth
>>
>
>
> --
> Matthew Booth
>
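
For illustration, here is a minimal sketch of the mismatch described in the
quoted diagnosis above. It is not the ceph-volume code and not the upstream
fix. The sample records and the helper names strict_lookup and
tolerant_lookup are made up for this example; only the list-comprehension
shape is taken from the traceback. It shows why comparing '/dev/sdc' against
a NAME of 'sdc' ends in IndexError, and one naive way to compare the two
forms by basename.

```python
import os

# Hypothetical lsblk-style records; only the NAME key matters here.
info_devices = [{"NAME": "sda"}, {"NAME": "sdb"}, {"NAME": "sdc"}]


def strict_lookup(devices, dev):
    # Same shape as the expression in the traceback: filter on an exact
    # NAME match, then index [0]. An empty result raises IndexError.
    return [info for info in devices if info["NAME"] == dev][0]


def tolerant_lookup(devices, dev):
    # Assumption for this sketch: comparing basenames is good enough.
    # Real device handling (symlinks, /dev/mapper paths) needs more care.
    wanted = os.path.basename(dev)
    for info in devices:
        if os.path.basename(info["NAME"]) == wanted:
            return info
    return None


if __name__ == "__main__":
    try:
        strict_lookup(info_devices, "/dev/sdc")
    except IndexError as exc:
        print("strict lookup failed:", exc)
    print("tolerant lookup:", tolerant_lookup(info_devices, "/dev/sdc"))
```

Returning None instead of indexing [0] also means a device that is missing
from the list can be reported cleanly rather than surfacing as a traceback.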


-- 
Matthew Booth
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx