Re: is it possible to remove the db+wal from an external device (nvme)

Yes. And the “cephadm shell” command does not depend on the running daemon; it will start a new container. So I think it is perfectly fine to stop the OSD first, then run the “cephadm shell” command and run ceph-volume in the new shell.
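
(A minimal sketch of that sequence, assuming osd.1 and that the OSD is already stopped; the OSD fsid and the target VG/LV are placeholders you would look up with "ceph-volume lvm list":)

  cephadm shell -n osd.1
  # inside the new container shell, move the DB back onto the main device:
  ceph-volume lvm migrate --osd-id 1 --osd-fsid <osd-fsid> --from db --target <vg>/<osd-block-lv>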

From: Eugen Block<mailto:eblock@xxxxxx>
Sent: September 29, 2021 21:40
To: 胡 玮文<mailto:huww98@xxxxxxxxxxx>
Cc: Igor Fedotov<mailto:ifedotov@xxxxxxx>; Szabo, Istvan (Agoda)<mailto:Istvan.Szabo@xxxxxxxxx>; ceph-users@xxxxxxx<mailto:ceph-users@xxxxxxx>
Subject: Re: is it possible to remove the db+wal from an external device (nvme)

The OSD has to be stopped in order to migrate DB/WAL; it can't be done
live. ceph-volume requires a lock on the device.
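
(For illustration only, one way to do that on a cephadm-managed cluster, with osd.1 as an example name:)

  ceph osd set noout
  ceph orch daemon stop osd.1
  # ... run the migration ...
  ceph orch daemon start osd.1
  ceph osd unset noout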


Quoting 胡 玮文 <huww98@xxxxxxxxxxx>:

> I’ve not tried it, but how about:
>
> cephadm shell -n osd.0
>
> then run “ceph-volume” commands in the newly opened shell. The
> directory structure seems fine.
>
> $ sudo cephadm shell -n osd.0
> Inferring fsid e88d509a-f6fc-11ea-b25d-a0423f3ac864
> Inferring config
> /var/lib/ceph/e88d509a-f6fc-11ea-b25d-a0423f3ac864/osd.0/config
> Using recent ceph image
> cr.example.com/infra/ceph@sha256:8a0f6f285edcd6488e2c91d3f9fa43534d37d7a9b37db1e0ff6691aae6466530
> root@host0:/# ll /var/lib/ceph/osd/ceph-0/
> total 68
> drwx------ 2 ceph ceph 4096 Sep 20 04:15 ./
> drwxr-x--- 1 ceph ceph 4096 Sep 29 13:32 ../
> lrwxrwxrwx 1 ceph ceph   24 Sep 20 04:15 block -> /dev/ceph-hdd/osd.0.data
> lrwxrwxrwx 1 ceph ceph   23 Sep 20 04:15 block.db -> /dev/ubuntu-vg/osd.0.db
> -rw------- 1 ceph ceph   37 Sep 20 04:15 ceph_fsid
> -rw------- 1 ceph ceph  387 Jun 21 13:24 config
> -rw------- 1 ceph ceph   37 Sep 20 04:15 fsid
> -rw------- 1 ceph ceph   55 Sep 20 04:15 keyring
> -rw------- 1 ceph ceph    6 Sep 20 04:15 ready
> -rw------- 1 ceph ceph    3 Apr  2 01:46 require_osd_release
> -rw------- 1 ceph ceph   10 Sep 20 04:15 type
> -rw------- 1 ceph ceph   38 Sep 17 14:26 unit.configured
> -rw------- 1 ceph ceph   48 Nov  9  2020 unit.created
> -rw------- 1 ceph ceph   35 Sep 17 14:26 unit.image
> -rw------- 1 ceph ceph  306 Sep 17 14:26 unit.meta
> -rw------- 1 ceph ceph 1317 Sep 17 14:26 unit.poststop
> -rw------- 1 ceph ceph 3021 Sep 17 14:26 unit.run
> -rw------- 1 ceph ceph  142 Sep 17 14:26 unit.stop
> -rw------- 1 ceph ceph    2 Sep 20 04:15 whoami
>
> From: Eugen Block<mailto:eblock@xxxxxx>
> Sent: September 29, 2021 21:29
> To: Igor Fedotov<mailto:ifedotov@xxxxxxx>
> Cc: 胡 玮文<mailto:huww98@xxxxxxxxxxx>; Szabo, Istvan
> (Agoda)<mailto:Istvan.Szabo@xxxxxxxxx>;
> ceph-users@xxxxxxx<mailto:ceph-users@xxxxxxx>
> Subject: Re:  Re: Re: [ceph-users] Re: is it possible to
> remove the db+wal from an external device (nvme)
>
> Hi Igor,
>
> thanks for your input. I haven't done this in a prod env yet either,
> still playing around in a virtual lab env.
> I tried the symlink suggestion but it's not that easy, because the
> layout underneath the ceph directory looks different from what
> ceph-volume expects. These are the services underneath:
>
> ses7-host1:~ # ll /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/
> total 48
> drwx------ 3 root       root   4096 16. Sep 16:11 alertmanager.ses7-host1
> drwx------ 3 ceph       ceph   4096 29. Sep 09:03 crash
> drwx------ 2 ceph       ceph   4096 16. Sep 16:39 crash.ses7-host1
> drwx------ 4 messagebus lp     4096 16. Sep 16:23 grafana.ses7-host1
> drw-rw---- 2 root       root   4096 24. Aug 10:00 home
> drwx------ 2 ceph       ceph   4096 16. Sep 16:37 mgr.ses7-host1.wmgyit
> drwx------ 3 ceph       ceph   4096 16. Sep 16:37 mon.ses7-host1
> drwx------ 2 nobody     nobody 4096 16. Sep 16:37 node-exporter.ses7-host1
> drwx------ 2 ceph       ceph   4096 29. Sep 08:43 osd.0
> drwx------ 2 ceph       ceph   4096 29. Sep 15:11 osd.1
> drwx------ 4 root       root   4096 16. Sep 16:12 prometheus.ses7-host1
>
>
> While the directory in a non-containerized deployment looks like this:
>
> nautilus:~ # ll /var/lib/ceph/osd/ceph-0/
> total 24
> lrwxrwxrwx 1 ceph ceph 93 29. Sep 12:21 block ->
> /dev/ceph-a6d78a29-637f-494b-a839-76251fcff67e/osd-block-39340a48-54b3-4689-9896-f54d005c535d
> -rw------- 1 ceph ceph 37 29. Sep 12:21 ceph_fsid
> -rw------- 1 ceph ceph 37 29. Sep 12:21 fsid
> -rw------- 1 ceph ceph 55 29. Sep 12:21 keyring
> -rw------- 1 ceph ceph  6 29. Sep 12:21 ready
> -rw------- 1 ceph ceph 10 29. Sep 12:21 type
> -rw------- 1 ceph ceph  2 29. Sep 12:21 whoami
>
>
> But even if I create the symlink to the osd directory it fails, because
> I only have ceph-volume within the containers, where the symlink
> created on the host is not visible.
>
>
> ses7-host1:~ # ll /var/lib/ceph/osd/ceph-1
> lrwxrwxrwx 1 root root 57 29. Sep 15:08 /var/lib/ceph/osd/ceph-1 ->
> /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/osd.1/
>
> ses7-host1:~ # cephadm ceph-volume lvm migrate --osd-id 1 --osd-fsid
> b4c772aa-07f8-483d-ae58-0ab97b8d0cc4 --from db --target
> ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
> Inferring fsid 152fd738-01bc-11ec-a7fd-fa163e672db2
> [...]
> /usr/bin/podman: stderr --> Migrate to existing, Source:
> ['--devs-source', '/var/lib/ceph/osd/ceph-1/block.db'] Target:
> /var/lib/ceph/osd/ceph-1/block
> /usr/bin/podman: stderr  stdout: inferring bluefs devices from bluestore path
> /usr/bin/podman: stderr  stderr: can't migrate
> /var/lib/ceph/osd/ceph-1/block.db, not a valid bluefs volume
> /usr/bin/podman: stderr --> Failed to migrate device, error code:1
> /usr/bin/podman: stderr --> Undoing lv tag set
> /usr/bin/podman: stderr Failed to migrate to :
> ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
> Traceback (most recent call last):
>    File "/usr/sbin/cephadm", line 6225, in <module>
>      r = args.func()
>    File "/usr/sbin/cephadm", line 1363, in _infer_fsid
>      return func()
>    File "/usr/sbin/cephadm", line 1422, in _infer_image
>      return func()
>    File "/usr/sbin/cephadm", line 3687, in command_ceph_volume
>      out, err, code = call_throws(c.run_cmd(),
> verbosity=CallVerbosity.VERBOSE)
>    File "/usr/sbin/cephadm", line 1101, in call_throws
>      raise RuntimeError('Failed command: %s' % ' '.join(command))
> [...]
>
>
> I could install the ceph-osd package (which ceph-volume is packaged
> in), but it's not available by default (as you can see, this is a SES 7
> environment).
>
> I'm not sure what the design is here; it feels like the ceph-volume
> migrate command is not applicable to containers yet.
>
> Regards,
> Eugen
>
>
> Quoting Igor Fedotov <ifedotov@xxxxxxx>:
>
>> Hi Eugen,
>>
>> indeed this looks like an issue related to the containerized deployment;
>> "ceph-volume lvm migrate" expects the OSD folder to be under
>> /var/lib/ceph/osd:
>>
>>> stderr: 2021-09-29T06:56:24.787+0000 7fde05b96180 -1
>>> bluestore(/var/lib/ceph/osd/ceph-1) _lock_fsid failed to lock
>>> /var/lib/ceph/osd/ceph-1/fsid (is another ceph-osd still
>>> running?)(11) Resource temporarily unavailable
>>
>> As a workaround you might want to try to create a symlink to your
>> actual location before issuing the migrate command:
>> /var/lib/ceph/osd ->
>> /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/
>>
>> A more complicated (and more general, IMO) way would be to run the
>> migrate command from within a container deployed similarly to the
>> ceph-osd one (i.e. with all the proper subfolder mappings). Just
>> speculating - I'm not a big expert in containers and have never tried
>> that with a properly deployed production cluster...
>>
>>
>> Thanks,
>>
>> Igor
>>
>> On 9/29/2021 10:07 AM, Eugen Block wrote:
>>> Hi,
>>>
>>> I just tried 'ceph-volume lvm migrate' in Octopus but it
>>> doesn't really work. I'm not sure if I'm missing something here,
>>> but I believe it's the already discussed containers issue again. To
>>> be able to run the command for an OSD, the OSD has to be offline,
>>> but then you don't have access to the block.db because the path
>>> outside the container is different:
>>>
>>> ---snip---
>>> [ceph: root@host1 /]# ceph-volume lvm migrate --osd-id 1 --osd-fsid
>>> b4c772aa-07f8-483d-ae58-0ab97b8d0cc4 --from db --target
>>> ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
>>> --> Migrate to existing, Source: ['--devs-source', '/var/lib/ceph/osd/ceph-1/block.db'] Target: /var/lib/ceph/osd/ceph-1/block
>>>  stdout: inferring bluefs devices from bluestore path
>>>  stderr: /home/abuild/rpmbuild/BUILD/ceph-15.2.14-84-gb6e5642e260/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::_mount_for_bluefs()' thread 7fde05b96180 time 2021-09-29T06:56:24.790161+0000
>>>  stderr: /home/abuild/rpmbuild/BUILD/ceph-15.2.14-84-gb6e5642e260/src/os/bluestore/BlueStore.cc: 6876: FAILED ceph_assert(r == 0)
>>>  stderr: 2021-09-29T06:56:24.787+0000 7fde05b96180 -1
>>> bluestore(/var/lib/ceph/osd/ceph-1) _lock_fsid failed to lock
>>> /var/lib/ceph/osd/ceph-1/fsid (is another ceph-osd still
>>> running?)(11) Resource temporarily unavailable
>>>
>>>
>>> # path outside
>>> host1:~ # ll /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/osd.1/
>>> total 60
>>> lrwxrwxrwx 1 ceph ceph   93 29. Sep 08:43 block ->
>>> /dev/ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
>>> lrwxrwxrwx 1 ceph ceph   90 29. Sep 08:43 block.db ->
>>> /dev/ceph-6f1b8f49-daf2-4631-a2ef-12e9452b01ea/osd-db-69b11aa0-af96-443e-8f03-5afa5272131f
>>> ---snip---
>>>
>>>
>>> But if I shut down the OSD I can't access the block and block.db
>>> devices. I'm not even sure how this is supposed to work with
>>> cephadm. Maybe I'm misunderstanding, though. Or is there a way to
>>> provide the offline block.db path to 'ceph-volume lvm migrate'?
>>>
>>>
>>>
>>> Quoting 胡 玮文 <huww98@xxxxxxxxxxx>:
>>>
>>>> You may need to use 'ceph-volume lvm migrate' [1] instead of
>>>> ceph-bluestore-tool. If I recall correctly, this is a pretty new
>>>> feature; I'm not sure whether it is available in your version.
>>>>
>>>> If you use ceph-bluestore-tool, then you need to modify the LVM
>>>> tags manually. Please refer to the previous threads, e.g. [2] and
>>>> some more.
>>>>
>>>> [1]: https://docs.ceph.com/en/latest/man/8/ceph-volume/#migrate
>>>> [2]:
>>>> https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/VX23NQ66P3PPEX36T3PYYMHPLBSFLMYA/#JLNDFGXR4ZLY27DHD3RJTTZEDHRZJO4Q
>>>>
>>>> From: Szabo, Istvan (Agoda)<mailto:Istvan.Szabo@xxxxxxxxx>
>>>> Sent: September 28, 2021 18:20
>>>> To: Eugen Block<mailto:eblock@xxxxxx>;
>>>> ceph-users@xxxxxxx<mailto:ceph-users@xxxxxxx>
>>>> Subject:  Re: is it possible to remove the db+wal from an
>>>> external device (nvme)
>>>>
>>>> I gave it a try, and in the end all 3 OSDs failed :/ Not sure
>>>> what went wrong.
>>>>
>>>> I did the normal maintenance things (ceph osd set noout, ceph osd set
>>>> norebalance), stopped the OSD and ran this command:
>>>> ceph-bluestore-tool bluefs-bdev-migrate --dev-target
>>>> /var/lib/ceph/osd/ceph-0/block --devs-source
>>>> /var/lib/ceph/osd/ceph-8/block.db --path /var/lib/ceph/osd/ceph-8/
>>>> Output:
>>>> device removed:1 /var/lib/ceph/osd/ceph-8/block.db
>>>> device added: 1 /dev/dm-2
>>>>
>>>> When I tried to start it, I got this in the log:
>>>> osd.8 0 OSD:init: unable to mount object store
>>>>  ** ERROR: osd init failed: (13) Permission denied
>>>> set uid:gid to 167:167 (ceph:ceph)
>>>> ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2)
>>>> octopus (stable), process ceph-osd, pid 1512261
>>>> pidfile_write: ignore empty --pid-file
>>>>
>>>> On the other 2 OSDs the block.db was removed and I could start them back.
>>>> I then zapped the db drive just to remove it completely, and after a
>>>> machine restart none of these 2 OSDs came back; I guess they are
>>>> missing the db device.
>>>>
>>>> Are there any steps missing?
>>>> 1. noout + norebalance
>>>> 2. Stop the OSD
>>>> 3. Migrate the block.db to the block device with the above command.
>>>> 4. Do the same on the other OSDs sharing the db device that I
>>>> want to remove.
>>>> 5. Zap the db device
>>>> 6. Start the OSDs back up.
>>>>
>>>> Istvan Szabo
>>>> Senior Infrastructure Engineer
>>>> ---------------------------------------------------
>>>> Agoda Services Co., Ltd.
>>>> e: istvan.szabo@xxxxxxxxx
>>>> ---------------------------------------------------
>>>>
>>>> -----Original Message-----
>>>> From: Eugen Block <eblock@xxxxxx>
>>>> Sent: Monday, September 27, 2021 7:42 PM
>>>> To: ceph-users@xxxxxxx
>>>> Subject:  Re: is it possible to remove the db+wal from
>>>> an external device (nvme)
>>>>
>>>> Email received from the internet. If in doubt, don't click any
>>>> link nor open any attachment !
>>>> ________________________________
>>>>
>>>> Hi,
>>>>
>>>> I think 'ceph-bluestore-tool bluefs-bdev-migrate' could be of use
>>>> here. I haven't tried it in a production environment yet, only in
>>>> virtual labs.
>>>>
>>>> Regards,
>>>> Eugen
>>>>
>>>>
>>>> Quoting "Szabo, Istvan (Agoda)" <Istvan.Szabo@xxxxxxxxx>:
>>>>
>>>>> Hi,
>>>>>
>>>>> It seems like in our config the NVMe device used as WAL+DB in front of
>>>>> the SSDs is slowing down the SSD OSDs.
>>>>> I'd like to avoid rebuilding all the OSDs - is there a way to somehow
>>>>> migrate the WAL+DB to the "slower device" without a reinstall?
>>>>>
>>>>> Ty
>>>>> _______________________________________________
>>>>> ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an
>>>>> email to ceph-users-leave@xxxxxxx
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send
>>>> an email to ceph-users-leave@xxxxxxx
>>>> _______________________________________________
>>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>
>>>
>>>
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx



_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



