Yes. And the "cephadm shell" command does not depend on the running daemon; it starts a new container. So I think it is perfectly fine to stop the OSD first, then run "cephadm shell" and run ceph-volume in the new shell.

From: Eugen Block <mailto:eblock@xxxxxx>
Sent: 29 September 2021 21:40
To: 胡 玮文 <mailto:huww98@xxxxxxxxxxx>
Cc: Igor Fedotov <mailto:ifedotov@xxxxxxx>; Szabo, Istvan (Agoda) <mailto:Istvan.Szabo@xxxxxxxxx>; ceph-users@xxxxxxx <mailto:ceph-users@xxxxxxx>
Subject: Re: is it possible to remove the db+wal from an external device (nvme)

The OSD has to be stopped in order to migrate DB/WAL, it can't be done live. ceph-volume requires a lock on the device.

Quoting 胡 玮文 <huww98@xxxxxxxxxxx>:

> I’ve not tried it, but how about:
>
> cephadm shell -n osd.0
>
> then run “ceph-volume” commands in the newly opened shell. The
> directory structure seems fine.
>
> $ sudo cephadm shell -n osd.0
> Inferring fsid e88d509a-f6fc-11ea-b25d-a0423f3ac864
> Inferring config /var/lib/ceph/e88d509a-f6fc-11ea-b25d-a0423f3ac864/osd.0/config
> Using recent ceph image cr.example.com/infra/ceph@sha256:8a0f6f285edcd6488e2c91d3f9fa43534d37d7a9b37db1e0ff6691aae6466530
> root@host0:/# ll /var/lib/ceph/osd/ceph-0/
> total 68
> drwx------ 2 ceph ceph 4096 Sep 20 04:15 ./
> drwxr-x--- 1 ceph ceph 4096 Sep 29 13:32 ../
> lrwxrwxrwx 1 ceph ceph   24 Sep 20 04:15 block -> /dev/ceph-hdd/osd.0.data
> lrwxrwxrwx 1 ceph ceph   23 Sep 20 04:15 block.db -> /dev/ubuntu-vg/osd.0.db
> -rw------- 1 ceph ceph   37 Sep 20 04:15 ceph_fsid
> -rw------- 1 ceph ceph  387 Jun 21 13:24 config
> -rw------- 1 ceph ceph   37 Sep 20 04:15 fsid
> -rw------- 1 ceph ceph   55 Sep 20 04:15 keyring
> -rw------- 1 ceph ceph    6 Sep 20 04:15 ready
> -rw------- 1 ceph ceph    3 Apr  2 01:46 require_osd_release
> -rw------- 1 ceph ceph   10 Sep 20 04:15 type
> -rw------- 1 ceph ceph   38 Sep 17 14:26 unit.configured
> -rw------- 1 ceph ceph   48 Nov  9  2020 unit.created
> -rw------- 1 ceph ceph   35 Sep 17 14:26 unit.image
> -rw------- 1 ceph ceph  306 Sep 17 14:26 unit.meta
> -rw------- 1 ceph ceph 1317 Sep 17 14:26 unit.poststop
> -rw------- 1 ceph ceph 3021 Sep 17 14:26 unit.run
> -rw------- 1 ceph ceph  142 Sep 17 14:26 unit.stop
> -rw------- 1 ceph ceph    2 Sep 20 04:15 whoami
>
> From: Eugen Block <mailto:eblock@xxxxxx>
> Sent: 29 September 2021 21:29
> To: Igor Fedotov <mailto:ifedotov@xxxxxxx>
> Cc: 胡 玮文 <mailto:huww98@xxxxxxxxxxx>; Szabo, Istvan (Agoda) <mailto:Istvan.Szabo@xxxxxxxxx>; ceph-users@xxxxxxx <mailto:ceph-users@xxxxxxx>
> Subject: Re: Re: Re: [ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)
>
> Hi Igor,
>
> thanks for your input. I haven't done this in a prod env yet either,
> still playing around in a virtual lab env.
> I tried the symlink suggestion but it's not that easy, because the
> layout underneath the ceph directory is different from what
> ceph-volume expects. These are the services underneath:
>
> ses7-host1:~ # ll /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/
> insgesamt 48
> drwx------ 3 root       root   4096 16. Sep 16:11 alertmanager.ses7-host1
> drwx------ 3 ceph       ceph   4096 29. Sep 09:03 crash
> drwx------ 2 ceph       ceph   4096 16. Sep 16:39 crash.ses7-host1
> drwx------ 4 messagebus lp     4096 16. Sep 16:23 grafana.ses7-host1
> drw-rw---- 2 root       root   4096 24. Aug 10:00 home
> drwx------ 2 ceph       ceph   4096 16. Sep 16:37 mgr.ses7-host1.wmgyit
> drwx------ 3 ceph       ceph   4096 16. Sep 16:37 mon.ses7-host1
> drwx------ 2 nobody     nobody 4096 16. Sep 16:37 node-exporter.ses7-host1
> drwx------ 2 ceph       ceph   4096 29. Sep 08:43 osd.0
> drwx------ 2 ceph       ceph   4096 29. Sep 15:11 osd.1
> drwx------ 4 root       root   4096 16. Sep 16:12 prometheus.ses7-host1
>
>
> While the directory in a non-containerized deployment looks like this:
>
> nautilus:~ # ll /var/lib/ceph/osd/ceph-0/
> insgesamt 24
> lrwxrwxrwx 1 ceph ceph 93 29. Sep 12:21 block -> /dev/ceph-a6d78a29-637f-494b-a839-76251fcff67e/osd-block-39340a48-54b3-4689-9896-f54d005c535d
> -rw------- 1 ceph ceph 37 29. Sep 12:21 ceph_fsid
> -rw------- 1 ceph ceph 37 29. Sep 12:21 fsid
> -rw------- 1 ceph ceph 55 29. Sep 12:21 keyring
> -rw------- 1 ceph ceph  6 29. Sep 12:21 ready
> -rw------- 1 ceph ceph 10 29. Sep 12:21 type
> -rw------- 1 ceph ceph  2 29. Sep 12:21 whoami
>
>
> But even if I create the symlink to the OSD directory it fails, because
> I only have ceph-volume within the containers, and the symlink is not
> visible there:
>
> ses7-host1:~ # ll /var/lib/ceph/osd/ceph-1
> lrwxrwxrwx 1 root root 57 29. Sep 15:08 /var/lib/ceph/osd/ceph-1 -> /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/osd.1/
>
> ses7-host1:~ # cephadm ceph-volume lvm migrate --osd-id 1 --osd-fsid b4c772aa-07f8-483d-ae58-0ab97b8d0cc4 --from db --target ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
> Inferring fsid 152fd738-01bc-11ec-a7fd-fa163e672db2
> [...]
> /usr/bin/podman: stderr --> Migrate to existing, Source: ['--devs-source', '/var/lib/ceph/osd/ceph-1/block.db'] Target: /var/lib/ceph/osd/ceph-1/block
> /usr/bin/podman: stderr  stdout: inferring bluefs devices from bluestore path
> /usr/bin/podman: stderr  stderr: can't migrate /var/lib/ceph/osd/ceph-1/block.db, not a valid bluefs volume
> /usr/bin/podman: stderr --> Failed to migrate device, error code:1
> /usr/bin/podman: stderr --> Undoing lv tag set
> /usr/bin/podman: stderr Failed to migrate to : ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
> Traceback (most recent call last):
>   File "/usr/sbin/cephadm", line 6225, in <module>
>     r = args.func()
>   File "/usr/sbin/cephadm", line 1363, in _infer_fsid
>     return func()
>   File "/usr/sbin/cephadm", line 1422, in _infer_image
>     return func()
>   File "/usr/sbin/cephadm", line 3687, in command_ceph_volume
>     out, err, code = call_throws(c.run_cmd(), verbosity=CallVerbosity.VERBOSE)
>   File "/usr/sbin/cephadm", line 1101, in call_throws
>     raise RuntimeError('Failed command: %s' % ' '.join(command))
> [...]
>
>
> I could install the ceph-osd package (which contains ceph-volume),
> but it's not available by default (as you can see, this is a SES 7
> environment).
>
> I'm not sure what the design is here; it feels like the ceph-volume
> migrate command is not applicable to containers yet.
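
For reference, a rough sketch of the workflow suggested at the top of this thread (stop the OSD, enter the per-daemon cephadm shell, migrate, restart). This is untested here; the OSD id, OSD fsid and target VG/LV are placeholders, and it assumes a release whose ceph-volume already has the "lvm migrate" subcommand:

---snip---
# on the OSD host
ceph osd set noout
ceph orch daemon stop osd.1

# the per-daemon shell maps /var/lib/ceph/<cluster-fsid>/osd.1
# to /var/lib/ceph/osd/ceph-1 inside the container
cephadm shell -n osd.1

# inside the container: fold the DB back into the main block device
# (add "wal" to --from if there is a separate WAL device)
ceph-volume lvm migrate --osd-id 1 --osd-fsid <osd-fsid> \
    --from db --target <block-vg>/<block-lv>
exit

# back on the host
ceph orch daemon start osd.1
ceph osd unset noout
---snip---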
>
> Regards,
> Eugen
>
>
> Quoting Igor Fedotov <ifedotov@xxxxxxx>:
>
>> Hi Eugen,
>>
>> indeed this looks like an issue related to containerized deployment,
>> "ceph-volume lvm migrate" expects the OSD folder to be under
>> /var/lib/ceph/osd:
>>
>>>  stderr: 2021-09-29T06:56:24.787+0000 7fde05b96180 -1 bluestore(/var/lib/ceph/osd/ceph-1) _lock_fsid failed to lock /var/lib/ceph/osd/ceph-1/fsid (is another ceph-osd still running?)(11) Resource temporarily unavailable
>>
>> As a workaround you might want to try to create a symlink to your
>> actual location before issuing the migrate command:
>> /var/lib/ceph/osd -> /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/
>>
>> A more complicated (and more general, IMO) way would be to run the
>> migrate command from within a container deployed similarly (i.e.
>> with all the proper subfolder mappings) to the ceph-osd one. Just
>> speculating - I'm not a big expert in containers and have never tried
>> that with a properly deployed production cluster...
>>
>>
>> Thanks,
>>
>> Igor
>>
>> On 9/29/2021 10:07 AM, Eugen Block wrote:
>>> Hi,
>>>
>>> I just tried with 'ceph-volume lvm migrate' in Octopus but it
>>> doesn't really work. I'm not sure if I'm missing something here,
>>> but I believe it's again the already discussed container issue. To
>>> be able to run the command for an OSD, the OSD has to be offline,
>>> but then you don't have access to the block.db because the path is
>>> different from outside the container:
>>>
>>> ---snip---
>>> [ceph: root@host1 /]# ceph-volume lvm migrate --osd-id 1 --osd-fsid b4c772aa-07f8-483d-ae58-0ab97b8d0cc4 --from db --target ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
>>> --> Migrate to existing, Source: ['--devs-source', '/var/lib/ceph/osd/ceph-1/block.db'] Target: /var/lib/ceph/osd/ceph-1/block
>>>  stdout: inferring bluefs devices from bluestore path
>>>  stderr: /home/abuild/rpmbuild/BUILD/ceph-15.2.14-84-gb6e5642e260/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::_mount_for_bluefs()' thread 7fde05b96180 time 2021-09-29T06:56:24.790161+0000
>>>  stderr: /home/abuild/rpmbuild/BUILD/ceph-15.2.14-84-gb6e5642e260/src/os/bluestore/BlueStore.cc: 6876: FAILED ceph_assert(r == 0)
>>>  stderr: 2021-09-29T06:56:24.787+0000 7fde05b96180 -1 bluestore(/var/lib/ceph/osd/ceph-1) _lock_fsid failed to lock /var/lib/ceph/osd/ceph-1/fsid (is another ceph-osd still running?)(11) Resource temporarily unavailable
>>>
>>>
>>> # path outside
>>> host1:~ # ll /var/lib/ceph/152fd738-01bc-11ec-a7fd-fa163e672db2/osd.1/
>>> insgesamt 60
>>> lrwxrwxrwx 1 ceph ceph 93 29. Sep 08:43 block -> /dev/ceph-b1ddff4b-95e8-4b91-b451-a3ea35d16ec0/osd-block-b4c772aa-07f8-483d-ae58-0ab97b8d0cc4
>>> lrwxrwxrwx 1 ceph ceph 90 29. Sep 08:43 block.db -> /dev/ceph-6f1b8f49-daf2-4631-a2ef-12e9452b01ea/osd-db-69b11aa0-af96-443e-8f03-5afa5272131f
>>> ---snip---
>>>
>>>
>>> But if I shut down the OSD I can't access the block and block.db
>>> devices. I'm not even sure how this is supposed to work with
>>> cephadm. Maybe I'm misunderstanding, though. Or is there a way to
>>> provide the offline block.db path to 'ceph-volume lvm migrate'?
>>>
>>>
>>>
>>> Quoting 胡 玮文 <huww98@xxxxxxxxxxx>:
>>>
>>>> You may need to use `ceph-volume lvm migrate` [1] instead of
>>>> ceph-bluestore-tool. If I recall correctly, this is a pretty new
>>>> feature; I'm not sure whether it is available in your version.
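
The generic form is roughly the following (the OSD must be stopped first; the id, fsid and VG/LV names are placeholders). Unlike the ceph-bluestore-tool route, it is supposed to take care of the LVM tags as well:

---snip---
ceph-volume lvm migrate --osd-id <id> --osd-fsid <osd-fsid> \
    --from db --target <block-vg>/<block-lv>
---snip---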
>>>>
>>>> If you use ceph-bluestore-tool, then you need to modify the LVM
>>>> tags manually. Please refer to the previous threads, e.g. [2] and
>>>> some more.
>>>>
>>>> [1]: https://docs.ceph.com/en/latest/man/8/ceph-volume/#migrate
>>>> [2]: https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/VX23NQ66P3PPEX36T3PYYMHPLBSFLMYA/#JLNDFGXR4ZLY27DHD3RJTTZEDHRZJO4Q
>>>>
>>>> From: Szabo, Istvan (Agoda) <mailto:Istvan.Szabo@xxxxxxxxx>
>>>> Sent: 28 September 2021 18:20
>>>> To: Eugen Block <mailto:eblock@xxxxxx>; ceph-users@xxxxxxx <mailto:ceph-users@xxxxxxx>
>>>> Subject: Re: is it possible to remove the db+wal from an external device (nvme)
>>>>
>>>> Gave it a try, and all 3 OSDs finally failed :/ Not sure what went wrong.
>>>>
>>>> I did the normal maintenance things (ceph osd set noout, ceph osd set
>>>> norebalance), stopped the OSD and ran this command:
>>>> ceph-bluestore-tool bluefs-bdev-migrate --dev-target /var/lib/ceph/osd/ceph-0/block --devs-source /var/lib/ceph/osd/ceph-8/block.db --path /var/lib/ceph/osd/ceph-8/
>>>> Output:
>>>> device removed:1 /var/lib/ceph/osd/ceph-8/block.db
>>>> device added: 1 /dev/dm-2
>>>>
>>>> When I tried to start it I got this in the log:
>>>> osd.8 0 OSD:init: unable to mount object store
>>>> ** ERROR: osd init failed: (13) Permission denied
>>>> set uid:gid to 167:167 (ceph:ceph)
>>>> ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus (stable), process ceph-osd, pid 1512261
>>>> pidfile_write: ignore empty --pid-file
>>>>
>>>> On the other 2 OSDs the block.db was removed and I could start them back up.
>>>> I zapped the db drive just to remove it from the device completely,
>>>> and after a machine restart neither of these 2 OSDs came back; I
>>>> guess they are missing the db device.
>>>>
>>>> Are any steps missing?
>>>> 1. noout + norebalance
>>>> 2. stop the OSD
>>>> 3. migrate the block.db to the block device with the above command
>>>> 4. do the same on the other OSDs that share the db device I want to remove
>>>> 5. zap the db device
>>>> 6. start the OSDs back up
>>>>
>>>> Istvan Szabo
>>>> Senior Infrastructure Engineer
>>>> ---------------------------------------------------
>>>> Agoda Services Co., Ltd.
>>>> e: istvan.szabo@xxxxxxxxx
>>>> ---------------------------------------------------
>>>>
>>>> -----Original Message-----
>>>> From: Eugen Block <eblock@xxxxxx>
>>>> Sent: Monday, September 27, 2021 7:42 PM
>>>> To: ceph-users@xxxxxxx
>>>> Subject: Re: is it possible to remove the db+wal from an external device (nvme)
>>>>
>>>> Email received from the internet. If in doubt, don't click any link nor open any attachment !
>>>> ________________________________
>>>>
>>>> Hi,
>>>>
>>>> I think 'ceph-bluestore-tool bluefs-bdev-migrate' could be of use
>>>> here. I haven't tried it in a production environment yet, only in
>>>> virtual labs.
>>>>
>>>> Regards,
>>>> Eugen
>>>>
>>>>
>>>> Quoting "Szabo, Istvan (Agoda)" <Istvan.Szabo@xxxxxxxxx>:
>>>>
>>>>> Hi,
>>>>>
>>>>> It seems like in our setup the NVMe device used as wal+db in front
>>>>> of the SSDs is slowing down the SSD OSDs.
>>>>> I'd like to avoid rebuilding all the OSDs; is there a way to
>>>>> migrate the wal+db to the "slower device" without a reinstall?
>>>>>
>>>>> Ty
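
For completeness: the bluefs-bdev-migrate command quoted above points --dev-target at osd.0's block device while --path and --devs-source refer to osd.8, which may well be part of what went wrong there. Below is a hedged sketch of the ceph-bluestore-tool variant on a non-containerized host, including the manual LVM tag cleanup mentioned earlier in the thread. Device paths, VG/LV names and tag values are placeholders, and the ownership fix at the end is only a guess at the "Permission denied" error above:

---snip---
# example for osd.8; check the actual LVM tags first with: lvs -o +lv_tags
ceph osd set noout
systemctl stop ceph-osd@8

# fold block.db back into the main block device
ceph-bluestore-tool bluefs-bdev-migrate \
    --path /var/lib/ceph/osd/ceph-8 \
    --devs-source /var/lib/ceph/osd/ceph-8/block.db \
    --dev-target /var/lib/ceph/osd/ceph-8/block

# ceph-bluestore-tool does not update the LVM metadata, so drop the db tags
# (ceph.db_device, ceph.db_uuid, and any other ceph.db_* you find) from the data LV
lvchange --deltag "ceph.db_device=<old-db-device>" \
         --deltag "ceph.db_uuid=<old-db-uuid>" <block-vg>/<block-lv>

# remove the stale block.db symlink if it is still present, and make sure
# the OSD directory is owned by ceph again
rm -f /var/lib/ceph/osd/ceph-8/block.db
chown -R ceph:ceph /var/lib/ceph/osd/ceph-8

systemctl start ceph-osd@8
ceph osd unset noout
---snip---

Only after all OSDs sharing the NVMe have been migrated and start cleanly should the db device be zapped.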