Re: Filestore to Bluestore migration question

If /dev/hdd67/data67 does not exist, try `vgchange -a y` and that should make it exist, then try again. Not sure why this would ever happen, though, since I expect lower level stuff to take care of activating LVM LVs.
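
A rough sketch of that check, assuming the VG really is named hdd67 as in your error output (plain LVM commands, nothing Ceph-specific):

    lvs hdd67                  # is data67 listed, and is it marked active?
    vgchange -a y hdd67        # activate every LV in the hdd67 VG
    ls -l /dev/hdd67/data67    # the device node should exist after activation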

If it does exist, I get the feeling that your original ceph-volume prepare command created the OSD filesystems in your root filesystem as plain files (probably because the OSD directories already existed for some reason). In that case, yes, you should re-create them, since the first time around it wasn't done correctly. Before you do that, make sure you unmount the tmpfs that is now mounted, that no OSD directories remain for your BlueStore OSDs, that you remove the OSDs from the mons, etc. You want your environment to be clean so everything works as it should. It might also be worth removing and re-creating the LVM LVs to make sure the tags are gone.
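
A rough sketch of that cleanup for a single OSD, assuming OSD 67 and the hdd67/data67 LV as the example and that the LV takes up the whole VG (adjust IDs and names, and double-check each step before removing anything):

    systemctl stop ceph-osd@67
    umount /var/lib/ceph/osd/ceph-67          # drop the tmpfs mount
    rm -rf /var/lib/ceph/osd/ceph-67          # remove the leftover OSD directory
    ceph osd purge 67 --yes-i-really-mean-it  # remove the OSD from the mons (CRUSH, osdmap, auth)
    lvremove hdd67/data67                     # wipe the LV (and its ceph-volume tags)
    lvcreate -l 100%FREE -n data67 hdd67      # re-create it before running prepare again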

On November 7, 2018 6:12:43 AM GMT+09:00, "Hayashida, Mami" <mami.hayashida@xxxxxxx> wrote:
This is becoming even more confusing. I got rid of those ceph-disk@6[0-9].service units (which had been symlinked to /dev/null), moved /var/lib/ceph/osd/ceph-6[0-9] to /var/...../osd_old/, and then ran `ceph-volume lvm activate --all`. Once again I got:

root@osd1:~# ceph-volume lvm activate --all
--> Activating OSD ID 67 FSID 17cd6755-76f9-4160-906c-1bf13d09fb3d
Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-67
--> Absolute path not found for executable: restorecon
--> Ensure $PATH environment variable contains common executable locations
Running command: ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/hdd67/data67 --path /var/lib/ceph/osd/ceph-67
 stderr: failed to read label for /dev/hdd67/data67: (2) No such file or directory
-->  RuntimeError: command returned non-zero exit status: 1
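
(For reference, the failing label read can be reproduced outside of ceph-volume, which would tell a missing device node apart from an unreadable label; a minimal check, assuming the same LV path:

    ls -l /dev/hdd67/data67
    ceph-bluestore-tool show-label --dev /dev/hdd67/data67
)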

But when I ran `df` and `mount`, ceph-67 is the only one that shows up (and it is the only one present in /var/lib/ceph/osd/):

root@osd1:~# df -h | grep ceph-6
tmpfs           126G     0  126G   0% /var/lib/ceph/osd/ceph-67

root@osd1:~# mount | grep ceph-6
tmpfs on /var/lib/ceph/osd/ceph-67 type tmpfs (rw,relatime)

root@osd1:~# ls /var/lib/ceph/osd/ | grep ceph-6
ceph-67

But I cannot restart any of these 10 daemons (`systemctl start ceph-osd@6[0-9]`).

I am wondering if I should zap these 10 OSDs and start over, although at this point I am afraid even zapping may not be a simple task...
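
(If it does come to zapping, the usual ceph-volume route would be roughly the following per OSD, assuming the same VG/LV naming; just a sketch, I have not run this here:

    ceph-volume lvm zap hdd67/data67            # wipe the LV contents
    ceph-volume lvm zap --destroy hdd67/data67  # or wipe and also remove the LV/VG
)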



On Tue, Nov 6, 2018 at 3:44 PM, Hector Martin <hector@xxxxxxxxxxxxxx> wrote:
On 11/7/18 5:27 AM, Hayashida, Mami wrote:
> 1. Stopped osd.60-69:  no problem
> 2. Skipped this and went to #3 to check first
> 3. Here, `find /etc/systemd/system | grep ceph-volume` returned
> nothing.  I see in that directory 
>
> /etc/systemd/system/ceph-disk@60.service    # and 61 - 69. 
>
> No ceph-volume entries.

Get rid of those; they also shouldn't be there. Then run `systemctl
daemon-reload` and continue, and see if you get into a good state.
Basically, feel free to nuke anything in there related to OSDs 60-69,
since whatever is needed should be taken care of by the ceph-volume
activation.
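
Roughly, and assuming the entries really are just the ceph-disk@6[0-9]
units your `find` turned up (worth eyeballing them first):

    rm /etc/systemd/system/ceph-disk@6{0..9}.service
    systemctl daemon-reload
    ceph-volume lvm activate --all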


--
Hector Martin (hector@xxxxxxxxxxxxxx)
Public Key: https://mrcn.st/pub



--
Mami Hayashida
Research Computing Associate

Research Computing Infrastructure
University of Kentucky Information Technology Services
301 Rose Street | 102 James F. Hardymon Building
Lexington, KY 40506-0495
mami.hayashida@xxxxxxx
(859)323-7521

--
Hector Martin "marcan" (hector@xxxxxxxxxxxxxx)
Public key: https://mrcn.st/pub
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
