Re: How to remove /var/lib/ceph/osd/ceph-2?

I also have problems keeping time in sync on VMware virtual machines.  The problems occur most often when the VM host is oversubscribed, or when I'm running stress tests.  I ended up disabling ntpd in the guests and enabling host time sync through the VMware guest tools.  All of my VMware hosts run ntpd, using the same NTP servers.

That's my development cluster.  For production, I'm using ntpd on real servers.



Craig Lewis
Senior Systems Engineer
Office +1.714.602.1309
Email clewis@xxxxxxxxxxxxxxxxxx


On 6/18/13 05:41, Da Chun wrote:

Thanks, Craig!
umount works.

About the time skew: the log said the time difference should be less than 50 ms. I set up one of my nodes as the time server, and the others sync their time with it. I don't know why the system time still changes frequently, especially after a reboot. Maybe it's because all my nodes are VMware virtual machines and the software clock is not accurate enough.
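
One way to see how far a node is from its time source is to read the offset that ntpd reports. The following is a minimal sketch, assuming ntpd is running and that `ntpq -p` prints its usual peer table, where the selected peer's line starts with `*` and the ninth column is the offset in milliseconds. The function names are hypothetical, and the 50 ms default is simply the limit quoted from the log above.

```shell
#!/bin/sh
# Sketch: print the offset (in ms) of the peer that ntpd has currently selected.
# Reads `ntpq -p` style output on stdin; the selected peer's line starts with '*'.
selected_offset_ms() {
    awk '/^\*/ { print $9; exit }'
}

# Return 0 (true) if the absolute offset in ms exceeds the limit.
# The limit defaults to 50, the figure quoted from the log (an assumption here).
offset_exceeds() {
    awk -v off="$1" -v lim="${2:-50}" 'BEGIN {
        off += 0; lim += 0          # force numeric comparison
        if (off < 0) off = -off     # compare the absolute value
        exit (off > lim) ? 0 : 1
    }'
}

# Example use on a live node (not executed here):
#   off=$(ntpq -p | selected_offset_ms)
#   offset_exceeds "$off" && echo "offset ${off} ms is over the limit"
```

If no peer is selected yet, which is common right after a reboot, `selected_offset_ms` prints nothing, so the result should be checked for being empty before it is compared.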

------------------ Original ------------------
From:  "Craig Lewis"<clewis@xxxxxxxxxxxxxxxxxx>;
Date:  Tue, Jun 18, 2013 05:34 AM
To:  "ceph-users"<ceph-users@xxxxxxxxxxxxxx>;
Subject:  Re: How to remove /var/lib/ceph/osd/ceph-2?

If you followed the standard setup, each OSD has its own disk and filesystem.  /var/lib/ceph/osd/ceph-2 is in use as the mount point for the osd.2 filesystem.  Double check by examining the output of the `mount` command.

I get the same error when I try to rename a directory that's used as a mount point.

Try `umount /var/lib/ceph/osd/ceph-2` instead of the mv and rm. The fuser command is telling you that the kernel has a filesystem mounted in that directory.  Nothing else appears to be using it, so the umount should complete successfully.
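
The check and the unmount can be combined in a small script. This is only a sketch, assuming a Linux host where `/proc/mounts` lists one mount per line with the mount point in the second column. The function names are hypothetical, and paths containing spaces (which `/proc/mounts` escapes) are not handled.

```shell
#!/bin/sh
# Sketch: report whether a directory appears as a mount point in a mounts table.
# The table is expected in /proc/mounts format: device, mount point, fstype, ...
# A different table file can be passed as the second argument.
is_mount_point() {
    dir=$1
    table=${2:-/proc/mounts}
    awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' "$table"
}

# Unmount the directory only if it is currently mounted.
unmount_if_mounted() {
    if is_mount_point "$1"; then
        umount "$1"
    else
        echo "$1 is not a mount point" >&2
    fi
}

# Example use (as root, not executed here):
#   unmount_if_mounted /var/lib/ceph/osd/ceph-2
```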


Also, you should fix that time skew on mon.ceph-node5.  The mailing list archives have several posts with good answers.


On 6/15/2013 2:14 AM, Da Chun wrote:
Hi all,
I'm on Ubuntu 13.04 with Ceph 0.61.3.
I want to remove osd.2 from my cluster. The following steps were performed:
root@ceph-node6:~# ceph osd out osd.2
marked out osd.2.
root@ceph-node6:~# ceph -w
   health HEALTH_WARN clock skew detected on mon.ceph-node5
   monmap e1: 3 mons at {ceph-node4=172.18.46.34:6789/0,ceph-node5=172.18.46.35:6789/0,ceph-node6=172.18.46.36:6789/0}, election epoch 124, quorum 0,1,2 ceph-node4,ceph-node5,ceph-node6
   osdmap e414: 6 osds: 5 up, 5 in
    pgmap v10540: 456 pgs: 456 active+clean; 12171 MB data, 24325 MB used, 50360 MB / 74685 MB avail
   mdsmap e102: 1/1/1 up {0=ceph-node4=up:active}

2013-06-15 16:55:22.096059 mon.0 [INF] pgmap v10540: 456 pgs: 456 active+clean; 12171 MB data, 24325 MB used, 50360 MB / 74685 MB avail
^C
root@ceph-node6:~# stop ceph-osd id=2
ceph-osd stop/waiting
root@ceph-node6:~# ceph osd crush remove osd.2
removed item id 2 name 'osd.2' from crush map
root@ceph-node6:~# ceph auth del osd.2
updated
root@ceph-node6:~# ceph osd rm 2
removed osd.2
root@ceph-node6:~# mv /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2.bak
mv: cannot move ‘/var/lib/ceph/osd/ceph-2’ to ‘/var/lib/ceph/osd/ceph-2.bak’: Device or resource busy
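
For reference, the sequence above can be written as a dry-run plan with the `umount` from the reply placed before the directory is touched. This is a sketch, not a verified procedure: the function name is hypothetical, the final `umount` and `rmdir` steps are additions that assume the directory is a dedicated mount point for the OSD, and the function only prints the commands so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch: print the OSD removal steps for a given OSD id, in order,
# without running any of them.
osd_removal_plan() {
    id=$1
    dir=/var/lib/ceph/osd/ceph-$id
    cat <<EOF
ceph osd out osd.$id
stop ceph-osd id=$id
ceph osd crush remove osd.$id
ceph auth del osd.$id
ceph osd rm $id
umount $dir
rmdir $dir
EOF
}

# Example use (not executed here):
#   osd_removal_plan 2
```

`rmdir` is used for the last step instead of `rm -r` because it refuses to delete a directory that still has contents, which guards against removing data if the unmount did not happen.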

Everything was working OK until the last step to remove the osd.2 directory /var/lib/ceph/osd/ceph-2.
root@ceph-node6:~# fuser -v /var/lib/ceph/osd/ceph-2
                     USER        PID ACCESS COMMAND
/var/lib/ceph/osd/ceph-2:
                     root     kernel mount /var/lib/ceph/osd/ceph-2   <-- What does this mean?
root@ceph-node6:~# lsof +D /var/lib/ceph/osd/ceph-2
root@ceph-node6:~#

I restarted the system, and found that the osd.2 daemon was still running:
root@ceph-node6:~# ps aux | grep osd
root      1264  1.4 12.3 550940 125732 ?       Ssl  16:41   0:20 /usr/bin/ceph-osd --cluster=ceph -i 2 -f
root      2876  0.0  0.0   4440   628 ?        Ss   16:44   0:00 /bin/sh -e -c /usr/bin/ceph-osd --cluster="${cluster:-ceph}" -i "$id" -f /bin/sh
root      2877  4.9 18.2 613780 185676 ?       Sl   16:44   1:04 /usr/bin/ceph-osd --cluster=ceph -i 5 -f

I had to use this workaround:
root@ceph-node6:~# rm -rf /var/lib/ceph/osd/ceph-2
rm: cannot remove ‘/var/lib/ceph/osd/ceph-2’: Device or resource busy
root@ceph-node6:~# ls /var/lib/ceph/osd/ceph-2
root@ceph-node6:~# shutdown -r now
....
root@ceph-node6:~# ps aux | grep osd
root      1416  0.0  0.0   4440   628 ?        Ss   17:10   0:00 /bin/sh -e -c /usr/bin/ceph-osd --cluster="${cluster:-ceph}" -i "$id" -f /bin/sh
root      1417  8.9  5.8 468052 59868 ?        Sl   17:10   0:02 /usr/bin/ceph-osd --cluster=ceph -i 5 -f
root@ceph-node6:~# rm -r /var/lib/ceph/osd/ceph-2
root@ceph-node6:~#

Any ideas? Help!



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

