Re: How to remove /var/lib/ceph/osd/ceph-2?

Craig Lewis <clewis@xxxxxxxxxxxxxxxxxx> · Mon, 24 Jun 2013 14:23:39 -0700



    I also have problems keeping time in sync on VMware virtual
      machines. My problems occur mostly when the VM host is
      oversubscribed, or when I'm doing stress tests. I ended up
      disabling ntpd in the guests and enabling Host Time Sync
      using the VMware Guest Tools. All of my VMware hosts run ntpd,
      using the same NTP servers.

      
      That's my development cluster. For production, I'm using ntpd on
      real servers.

      
              Craig Lewis
              

               Senior Systems Engineer

                Office +1.714.602.1309

                Email clewis@xxxxxxxxxxxxxxxxxx
               

                
      On 6/18/13 05:41 , Da Chun wrote:

    
        Thanks! Craig.
        umount works.
        

        About the time skew: the log says the time difference should
          be less than 50 ms. I set up one of my nodes as the time
          server, and the others sync their time with it. I don't know
          why the system time still drifts frequently, especially after
          a reboot. Maybe it's because all my nodes are VMware virtual
          machines and the guest software clock is not accurate enough.
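
One way to compare a node's clock offset with the 50 ms figure is to read the offset column that `ntpq -p` reports for the currently selected peer (the line that starts with `*`). The following is only a sketch: it assumes ntpd and ntpq are installed on the node, and the function name `check_offset` and its 50 ms default are illustrative, not anything Ceph provides.

```shell
#!/bin/sh
# check_offset [LIMIT_MS]   (hypothetical helper, not part of Ceph or ntp)
# Reads `ntpq -p` style output on stdin and succeeds only if the selected
# peer (the line starting with '*') has an absolute offset below LIMIT_MS.
# ntpq prints the offset in milliseconds in the second-to-last column.
check_offset() {
    awk -v limit="${1:-50}" '
        /^\*/ {
            off = $(NF - 1) + 0          # force numeric interpretation
            if (off < 0) off = -off      # absolute value
            seen = 1
            ok = (off < limit)
        }
        END { exit !(seen && ok) }       # fail if no peer is selected
    '
}
```

A possible use on each node would be `ntpq -pn | check_offset 50 && echo "offset under 50 ms" || echo "skew too large or no peer selected"`. Treating "no selected peer" as a failure is a deliberate choice here, since a guest that has not synchronized yet is exactly the case described above.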
        

        ------------------ Original ------------------
        
          From: "Craig Lewis" <clewis@xxxxxxxxxxxxxxxxxx>
          Date: Tue, Jun 18, 2013 05:34 AM
          To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
          Subject: Re: How to remove /var/lib/ceph/osd/ceph-2?
        
        
        If you followed the standard setup, each OSD is its own disk +
          filesystem. /var/lib/ceph/osd/ceph-2 is in use as the mount
          point for the osd.2 filesystem. Double-check by examining the
          output of the `mount` command.
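
The same check can be scripted. This is a minimal sketch that assumes a Linux host where /proc/mounts lists one mount per line with the mount target in the second field; the helper name `is_mount_point` is made up for this example.

```shell
#!/bin/sh
# is_mount_point DIR [MOUNT_TABLE]   (hypothetical helper)
# Succeeds if DIR appears as a mount target in the mount table.
# MOUNT_TABLE defaults to /proc/mounts, where field 2 is the mount target.
is_mount_point() {
    dir=$1
    table=${2:-/proc/mounts}
    awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' "$table"
}

osd_dir=/var/lib/ceph/osd/ceph-2
if is_mount_point "$osd_dir"; then
    echo "$osd_dir is a mount point; umount it before mv or rm"
else
    echo "$osd_dir is not a mount point"
fi
```

The optional second argument exists only so the function can be tried against a saved copy of the mount table instead of the live one.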

          
          I get the same error when I try to rename a directory that's
          used as a mount point.

          
          Try `umount /var/lib/ceph/osd/ceph-2` instead of the mv and
          rm. The fuser command is telling you that the kernel has a
          filesystem mounted on that directory. Nothing else appears to
          be using it, so the umount should complete successfully.
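
Put together, the cleanup order is unmount first, then remove the now-unused directory. The sketch below only prints the commands unless `RUN=1` is set; the `run` and `cleanup_osd_dir` helpers are hypothetical, and using `rmdir` rather than `rm -r` is an assumption made here so that the removal fails if the directory turns out not to be empty.

```shell
#!/bin/sh
# run CMD...   print the command, or execute it when RUN=1 (needs root).
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# cleanup_osd_dir DIR   unmount the OSD filesystem, then remove the
# empty mount point. The rmdir step is skipped if umount fails.
cleanup_osd_dir() {
    dir=$1
    run umount "$dir" && run rmdir "$dir"
}

cleanup_osd_dir /var/lib/ceph/osd/ceph-2
```

Run as is, this prints `+ umount /var/lib/ceph/osd/ceph-2` followed by `+ rmdir /var/lib/ceph/osd/ceph-2`, which makes it easy to review the steps before running them for real.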

          
          Also, you should fix that time skew on mon.ceph-node5. The
          mailing list archives have several posts with good answers.

          
          On 6/15/2013 2:14 AM, Da Chun wrote:

        
          Hi all,
          On Ubuntu 13.04 with ceph 0.61.3.
          I want to remove osd.2 from my cluster. The following steps were
              performed:
          
            root@ceph-node6:~# ceph osd out osd.2
            marked out osd.2.
            root@ceph-node6:~# ceph -w
               health HEALTH_WARN clock skew detected on mon.ceph-node5
               monmap e1: 3 mons at {ceph-node4=172.18.46.34:6789/0,ceph-node5=172.18.46.35:6789/0,ceph-node6=172.18.46.36:6789/0}, election epoch 124, quorum 0,1,2 ceph-node4,ceph-node5,ceph-node6
               osdmap e414: 6 osds: 5 up, 5 in
                pgmap v10540: 456 pgs: 456 active+clean; 12171 MB data, 24325 MB used, 50360 MB / 74685 MB avail
               mdsmap e102: 1/1/1 up {0=ceph-node4=up:active}

            2013-06-15 16:55:22.096059 mon.0 [INF] pgmap v10540: 456 pgs: 456 active+clean; 12171 MB data, 24325 MB used, 50360 MB / 74685 MB avail
            ^C
            root@ceph-node6:~# stop ceph-osd id=2
            ceph-osd stop/waiting
            root@ceph-node6:~# ceph osd crush remove osd.2
            removed item id 2 name 'osd.2' from crush map
            root@ceph-node6:~# ceph auth del osd.2
            updated
            root@ceph-node6:~# ceph osd rm 2
            removed osd.2
            root@ceph-node6:~# mv /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2.bak
            mv: cannot move ‘/var/lib/ceph/osd/ceph-2’ to ‘/var/lib/ceph/osd/ceph-2.bak’: Device or resource busy
          
          
          Everything was working OK until the last step to remove
            the osd.2 directory /var/lib/ceph/osd/ceph-2.
          
            root@ceph-node6:~# fuser -v /var/lib/ceph/osd/ceph-2
                                 USER        PID ACCESS COMMAND
            /var/lib/ceph/osd/ceph-2:
                                 root     kernel mount /var/lib/ceph/osd/ceph-2    ////////////////// What does this mean?
            root@ceph-node6:~# lsof +D /var/lib/ceph/osd/ceph-2
            root@ceph-node6:~#
          
          
          I restarted the system, and found that the osd.2 daemon
            was still running:
          
            root@ceph-node6:~# ps aux | grep osd
            root      1264  1.4 12.3 550940 125732 ?        Ssl  16:41   0:20 /usr/bin/ceph-osd --cluster=ceph -i 2 -f
            root      2876  0.0  0.0   4440    628 ?        Ss   16:44   0:00 /bin/sh -e -c /usr/bin/ceph-osd --cluster="${cluster:-ceph}" -i "$id" -f /bin/sh
            root      2877  4.9 18.2 613780 185676 ?        Sl   16:44   1:04 /usr/bin/ceph-osd --cluster=ceph -i 5 -f
          
          
          I had to use this workaround:
          
            root@ceph-node6:~# rm -rf /var/lib/ceph/osd/ceph-2
            rm: cannot remove ‘/var/lib/ceph/osd/ceph-2’: Device or resource busy
            root@ceph-node6:~# ls /var/lib/ceph/osd/ceph-2
            root@ceph-node6:~# shutdown -r now
          
          ....
          
            root@ceph-node6:~# ps aux | grep osd
            root      1416  0.0  0.0   4440    628 ?        Ss   17:10   0:00 /bin/sh -e -c /usr/bin/ceph-osd --cluster="${cluster:-ceph}" -i "$id" -f /bin/sh
            root      1417  8.9  5.8 468052  59868 ?        Sl   17:10   0:02 /usr/bin/ceph-osd --cluster=ceph -i 5 -f
          
          
            root@ceph-node6:~# rm -r /var/lib/ceph/osd/ceph-2
            root@ceph-node6:~#
          
          
          Any idea? HELP!
          

          _______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

        