Re: Very Basic question

Hi,
thank you for your answer:

On 11/13/2014 06:17 PM, Gregory Farnum wrote:
What does "ceph -s" output when things are working?

Does the ceph.conf on your admin node
BEFORE the problem: from ceph -w because I don't have ceph -s

[rzgceph@admin-node my-cluster]$ ceph -w
    cluster 6fa39bb3-de2d-4ec5-9a86-9d96231d8b5b
     health HEALTH_OK
     monmap e3: 3 mons at {ceph-node1=192.168.122.21:6789/0,ceph-node2=192.168.122.22:6789/0,ceph-node3=192.168.122.23:6789/0}, election epoch 6, quorum 0,1,2 ceph-node1,ceph-node2,ceph-node3
     mdsmap e4: 1/1/1 up {0=ceph-node1=up:active}
     osdmap e13: 3 osds: 3 up, 3 in
      pgmap v26: 192 pgs, 3 pools, 1889 bytes data, 21 objects
            103 MB used, 76655 MB / 76759 MB avail
                 192 active+clean

2014-11-13 17:08:43.240961 mon.0 [INF] pgmap v26: 192 pgs: 192 active+clean; 1889 bytes data, 103 MB used, 76655 MB / 76759 MB avail; 8 B/s wr, 0 op/s

> Does the ceph.conf on your admin node contain the address of each
> monitor? (Paste in the relevant lines.) It will need to, or the ceph
> tool won't be able to find the monitors even though the system is
> working.

No, only the initial one. The documentation doesn't mention that, but it is reasonable.
I added the other two. This is my ceph.conf now:

[global]
auth_service_required = cephx
filestore_xattr_use_omap = true
auth_client_required = cephx
auth_cluster_required = cephx
mon_host = 192.168.122.21 192.68.122.22 192.168.122.23
mon_initial_members = ceph-node1
fsid = 6fa39bb3-de2d-4ec5-9a86-9d96231d8b5b
osd pool default size = 2
public network = 192.168.0.0/16


and then:
ceph-deploy --overwrite-conf admin admin-node ceph-node1 ceph-node2 ceph-node3
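A sanity check one could run at this point, as a minimal sketch assuming the default /etc/ceph/ceph.conf path and the passwordless ssh that ceph-deploy already uses:

# confirm the pushed ceph.conf on each node now lists all three monitors
for h in admin-node ceph-node1 ceph-node2 ceph-node3; do
    ssh $h grep mon_host /etc/ceph/ceph.conf
done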

and now:
2014-11-13 18:24:57.522590 7fa4282d1700  0 -- 192.168.122.11:0/1003667 >> 192.168.122.23:6789/0 pipe(0x7fa418001d40 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa418001fb0).fault
2014-11-13 18:25:06.524145 7fa4283d2700  0 -- 192.168.122.11:0/1003667 >> 192.168.122.23:6789/0 pipe(0x7fa418002fa0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa418003210).fault
2014-11-13 18:25:12.525096 7fa4283d2700  0 -- 192.168.122.11:0/1003667 >> 192.168.122.23:6789/0 pipe(0x7fa418003bf0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa418003e60).fault
2014-11-13 18:25:21.526622 7fa4282d1700  0 -- 192.168.122.11:0/1003667 >> 192.168.122.23:6789/0 pipe(0x7fa4180085a0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa418008810).fault
2014-11-13 18:25:33.528831 7fa4284d3700  0 -- 192.168.122.11:0/1003667 >> 192.168.122.23:6789/0 pipe(0x7fa4180085a0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa418008810).fault
2014-11-13 18:25:42.530185 7fa4284d3700  0 -- 192.168.122.11:0/1003667 >> 192.168.122.23:6789/0 pipe(0x7fa418009740 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa4180099b0).fault
2014-11-13 18:25:51.531688 7fa4283d2700  0 -- 192.168.122.11:0/1003667 >> 192.168.122.23:6789/0 pipe(0x7fa41800a330 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa41800a5a0).fault
2014-11-13 18:26:09.534223 7fa4284d3700  0 -- 192.168.122.11:0/1003667 >> 192.168.122.23:6789/0 pipe(0x7fa41800d550 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa41800e6b0).fault

That's better: someone (ceph-node3) answers now, but not in the right way, as far as I can see.
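If ceph-node3 answers but misbehaves, its monitor can be asked directly for its own view of the cluster through the admin socket; a minimal sketch, assuming the default socket path for a monitor named ceph-node3:

# run on ceph-node3: query the local monitor daemon, bypassing the network
ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-node3.asok mon_status
ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-node3.asok quorum_status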

     Luca

-Greg
On Thu, Nov 13, 2014 at 9:11 AM Luca Mazzaferro <luca.mazzaferro@xxxxxxxxxx> wrote:
Hi,

On 11/13/2014 06:05 PM, Artem Silenkov wrote:
Hello! 

Only one monitor instance? That won't work in most cases.
Add more and make sure they reach quorum, so the cluster can survive failures.

No, three monitor instances, one on each ceph-node, as set up by the
quick-ceph-deploy guide.
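Whether those three monitors actually form a quorum can be checked from any node that still reaches a monitor; a minimal sketch:

# list the monitors currently in quorum and the election epoch
ceph quorum_status --format json-pretty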

I tried to kill one of them (the initial monitor) to see what happens, and that is what happened.
:-(
Ciao


    Luca

Regards, Silenkov Artem 
---

2014-11-13 20:02 GMT+03:00 Luca Mazzaferro <luca.mazzaferro@xxxxxxxxxx>:
Dear Users,
I followed the instruction of the storage cluster quick start here:

http://ceph.com/docs/master/start/quick-ceph-deploy/

I simulated a small storage cluster with 4 VMs: ceph-node[1,2,3] and an admin-node.
Everything worked fine until I shut down the initial monitor node (ceph-node1),
even though the other two monitors were still running.

I restarted ceph-node1, but the ceph command (run from the admin node) hangs for about 5 minutes and then fails with this error:
2014-11-13 17:33:31.711410 7f6a5b1af700  0 monclient(hunting): authenticate timed out after 300
2014-11-13 17:33:31.711522 7f6a5b1af700  0 librados: client.admin authentication error (110) Connection timed out
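While the initial monitor is down, the client can also be pointed at one of the surviving monitors explicitly instead of relying on ceph.conf; a minimal sketch, assuming ceph-node2's monitor is still up:

# -m bypasses the mon_host entries in ceph.conf
ceph -m 192.168.122.22:6789 -s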

If I go to ceph-node1 and restart the services, the status shows:

[root@ceph-node1 ~]# service ceph status
=== mon.ceph-node1 ===
mon.ceph-node1: running {"version":"0.80.7"}
=== osd.2 ===
osd.2: not running.
=== mds.ceph-node1 ===
mds.ceph-node1: running {"version":"0.80.7"}
[root@ceph-node1 ~]# service ceph status
=== mon.ceph-node1 ===
mon.ceph-node1: running {"version":"0.80.7"}
=== osd.2 ===
osd.2: not running.
=== mds.ceph-node1 ===
mds.ceph-node1: running {"version":"0.80.7"}
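Since only osd.2 shows as not running, one option is to start just that daemon with the same sysvinit script and then confirm it rejoins; a minimal sketch for the 0.80.x init script:

# on ceph-node1: start only the OSD that is down, then check it is up/in
service ceph start osd.2
ceph osd tree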

I don't understand how to properly restart a node.
Can anyone help me?
Thank you.
Cheers.

    Luca






_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
