Re: Is it normal for an orch osd rm drain to take so long?

Can do

ceph -s:
  cluster:
    id:     <fsid>
    health: HEALTH_OK
 
  services:
    mon: 4 daemons, quorum ceph05,ceph04,ceph01,ceph03 (age 4d)
    mgr: ceph01.fblojp(active, since 25h), standbys: ceph03.futetp
    mds: 1/1 daemons up, 1 standby
    osd: 32 osds: 32 up (since 9d), 31 in (since 2d); 15 remapped pgs
 
  data:
    volumes: 1/1 healthy
    pools:   6 pools, 161 pgs
    objects: 2.23k objects, 8.1 GiB
    usage:   34 GiB used, 66 TiB / 66 TiB avail
    pgs:     278/6682 objects misplaced (4.160%)
             146 active+clean
             15  active+clean+remapped

full ceph osd df:

ceph04.ssc.wisc.edu> ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA      OMAP    META     AVAIL    %USE  VAR    PGS  STATUS
 1    hdd  0.27280   1.00000  279 GiB  1.2 GiB   889 MiB  15 KiB  289 MiB  278 GiB  0.41   8.15   18      up
27    hdd  0.27280   1.00000  279 GiB  1.3 GiB   955 MiB  17 KiB  394 MiB  278 GiB  0.47   9.33   23      up
29    hdd  0.27280   1.00000  279 GiB  1.1 GiB   873 MiB   3 KiB  253 MiB  278 GiB  0.39   7.78   19      up
31    hdd  0.27280   1.00000  279 GiB  110 MiB    22 MiB  10 KiB   88 MiB  279 GiB  0.04   0.76   18      up
28    hdd  0.81870   1.00000  838 GiB  3.3 GiB   2.4 GiB  26 MiB  890 MiB  835 GiB  0.39   7.80   58      up
33    ssd  0.36389   1.00000  373 GiB  781 MiB   750 MiB     0 B   31 MiB  372 GiB  0.20   4.05   23      up
 4    hdd  0.27280   1.00000  279 GiB  1.8 GiB   1.4 GiB   6 KiB  440 MiB  278 GiB  0.64  12.62   30      up
32    ssd  0.36389   1.00000  373 GiB  541 MiB   511 MiB     0 B   31 MiB  372 GiB  0.14   2.81   30      up
 0    hdd  2.72899   1.00000  2.7 TiB  975 MiB   673 MiB   5 KiB  302 MiB  2.7 TiB  0.03   0.67   20      up
 2    hdd  2.72899   1.00000  2.7 TiB  1.9 GiB   1.2 GiB   3 KiB  700 MiB  2.7 TiB  0.07   1.36   17      up
 3    hdd  2.72899   1.00000  2.7 TiB  1.4 GiB  1022 MiB   6 KiB  389 MiB  2.7 TiB  0.05   0.98   20      up
 6    hdd  2.72899   1.00000  2.7 TiB  109 MiB    20 MiB   2 KiB   89 MiB  2.7 TiB  0.00   0.08    6      up
 7    hdd  2.72899   1.00000  2.7 TiB  126 MiB    30 MiB   3 KiB   96 MiB  2.7 TiB  0.00   0.09   13      up
 8    hdd  2.72899   1.00000  2.7 TiB  2.4 GiB   1.8 GiB  26 MiB  595 MiB  2.7 TiB  0.09   1.71   17      up
 9    hdd  2.72899   1.00000  2.7 TiB  1.4 GiB   1.0 GiB   3 KiB  422 MiB  2.7 TiB  0.05   1.02   20      up
10    hdd  2.72899   1.00000  2.7 TiB  832 MiB   582 MiB   5 KiB  250 MiB  2.7 TiB  0.03   0.58   11      up
11    hdd  2.72899   1.00000  2.7 TiB  763 MiB   511 MiB   6 KiB  252 MiB  2.7 TiB  0.03   0.53   17      up
12    hdd  2.72899   1.00000  2.7 TiB  1.1 GiB   824 MiB   4 KiB  290 MiB  2.7 TiB  0.04   0.77   12      up
13    hdd  2.72899   1.00000  2.7 TiB  1.1 GiB   807 MiB   4 KiB  352 MiB  2.7 TiB  0.04   0.80   12      up
14    hdd  2.72899         0      0 B      0 B       0 B     0 B      0 B      0 B     0      0    1      up
15    hdd  2.72899   1.00000  2.7 TiB  728 MiB   481 MiB   3 KiB  247 MiB  2.7 TiB  0.03   0.50   11      up
16    hdd  2.72899   1.00000  2.7 TiB  1.1 GiB   835 MiB  10 KiB  322 MiB  2.7 TiB  0.04   0.80   21      up
17    hdd  2.72899   1.00000  2.7 TiB  1.1 GiB   829 MiB   4 KiB  295 MiB  2.7 TiB  0.04   0.78   16      up
18    hdd  2.72899   1.00000  2.7 TiB  1.7 GiB   1.2 GiB   7 KiB  531 MiB  2.7 TiB  0.06   1.19   16      up
19    hdd  2.72899   1.00000  2.7 TiB  1.0 GiB   728 MiB   4 KiB  322 MiB  2.7 TiB  0.04   0.73   15      up
20    hdd  2.72899   1.00000  2.7 TiB  1.1 GiB   762 MiB  10 KiB  389 MiB  2.7 TiB  0.04   0.80    8      up
21    hdd  2.72899   1.00000  2.7 TiB  155 MiB    24 MiB  26 MiB  106 MiB  2.7 TiB  0.01   0.11   14      up
22    hdd  2.72899   1.00000  2.7 TiB  1.9 GiB   1.4 GiB   3 KiB  538 MiB  2.7 TiB  0.07   1.33   13      up
23    hdd  2.72899   1.00000  2.7 TiB  101 MiB    20 MiB   2 KiB   82 MiB  2.7 TiB  0.00   0.07   12      up
24    hdd  2.72899   1.00000  2.7 TiB  547 MiB   406 MiB  15 KiB  142 MiB  2.7 TiB  0.02   0.38   12      up
25    hdd  2.72899   1.00000  2.7 TiB  1.3 GiB   938 MiB   4 KiB  408 MiB  2.7 TiB  0.05   0.93   14      up
26    hdd  2.72899   1.00000  2.7 TiB  1.1 GiB   827 MiB   4 KiB  284 MiB  2.7 TiB  0.04   0.77   10      up
                       TOTAL   66 TiB   34 GiB    24 GiB  77 MiB  9.6 GiB   66 TiB  0.05                    
MIN/MAX VAR: 0.07/12.62  STDDEV: 0.17


ceph04.ssc.wisc.edu> iostat
Linux 5.4.151-1.el8.elrepo.x86_64 (ceph04.ssc.wisc.edu)         12/02/21        _x86_64_        (24 CPU)
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.23    0.01    0.30    0.05    0.00   99.41
Device             tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sdj              15.71       189.15      1087.20  899361787 5169458151
sdb              15.79       198.68      1087.20  944678058 5169458151
sda               0.16        18.13         1.40   86216232    6675304
sdg               0.19        20.00         3.23   95095360   15357700
sdh               0.18        19.60         2.17   93197576   10306048
sdm               0.17        19.59         1.62   93166944    7715396
sdk               0.17        19.17         0.98   91160628    4653560
sdn               0.18        20.91         0.48   99404428    2294820
sdf               0.16        17.77         0.03   84473116     148432
sdl               0.17        19.59         1.29   93146740    6140328
sdd               0.17        19.19         1.85   91257788    8816344
sdi               0.17        18.87         1.07   89719756    5068364
sdc               0.18        19.94         3.76   94810840   17890524
sde               0.16        17.64         0.04   83896044     170476
md127             0.02         2.51         0.00   11919016          0
md126             0.02         2.51         0.00   11919427       3106
md125            10.63         2.99      1090.95   14231587 5187271176
md124             0.02         2.09         0.00    9935088          1
dm-0              0.09         1.58         2.17    7504152   10306048
dm-1              0.01         0.04         0.04     187900     170476
dm-3              0.03         0.38         0.48    1802652    2294820
dm-2              0.01         0.02         0.03     103212     148432
dm-5              0.07         1.02         1.62    4826480    7715396
dm-4              0.05         0.71         1.07    3364572    5068364
dm-8              0.06         1.15         1.29    5468036    6140328
dm-6              0.05         0.87         0.98    4143684    4653560
dm-7              0.09         1.73         1.85    8211404    8816344
dm-9              0.15         2.61         3.76   12426216   17890524
dm-10             0.13         2.12         3.23   10063696   15357700
dm-11             0.07         0.95         1.40    4493496    6675304

/dev/sdn is where osd.14 is running. So no, it doesn't look like much activity is occurring on that disk, compared with sdj and sdb, which hold the boot disks in RAID1.
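If it helps, extended stats for just that device can be watched with something like this (5-second sampling interval):

iostat -x sdn 5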

Lastly, I apologize, but I'm not sure how to find the logs for one specific OSD.

Zach


On 2021-12-02 2:52 PM, David Orman wrote:
Hi,

It would be good to have the full output. Does iostat show the backing device performing I/O? What does ceph -s show for cluster state? Also, can you check the logs on that OSD and see if anything looks abnormal?
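For a cephadm deployment, the per-daemon log usually lives in journald on the host running the OSD, so something along these lines should reach it:

cephadm logs --name osd.14
# or, equivalently, with your cluster's fsid filled in:
journalctl -u ceph-<fsid>@osd.14

ceph pg ls-by-osd 14 should also show which PG is still mapped to it.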

David

On Thu, Dec 2, 2021 at 1:20 PM Zach Heise (SSCC) <heise@xxxxxxxxxxxx> wrote:

Good morning David,

Unless you need to see the data for the other 31 OSDs as well, here is what osd.14 is showing:

ID  CLASS  WEIGHT   REWEIGHT  SIZE  RAW USE  DATA  OMAP  META  AVAIL  %USE  VAR  PGS  STATUS
14    hdd  2.72899         0   0 B      0 B   0 B   0 B   0 B    0 B     0    0    1      up

Zach

On 2021-12-01 5:20 PM, David Orman wrote:
What's "ceph osd df" show?

On Wed, Dec 1, 2021 at 2:20 PM Zach Heise (SSCC) <heise@xxxxxxxxxxxx> wrote:

I wanted to swap out an existing OSD, preserve its ID, remove the HDD behind it (osd.14 in this case), and give the ID of 14 to a new SSD taking its place in the same node. This is my first time doing this, so I'm not sure what to expect.

I followed the instructions here, using the --replace flag.

However, I'm a bit concerned that the operation is taking so long in my test cluster: out of 70TB in the cluster, only 40GB are in use. This is a relatively large OSD compared to most others in the cluster (2.7TB versus 300GB), and yet it's been 36 hours with the following status:

ceph04.ssc.wisc.edu> ceph orch osd rm status
OSD_ID  HOST                 STATE     PG_COUNT  REPLACE  FORCE  DRAIN_STARTED_AT                  
14      ceph04.ssc.wisc.edu  draining  1         True     True   2021-11-30 15:22:23.469150+00:00

Another note: I don't know why it shows force = true; the command I ran was just ceph orch osd rm 14 --replace, without specifying --force. Hopefully not a big deal, but still strange.

At this point, is there any way to tell whether it's still actually doing something, or whether it has hung? If it has hung, what would be the recommended way to proceed? I know I could manually eject the HDD from the chassis, run ceph osd crush remove osd.14, and then manually delete the auth keys and so on, but the documentation seems to say that none of that should be necessary if an OSD replacement goes properly.
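For reference, the manual fallback I have in mind is roughly this sequence (as I understand it from the docs):

ceph osd crush remove osd.14   # remove it from the CRUSH map
ceph auth del osd.14           # delete its auth key
ceph osd rm 14                 # remove the OSD entry itself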


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
