Hi,

I discovered that my cluster starts making slow requests and all disk activity gets blocked. This happens once a day, and the affected ceph-osd process goes to 100% CPU. In the Ceph health / cluster log I get something like:
2017-09-29 10:49:01.227257 [INF] pgmap v67494428: 764 pgs: 1 active+recovery_wait+degraded+inconsistent, 8 active+recovery_wait+degraded, 749 active+clean, 6 active+clean+inconsistent; 557 GB data, 1116 GB used, 3063 GB / 4179 GB avail; 1205 B/s wr, 0 op/s; 11/287452 objects degraded (0.004%)
2017-09-29 10:49:03.517776 [INF] pgmap v67494429: 764 pgs: 1 active+recovery_wait+degraded+inconsistent, 8 active+recovery_wait+degraded, 749 active+clean, 6 active+clean+inconsistent; 557 GB data, 1116 GB used, 3063 GB / 4179 GB avail; 53614 B/s wr, 1 op/s; 11/287452 objects degraded (0.004%)
2017-09-29 10:48:56.557639 [WRN] 21 slow requests, 1 included below; oldest blocked for > 143.328257 secs
2017-09-29 10:48:56.557644 [WRN] slow request 60.839383 seconds old, received at 2017-09-29 10:47:55.718212: osd_op(client.7189300.0:949334 9.f3418fe rb.0.21dc.238e1f29.000000007262 [set-alloc-hint object_size 4194304 write_size 4194304,write 3211264~20480] snapc 0=[] ondisk+write+known_if_redirected e7435) currently waiting for subops from 0
2017-09-29 10:49:03.558966 [WRN] 21 slow requests, 1 included below; oldest blocked for > 150.329585 secs
2017-09-29 10:49:03.558972 [WRN] slow request 120.752027 seconds old, received at 2017-09-29 10:47:02.806894: osd_op(client.7189300.0:949307 9.583569cb rb.0.21dc.238e1f29.000000006a66 [set-alloc-hint object_size 4194304 write_size 4194304,write 2465792~20480] snapc 0=[] ondisk+write+known_if_redirected e7435) currently waiting for subops from 0
2017-09-29 10:49:04.651490 [INF] pgmap v67494430: 764 pgs: 1 active+recovery_wait+degraded+inconsistent, 8 active+recovery_wait+degraded, 749 active+clean, 6 active+clean+inconsistent; 557 GB data, 1116 GB used, 3063 GB / 4179 GB avail; 53620 B/s wr, 1 op/s; 11/287452 objects degraded (0.004%)
2017-09-29 10:49:07.066544 [INF] pgmap v67494431: 764 pgs: 1 active+recovery_wait+degraded+inconsistent, 8 active+recovery_wait+degraded, 749 active+clean, 6 active+clean+inconsistent; 557 GB data, 1116 GB used, 3063 GB / 4179 GB avail; 11/287452 objects degraded (0.004%)
^C
root@gray-compute:~# ceph -w
    cluster 9028f4da-0d77-462b-be9b-dbdf7fa57771
     health HEALTH_ERR
            9 pgs degraded
            7 pgs inconsistent
            9 pgs recovery_wait
            37 requests are blocked > 32 sec
            recovery 11/287452 objects degraded (0.004%)
            42 scrub errors
            too many PGs per OSD (305 > max 300)
     monmap e12: 2 mons at {blue-compute=172.16.0.119:6789/0,red-compute=172.16.0.100:6789/0}
            election epoch 4966, quorum 0,1 red-compute,blue-compute
      fsmap e891: 1/1/1 up {0=blue-compute=up:active}
     osdmap e7435: 5 osds: 5 up, 5 in
            flags require_jewel_osds
      pgmap v67494432: 764 pgs, 6 pools, 557 GB data, 140 kobjects
            1116 GB used, 3063 GB / 4179 GB avail
            11/287452 objects degraded (0.004%)
                 749 active+clean
                   8 active+recovery_wait+degraded
                   6 active+clean+inconsistent
                   1 active+recovery_wait+degraded+inconsistent
2017-09-29 10:49:07.066544 [INF] pgmap v67494431: 764 pgs: 1 active+recovery_wait+degraded+inconsistent, 8 active+recovery_wait+degraded, 749 active+clean, 6 active+clean+inconsistent; 557 GB data, 1116 GB used, 3063 GB / 4179 GB avail; 11/287452 objects degraded (0.004%)
2017-09-29 10:49:08.301813 [INF] pgmap v67494432: 764 pgs: 1 active+recovery_wait+degraded+inconsistent, 8 active+recovery_wait+degraded, 749 active+clean, 6 active+clean+inconsistent; 557 GB data, 1116 GB used, 3063 GB / 4179 GB avail; 11/287452 objects degraded (0.004%)
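
Both slow requests are stuck waiting for subops from osd.0. While it is happening the host is still reachable, so I guess I could inspect that OSD with something like the following (a minimal sketch, assuming the default admin socket location; osd.0 is the OSD named in the slow request messages):

ceph health detail                    # which OSDs have blocked requests, which PGs are inconsistent
ceph daemon osd.0 dump_ops_in_flight  # run on osd.0's host: ops currently stuck inside the OSD
ceph daemon osd.0 dump_historic_ops   # run on osd.0's host: recent slow ops with per-step timings

dump_ops_in_flight should at least show which stage the blocked writes are waiting in.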
The only solution I have found is to go to the machine where that OSD lives and reboot it (see the restart sketch after the log below). This is what I see in the logs:
2017-09-29 10:50:19.800910 7f2a7eb128c0 20 -- 172.16.0.119:6802/5863 wait: stopped accepter thread
2017-09-29 10:50:19.800917 7f2a7eb128c0 20 -- 172.16.0.119:6802/5863 wait: stopping reaper thread
2017-09-29 10:50:19.800937 7f2a74f92700 10 -- 172.16.0.119:6802/5863 reaper_entry done
2017-09-29 10:50:19.801008 7f2a7eb128c0 20 -- 172.16.0.119:6802/5863 wait: stopped reaper thread
2017-09-29 10:50:19.801015 7f2a7eb128c0 10 -- 172.16.0.119:6802/5863 wait: closing pipes
2017-09-29 10:50:19.801017 7f2a7eb128c0 10 -- 172.16.0.119:6802/5863 reaper
2017-09-29 10:50:19.801019 7f2a7eb128c0 10 -- 172.16.0.119:6802/5863 reaper done
2017-09-29 10:50:19.801020 7f2a7eb128c0 10 -- 172.16.0.119:6802/5863 wait: waiting for pipes to close
2017-09-29 10:50:19.801022 7f2a7eb128c0 10 -- 172.16.0.119:6802/5863 wait: done.
2017-09-29 10:50:19.801024 7f2a7eb128c0 1 -- 172.16.0.119:6802/5863 shutdown complete.
2017-09-29 10:50:19.801026 7f2a7eb128c0 10 -- :/5863 wait: waiting for dispatch queue
2017-09-29 10:50:19.801040 7f2a7eb128c0 10 -- :/5863 wait: dispatch queue is stopped
2017-09-29 10:50:19.801042 7f2a7eb128c0 20 -- :/5863 wait: stopping reaper thread
2017-09-29 10:50:19.801056 7f2a74791700 10 -- :/5863 reaper_entry done
2017-09-29 10:50:19.801088 7f2a7eb128c0 20 -- :/5863 wait: stopped reaper thread
2017-09-29 10:50:19.801094 7f2a7eb128c0 10 -- :/5863 wait: closing pipes
2017-09-29 10:50:19.801095 7f2a7eb128c0 10 -- :/5863 reaper
2017-09-29 10:50:19.801097 7f2a7eb128c0 10 -- :/5863 reaper done
2017-09-29 10:50:19.801098 7f2a7eb128c0 10 -- :/5863 wait: waiting for pipes to close
2017-09-29 10:50:19.801099 7f2a7eb128c0 10 -- :/5863 wait: done.
2017-09-29 10:50:19.801100 7f2a7eb128c0 1 -- :/5863 shutdown complete.
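
For reference, this is what restarting only the stuck daemon looks like (a minimal sketch, assuming a systemd-based Jewel install; osd.0 is the OSD named in the slow requests):

# on the host that runs osd.0
systemctl restart ceph-osd@0

In my case only a full reboot of the machine has helped so far, and the OSD gets stuck again the next day.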