Re: Ceph OSD get blocked and start to make inconsistent pg from time to time

I'm going to assume you're already dealing with your scrub errors and have a game plan for those, since you didn't mention them in your question at all.
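If you still need to get a handle on those, the usual starting point is something along these lines, where <pgid> is just a placeholder for one of your inconsistent PGs; only run the repair once you understand why the PG is inconsistent:

    ceph health detail                                        # shows which PGs are inconsistent
    rados list-inconsistent-obj <pgid> --format=json-pretty   # needs a recent scrub of that PG
    ceph pg repair <pgid>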

One thing I'm always leery of when I see blocked requests is that the PGs might be splitting subfolders.  It is pretty much a guarantee if you're also seeing OSDs marking themselves down for short periods of time.  You can check for this by searching the OSD logs for "wrongly marked me down" (example commands below the link).  Here is a previous ML thread with some information about this.  It also goes into RBD snapshots possibly causing some problems.

https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg36923.html
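For reference, this is roughly how I'd check; the log path is the default Ceph log location and osd.0 is just an example id, so adjust both to your setup:

    grep "wrongly marked me down" /var/log/ceph/ceph-osd.*.log
    # if it is subfolder splitting, these are the filestore knobs involved
    ceph daemon osd.0 config get filestore_split_multiple
    ceph daemon osd.0 config get filestore_merge_threshold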

On Fri, Sep 29, 2017 at 5:23 AM Gonzalo Aguilar Delgado <gaguilar@xxxxxxxxxxxxxxxxxx> wrote:

Hi,

I discovered that my cluster starts to make slow requests and all disk activity gets blocked.

This happens about once a day, and the Ceph OSD goes to 100% CPU. In the ceph health output I see something like:


2017-09-29 10:49:01.227257  [INF] pgmap v67494428: 764 pgs: 1 active+recovery_wait+degraded+inconsistent, 8 active+recovery_wait+degraded, 749 active+clean, 6 active+clean+inconsistent; 557 GB data, 1116 GB used, 3063 GB / 4179 GB avail; 1205 B/s wr, 0 op/s; 11/287452 objects degraded (0.004%)
2017-09-29 10:49:03.517776  [INF] pgmap v67494429: 764 pgs: 1 active+recovery_wait+degraded+inconsistent, 8 active+recovery_wait+degraded, 749 active+clean, 6 active+clean+inconsistent; 557 GB data, 1116 GB used, 3063 GB / 4179 GB avail; 53614 B/s wr, 1 op/s; 11/287452 objects degraded (0.004%)
2017-09-29 10:48:56.557639  [WRN] 21 slow requests, 1 included below; oldest blocked for > 143.328257 secs
2017-09-29 10:48:56.557644  [WRN] slow request 60.839383 seconds old, received at 2017-09-29 10:47:55.718212: osd_op(client.7189300.0:949334 9.f3418fe rb.0.21dc.238e1f29.000000007262 [set-alloc-hint object_size 4194304 write_size 4194304,write 3211264~20480] snapc 0=[] ondisk+write+known_if_redirected e7435) currently waiting for subops from 0
2017-09-29 10:49:03.558966  [WRN] 21 slow requests, 1 included below; oldest blocked for > 150.329585 secs
2017-09-29 10:49:03.558972  [WRN] slow request 120.752027 seconds old, received at 2017-09-29 10:47:02.806894: osd_op(client.7189300.0:949307 9.583569cb rb.0.21dc.238e1f29.000000006a66 [set-alloc-hint object_size 4194304 write_size 4194304,write 2465792~20480] snapc 0=[] ondisk+write+known_if_redirected e7435) currently waiting for subops from 0
2017-09-29 10:49:04.651490  [INF] pgmap v67494430: 764 pgs: 1 active+recovery_wait+degraded+inconsistent, 8 active+recovery_wait+degraded, 749 active+clean, 6 active+clean+inconsistent; 557 GB data, 1116 GB used, 3063 GB / 4179 GB avail; 53620 B/s wr, 1 op/s; 11/287452 objects degraded (0.004%)
2017-09-29 10:49:07.066544  [INF] pgmap v67494431: 764 pgs: 1 active+recovery_wait+degraded+inconsistent, 8 active+recovery_wait+degraded, 749 active+clean, 6 active+clean+inconsistent; 557 GB data, 1116 GB used, 3063 GB / 4179 GB avail; 11/287452 objects degraded (0.004%)
^Croot@gray-compute:~# ceph -w
    cluster 9028f4da-0d77-462b-be9b-dbdf7fa57771
     health HEALTH_ERR
            9 pgs degraded
            7 pgs inconsistent
            9 pgs recovery_wait
            37 requests are blocked > 32 sec
            recovery 11/287452 objects degraded (0.004%)
            42 scrub errors
            too many PGs per OSD (305 > max 300)
     monmap e12: 2 mons at {blue-compute=172.16.0.119:6789/0,red-compute=172.16.0.100:6789/0}
            election epoch 4966, quorum 0,1 red-compute,blue-compute
      fsmap e891: 1/1/1 up {0=blue-compute=up:active}
     osdmap e7435: 5 osds: 5 up, 5 in
            flags require_jewel_osds
      pgmap v67494432: 764 pgs, 6 pools, 557 GB data, 140 kobjects
            1116 GB used, 3063 GB / 4179 GB avail
            11/287452 objects degraded (0.004%)
                 749 active+clean
                   8 active+recovery_wait+degraded
                   6 active+clean+inconsistent
                   1 active+recovery_wait+degraded+inconsistent

2017-09-29 10:49:07.066544  [INF] pgmap v67494431: 764 pgs: 1 active+recovery_wait+degraded+inconsistent, 8 active+recovery_wait+degraded, 749 active+clean, 6 active+clean+inconsistent; 557 GB data, 1116 GB used, 3063 GB / 4179 GB avail; 11/287452 objects degraded (0.004%)
2017-09-29 10:49:08.301813  [INF] pgmap v67494432: 764 pgs: 1 active+recovery_wait+degraded+inconsistent, 8 active+recovery_wait+degraded, 749 active+clean, 6 active+clean+inconsistent; 557 GB data, 1116 GB used, 3063 GB / 4179 GB avail; 11/287452 objects degraded (0.004%)



The only solution I found is to go to the machine where that OSD lives and reboot it. This is what I see in the logs:


2017-09-29 10:50:19.800910 7f2a7eb128c0 20 -- 172.16.0.119:6802/5863 wait: stopped accepter thread
2017-09-29 10:50:19.800917 7f2a7eb128c0 20 -- 172.16.0.119:6802/5863 wait: stopping reaper thread
2017-09-29 10:50:19.800937 7f2a74f92700 10 -- 172.16.0.119:6802/5863 reaper_entry done
2017-09-29 10:50:19.801008 7f2a7eb128c0 20 -- 172.16.0.119:6802/5863 wait: stopped reaper thread
2017-09-29 10:50:19.801015 7f2a7eb128c0 10 -- 172.16.0.119:6802/5863 wait: closing pipes
2017-09-29 10:50:19.801017 7f2a7eb128c0 10 -- 172.16.0.119:6802/5863 reaper
2017-09-29 10:50:19.801019 7f2a7eb128c0 10 -- 172.16.0.119:6802/5863 reaper done
2017-09-29 10:50:19.801020 7f2a7eb128c0 10 -- 172.16.0.119:6802/5863 wait: waiting for pipes  to close
2017-09-29 10:50:19.801022 7f2a7eb128c0 10 -- 172.16.0.119:6802/5863 wait: done.
2017-09-29 10:50:19.801024 7f2a7eb128c0  1 -- 172.16.0.119:6802/5863 shutdown complete.
2017-09-29 10:50:19.801026 7f2a7eb128c0 10 -- :/5863 wait: waiting for dispatch queue
2017-09-29 10:50:19.801040 7f2a7eb128c0 10 -- :/5863 wait: dispatch queue is stopped
2017-09-29 10:50:19.801042 7f2a7eb128c0 20 -- :/5863 wait: stopping reaper thread
2017-09-29 10:50:19.801056 7f2a74791700 10 -- :/5863 reaper_entry done
2017-09-29 10:50:19.801088 7f2a7eb128c0 20 -- :/5863 wait: stopped reaper thread
2017-09-29 10:50:19.801094 7f2a7eb128c0 10 -- :/5863 wait: closing pipes
2017-09-29 10:50:19.801095 7f2a7eb128c0 10 -- :/5863 reaper
2017-09-29 10:50:19.801097 7f2a7eb128c0 10 -- :/5863 reaper done
2017-09-29 10:50:19.801098 7f2a7eb128c0 10 -- :/5863 wait: waiting for pipes  to close
2017-09-29 10:50:19.801099 7f2a7eb128c0 10 -- :/5863 wait: done.
2017-09-29 10:50:19.801100 7f2a7eb128c0  1 -- :/5863 shutdown complete.



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
