Hi Chris,

Yes, all pools have size=3 and min_size=2. The clients are RBD only. I shut
the node down to perform a firmware upgrade.

Kr.
Luis
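For a planned shutdown like this, a common precaution is to set the noout
flag first, so the cluster does not mark the node's OSDs out and start
rebalancing while it is down. A minimal sketch, assuming the standard ceph
CLI on an admin node ("rbd" below is only an example pool name):

---
# Read back the replication settings per pool ("rbd" is an example name):
ceph osd pool get rbd size
ceph osd pool get rbd min_size

# Before the planned reboot, keep the cluster from marking this node's
# OSDs "out" and rebalancing while it is down:
ceph osd set noout

# ... stop the OSDs, upgrade the firmware, reboot, wait for them to rejoin ...

# Afterwards, restore normal behaviour:
ceph osd unset noout
---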
On 15/07/16 09:05, Christian Balzer wrote:
Hello,

On Fri, 15 Jul 2016 00:28:37 +0200 Luis Ramirez wrote:

> Hi,
>
> I've a cluster with 3 MON nodes and 5 OSD nodes. If I reboot one of the
> OSD nodes, I get slow requests stuck "waiting for active":
>
> 2016-07-14 19:39:07.996942 osd.33 10.255.128.32:6824/7404 888 : cluster [WRN] slow request 60.627789 seconds old, received at 2016-07-14 19:38:07.369009: osd_op(client.593241.0:3283308 3.d8215fdb (undecoded) ondisk+write+known_if_redirected e11409) currently waiting for active
> 2016-07-14 19:39:07.996950 osd.33 10.255.128.32:6824/7404 889 : cluster [WRN] slow request 60.623972 seconds old, received at 2016-07-14 19:38:07.372826: osd_op(client.593241.0:3283309 3.d8215fdb (undecoded) ondisk+write+known_if_redirected e11411) currently waiting for active
> 2016-07-14 19:39:07.996958 osd.33 10.255.128.32:6824/7404 890 : cluster [WRN] slow request 240.631544 seconds old, received at 2016-07-14 19:35:07.365255: osd_op(client.593241.0:3283269 3.d8215fdb (undecoded) ondisk+write+known_if_redirected e11384) currently waiting for active
> 2016-07-14 19:39:07.996965 osd.33 10.255.128.32:6824/7404 891 : cluster [WRN] slow request 30.625102 seconds old, received at 2016-07-14 19:38:37.371697: osd_op(client.593241.0:3283315 3.d8215fdb (undecoded) ondisk+write+known_if_redirected e11433) currently waiting for active
> 2016-07-14 19:39:12.997985 osd.33 10.255.128.32:6824/7404 893 : cluster [WRN] 83 slow requests, 4 included below; oldest blocked for > 395.971587 secs
>
> The service will not recover until the node has restarted successfully.
> Could anyone shed some light on what I'm doing wrong?

First of all, do all your pools have size=3 and min_size=2?

What kind of clients does your cluster have (RBD images, CephFS, RGW)?

How do you reboot that OSD node? Normally, when you stop OSDs via their init
script or systemd unit, they are removed gracefully and clients start
re-peering right away, before any lengthy timeouts are reached.
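A minimal sketch of such a graceful stop, assuming a Jewel-era systemd
layout (the sysvinit form is the one used in the test below):

---
# Graceful stop: the monitors and peer OSDs are notified immediately, so
# clients re-peer right away instead of waiting for heartbeat timeouts.
systemctl stop ceph-osd.target   # all OSDs on this node (systemd)
systemctl stop ceph-osd@33       # or a single OSD, e.g. osd.33
service ceph stop osd            # sysvinit equivalent
---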
See this example from my test cluster. The output is from "rados bench", and
I stopped all OSDs (via "service ceph stop osd") on one node starting at
second 58. Note that shutting down each of the 4 OSDs on that node takes
about 1-2 seconds. Then we get about 10 seconds of things sorting themselves
out, after which I/O continues normally. No slow request warnings.

---
sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
 55      31      1022       991   72.0578       116    2.6782   1.74562
 56      31      1041      1010    72.128        76  0.955143   1.73901
 57      31      1066      1035   72.6166       100  0.972699   1.72883
 58      31      1084      1053   72.6058        72  0.549388   1.72471
 59      31      1100      1069   72.4597        64   0.75425   1.72927
sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
 60      31      1118      1087   72.4519        72    2.2628   1.72937
 61      31      1131      1100   72.1164        52   2.92359    1.7259
 62      31      1141      1110   71.5983        40   1.68941    1.7285
 63      31      1149      1118   70.9697        32   1.30379   1.73533
 64      31      1153      1122   70.1108        16   3.05046   1.73568
 65      31      1156      1125   69.2167        12   2.82071   1.73744
 66      31      1158      1127   68.2892         8   3.01163   1.73965
 67      31      1158      1127     67.27         0         -   1.73965
 68      31      1159      1128   66.3396         2   5.11638   1.74264
 69      31      1161      1130   65.4941         8   8.64385   1.75326
 70      31      1161      1130   64.5585         0         -   1.75326
 71      31      1161      1130   63.6492         0         -   1.75326
 72      31      1161      1130   62.7652         0         -   1.75326
 73      31      1163      1132    62.015         2   13.7002   1.77289
 74      31      1163      1132   61.1769         0         -   1.77289
 75      31      1163      1132   60.3613         0         -   1.77289
 76      31      1163      1132   59.5671         0         -   1.77289
 77      31      1163      1132   58.7935         0         -   1.77289
 78      31      1163      1132   58.0397         0         -   1.77289
 79      31      1163      1132   57.3051         0         -   1.77289
sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
 80      31      1163      1132   56.5888         0         -   1.77289
 81      31      1163      1132   55.8901         0         -   1.77289
 82      31      1163      1132   55.2086         0         -   1.77289
 83      31      1163      1132   54.5434         0         -   1.77289
 84      31      1163      1132   53.8941         0         -   1.77289
 85      31      1164      1133   53.3071  0.333333   22.5502   1.79123
 86      31      1170      1139   52.9663        24   21.7306   1.90575
 87      31      1174      1143   52.5414        16   26.7337   1.98175
 88      31      1184      1153   52.3988        40   1.92565   2.07644
 89      31      1189      1158   52.0347        20   1.12557   2.10756
 90      31      1201      1170   51.9898        48  0.767024    2.1907
 91      31      1214      1183   51.9898        52  0.652047   2.24676
 92      31      1227      1196   51.9898        52   28.9226   2.28787
 93      31      1240      1209   51.9898        52   32.7307   2.35555
 94      31      1261      1230   52.3302        84  0.482482   2.40575
 95      31      1283      1252   52.7054        88   1.31267   2.39677
 96      31      1300      1269   52.8647        68  0.796716   2.38455
---

Note that with another test via CephFS and a different "rados bench" I was
able to create some slow requests, but they cleared up very quickly and
definitely did not require any of the OSDs to be brought back up.

Christian
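For reference, output in that format comes from the rados write benchmark;
something along these lines reproduces the test (the pool name, runtime, and
thread count here are assumptions, with -t 32 picked to roughly match the
~31 in-flight ops above):

---
# Throwaway pool for benchmarking (name and PG counts are arbitrary):
ceph osd pool create bench 64 64

# 120-second write benchmark with 32 concurrent ops; stop the OSDs on one
# node partway through and watch client I/O dip and recover:
rados bench -p bench 120 write -t 32

# In a second terminal, follow cluster events (slow request warnings etc.):
ceph -w
---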