Hi Krutika,

thanks for your input. I disabled client-side heal and will monitor whether it happens again!

Regards
Sebastian

From: Krutika Dhananjay [mailto:kdhananj@xxxxxxxxxx]

Could you try disabling client-side heals and see if it works for you?
Here's what you'd need to do:

# gluster volume set <VOL> entry-self-heal off
# gluster volume set <VOL> data-self-heal off
# gluster volume set <VOL> metadata-self-heal off

-Krutika

On Wed, Mar 2, 2016 at 12:37 AM, <Sebastian.Gumprich@xxxxxxxxxxxxx> wrote:

Hello everyone,

I’m experiencing high load on our glusterfs clients. Here’s the setup:

There are two glusterfs servers, nfs01 and nfs02, with the following configuration:

[root@nfs01 ~]# gluster volume info opt
Volume Name: opt
Type: Replicate
Volume ID: 5b77070f-5378-45ec-9eda-5f7dd007ff8a
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: nfs01:/opt/bkk
Brick2: nfs02:/opt/bkk
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.cache-size: 512MB
performance.cache-refresh-timeout: 10
performance.read-ahead: off
performance.write-behind-window-size: 4MB
network.ping-timeout: 2
performance.io-thread-count: 16
performance.cache-max-file-size: 2MB
performance.md-cache-timeout: 1

Then there are two clients (web01 and web02) that mount the volume via a virtual IP address (nfs-VIP):

nfs-VIP:/opt on /opt/bkk type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

The operating system on all servers is CentOS Linux release 7.2.1511 (Core).
The GlusterFS version is glusterfs 3.7.6, built on Nov 9 2015 15:20:26.

The volume holds the dynamic PHP web content of a TYPO3 CMS.

On the client (web01) the following is logged in the gluster.log:

iner_08850598886fb5f39c9cf1d269d7e20677f97ede.php>, e09948dd-1e9b-4430-8f55-3df64cda2385 on opt-client-1 and ba80a475-7b83-4c83-bd0c-798a108bfb63 on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:40:50.570040] E [MSGID: 108008] [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 0-opt-replicate-0: Gfid mismatch detected for <4c6dda77-6a2b-4996-bca4-9ace4cee45cc/News_News_layout_Detail_html_bd113d9c433c8f88376e47547db3b94e698a5ecd.php>, 739ee14c-2d5d-458b-bffd-83595bfcbe6a on opt-client-1 and 5a311733-731e-4478-ad3c-a70fbf66ba30 on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:40:50.572992] E [MSGID: 108008] [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 0-opt-replicate-0: Gfid mismatch detected for <4c6dda77-6a2b-4996-bca4-9ace4cee45cc/News_News_partial_Detail_FalMediaContainer_9c1b3fd40fca9019726b3f6b8bc04618ffadab7b.php>, bb6907a1-ce80-4e03-92df-6fbc69d24a4d on opt-client-1 and 6f88aa67-cb81-4c26-94b2-e3aaa8704e8d on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:40:50.791704] E [MSGID: 108008] [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 0-opt-replicate-0: Gfid mismatch detected for <4c6dda77-6a2b-4996-bca4-9ace4cee45cc/Powermail_Form_action_create_f40464a6a7f73d86cda514065167d59a7ddece73.php>, 5e5b224b-ea20-4d38-8504-61b24f5d6a3b on opt-client-1 and fab07af5-2aa5-4873-a6e2-6265ec78e304 on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:40:54.085964] E [MSGID: 108008] [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 0-opt-replicate-0: Gfid mismatch detected for <4c6dda77-6a2b-4996-bca4-9ace4cee45cc/News_News_action_detail_8d30b654cd8343fe40616b8a2f8a5343b1ed776e.php>, 4d75a687-b9ab-4f97-b698-38668d1981ae on opt-client-1 and 110b315e-2e28-4859-a8b9-e0f1629faa3c on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:40:56.153651] E [MSGID: 108008] [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 0-opt-replicate-0: Gfid mismatch detected for <4c6dda77-6a2b-4996-bca4-9ace4cee45cc/Powermail_Form_layout_Default_aae217b167ad82f4b1258bb01fa73f305844dbd8.php>, 6f7e2709-8c14-486a-85a2-a3cb48af4ca5 on opt-client-1 and 6ab62408-0406-4834-96b9-a51e18441d4c on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:41:05.476126] I [MSGID: 108026] [afr-self-heal-entry.c:593:afr_selfheal_entry_do] 0-opt-replicate-0: performing entry selfheal on 7a922c37-48d0-4dfb-8abb-18a435c948af
[2016-03-01 18:41:05.597093] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-opt-replicate-0: Completed entry selfheal on 7a922c37-48d0-4dfb-8abb-18a435c948af. source=1 sinks=0
[2016-03-01 18:41:05.790944] E [MSGID: 108008] [afr-self-heal-entry.c:253:afr_selfheal_detect_gfid_and_type_mismatch] 0-opt-replicate-0: Gfid mismatch detected for <4c6dda77-6a2b-4996-bca4-9ace4cee45cc/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php>, 118668d9-608a-477a-b655-bcc6c2298bf4 on opt-client-1 and a87943a4-e18a-4642-adff-1ad765496533 on opt-client-0. Skipping conservative merge on the file.
[2016-03-01 18:41:06.649695] W [MSGID: 108008] [afr-self-heal-name.c:359:afr_selfheal_name_gfid_mismatch_check] 0-opt-replicate-0: GFID mismatch for <gfid:4c6dda77-6a2b-4996-bca4-9ace4cee45cc>/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php 118668d9-608a-477a-b655-bcc6c2298bf4 on opt-client-1 and a87943a4-e18a-4642-adff-1ad765496533 on opt-client-0
[2016-03-01 18:41:06.661277] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 184415191: LOOKUP() /releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php => -1 (Input/output error)
[2016-03-01 18:41:06.680968] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 184422672: LOOKUP() /releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php => -1 (Input/output error)
[2016-03-01 18:41:06.680222] W [MSGID: 108008] [afr-self-heal-name.c:359:afr_selfheal_name_gfid_mismatch_check] 0-opt-replicate-0: GFID mismatch for <gfid:4c6dda77-6a2b-4996-bca4-9ace4cee45cc>/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php 118668d9-608a-477a-b655-bcc6c2298bf4 on opt-client-1 and a87943a4-e18a-4642-adff-1ad765496533 on opt-client-0

There are many more of these entries; this is just a small excerpt.

The files with a GFID mismatch are temporary PHP cache files. When I delete these files, the load goes down and the number of entries in the volume heal info decreases (see below).

Here’s the output of gluster volume heal opt info. Note that this output was taken *after* deleting most of the cache files; before that there were many more entries.

[root@nfs01 fluid_template]# gluster volume heal opt info
Brick nfs01:/opt/bkk
<gfid:23fc1027-0aec-4b84-9ffb-c164a9d43d20>
<gfid:92cb9dde-2721-4c11-93a6-2582ed9edd5d>
<gfid:a0dbcf8a-67f8-4870-ab57-3d5d1218601c>
<gfid:947cbcc4-1978-4b9e-b726-2acd0a4fda5a>
<gfid:440b9b36-bad5-4cb6-b935-8a004132340a>
Number of entries: 5

Brick nfs02:/opt/bkk
/releases/1.0.1/typo3temp/Cache/Code/fluid_template - Possibly undergoing heal
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Select_7bf809152d985037de761d8d375d286e44b4f13a.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Misc_GoogleAdwordsConversion_f7254aeb252ea43cd89f9051b5a43109d47938f1.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_File_4b3a3f667c475577847aa77118f3af5666ecb2c6.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_action_form_f0755f8526150f023fd98252b510a40c49586dbd.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Input_b3e08744b23680f0a14e60e716f9994d9580e3f4.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/News_News_action_detail_8d30b654cd8343fe40616b8a2f8a5343b1ed776e.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/News_News_layout_Detail_html_bd113d9c433c8f88376e47547db3b94e698a5ecd.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/News_News_partial_Detail_Opengraph_b98680f3686dccf00e22181e66d11ca9de7a44bd.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/News_News_partial_Detail_FalMediaContainer_9c1b3fd40fca9019726b3f6b8bc04618ffadab7b.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/News_News_partial_Detail_MediaContainer_08850598886fb5f39c9cf1d269d7e20677f97ede.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Text_c10766db8d335d5cd9555878aef5d886dcb6926e.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_action_create_f40464a6a7f73d86cda514065167d59a7ddece73.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_layout_Default_aae217b167ad82f4b1258bb01fa73f305844dbd8.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Misc_HoneyPod_fc83c414f744612c3cb44c8827372a30f17791d0.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Textarea_39d24d8e3e2813636dfff2a89b7cefb8e9117c97.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Submit_86e69c50ccebf20584db2e3c74859373c53d320f.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Check_a2a11c64ac58dab16eab29e4cda88518c15a4d25.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Misc_FormError_7cade8e8fc1d23c761360c0efbe8cb145eed2e39.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Radio_d038a263b5ea81f0f7795e1e47516e5e2937cbd9.php
/releases/1.0.1/typo3temp/Cache/Code/fluid_template/Powermail_Form_partial_Form_Hidden_a7651f5498e0d36b4e2eae5fca015ef0e9365067.php
Number of entries: 21

Here are some heal-infos during the high load:

Starting time of crawl: Tue Mar 1 19:09:45 2016
Ending time of crawl: Tue Mar 1 19:09:52 2016
Type of crawl: INDEX
No. of entries healed: 2
No. of entries in split-brain: 0
No. of heal failed entries: 168

And here’s the performance monitoring info during 60 seconds of high load:

Brick: nfs01:/opt/bkk
------------------------------
Cumulative Stats:
   Block Size:         1b+         2b+         4b+
 No. of Reads:          18          33         308
No. of Writes:          64          88        2994

   Block Size:         8b+        16b+        32b+
 No. of Reads:         117         154       15612
No. of Writes:         370         369        1432

   Block Size:        64b+       128b+       256b+
 No. of Reads:        3721       12884       19917
No. of Writes:        7585      900221      135011

   Block Size:       512b+      1024b+      2048b+
 No. of Reads:       23929       12251       19835
No. of Writes:       63067       30950       23540

   Block Size:      4096b+      8192b+     16384b+
 No. of Reads:        9096        9449        5566
No. of Writes:       40455       36397       13926

   Block Size:     32768b+     65536b+    131072b+
 No. of Reads:        5159        6055       34001
No. of Writes:       20722        6600       12762

%-latency  Avg-latency  Min-Latency  Max-Latency  No. of calls  Fop
---------  -----------  -----------  -----------  ------------  ----
     0.00      0.00 us      0.00 us      0.00 us        212065  FORGET
     0.00      0.00 us      0.00 us      0.00 us       4118713  RELEASE
     0.00      0.00 us      0.00 us      0.00 us       9931097  RELEASEDIR
     0.00     47.00 us     47.00 us     47.00 us             1  GETXATTR
     0.00    124.00 us    124.00 us    124.00 us             1  XATTROP
     0.00    144.00 us    144.00 us    144.00 us             1  UNLINK
     0.00     37.17 us     35.00 us     41.00 us             6  STATFS
     0.01     36.84 us     32.00 us     61.00 us            19  FSTAT
     0.01     48.80 us     44.00 us     62.00 us            20  STAT
     0.05     75.52 us     54.00 us    156.00 us            48  REMOVEXATTR
     0.05     81.69 us     69.00 us    146.00 us            48  SETATTR
     0.05     37.16 us     14.00 us    436.00 us           109  FLUSH
     0.05     38.31 us     18.00 us    115.00 us           108  FINODELK
     0.06     73.16 us     46.00 us    170.00 us            63  OPEN
     0.10     38.70 us     20.00 us    167.00 us           195  INODELK
     0.11    335.37 us     40.00 us    563.00 us            27  READDIR
     0.15     50.58 us     27.00 us    392.00 us           232  OPENDIR
     0.15     81.74 us     35.00 us    215.00 us           144  FXATTROP
     0.16    130.53 us     68.00 us    697.00 us           100  WRITE
     0.93   1532.04 us    171.00 us  15356.00 us            48  CREATE
     1.40    284.90 us     25.00 us   1146.00 us           390  READDIRP
    10.20     52.29 us     28.00 us   2323.00 us         15482  READLINK
    18.94     33.25 us     11.00 us  27242.00 us         45219  ENTRYLK
    67.58     93.74 us     32.00 us  27521.00 us         57223  LOOKUP

Duration: 6492593 seconds
Data Read: 5679496942 bytes
Data Written: 4510536316 bytes

Interval 1 Stats:
   Block Size:      4096b+      8192b+     16384b+
 No. of Reads:           0           0           0
No. of Writes:           1          50           2

   Block Size:     32768b+
 No. of Reads:           0
No. of Writes:          47

%-latency  Avg-latency  Min-Latency  Max-Latency  No. of calls  Fop
---------  -----------  -----------  -----------  ------------  ----
     0.00      0.00 us      0.00 us      0.00 us            48  FORGET
     0.00      0.00 us      0.00 us      0.00 us           104  RELEASE
     0.00      0.00 us      0.00 us      0.00 us           231  RELEASEDIR
     0.00    124.00 us    124.00 us    124.00 us             1  XATTROP
     0.00    144.00 us    144.00 us    144.00 us             1  UNLINK
     0.00     36.40 us     35.00 us     37.00 us             5  STATFS
     0.01     36.84 us     32.00 us     61.00 us            19  FSTAT
     0.01     48.80 us     44.00 us     62.00 us            20  STAT
     0.05     75.52 us     54.00 us    156.00 us            48  REMOVEXATTR
     0.05     37.69 us     14.00 us    436.00 us           101  FLUSH
     0.06     81.69 us     69.00 us    146.00 us            48  SETATTR
     0.06     74.26 us     46.00 us    170.00 us            53  OPEN
     0.06     38.31 us     18.00 us    115.00 us           108  FINODELK
     0.10    311.33 us     40.00 us    563.00 us            24  READDIR
     0.11     38.70 us     20.00 us    167.00 us           195  INODELK
     0.16     50.58 us     27.00 us    392.00 us           231  OPENDIR
     0.17     81.74 us     35.00 us    215.00 us           144  FXATTROP
     0.18    130.53 us     68.00 us    697.00 us           100  WRITE
     1.03   1532.04 us    171.00 us  15356.00 us            48  CREATE
     1.56    284.90 us     25.00 us   1146.00 us           390  READDIRP
    11.31     52.27 us     28.00 us   2323.00 us         15395  READLINK
    18.24     33.67 us     11.00 us  27242.00 us         38571  ENTRYLK
    66.83     94.49 us     32.00 us  27521.00 us         50338  LOOKUP

Duration: 68 seconds
Data Read: 0 bytes
Data Written: 3347998 bytes

Brick: nfs02:/opt/bkk
------------------------------
Cumulative Stats:
   Block Size:         1b+         2b+         4b+
 No. of Reads:          26          49         541
No. of Writes:          64          94        3848

   Block Size:         8b+        16b+        32b+
 No. of Reads:         218         205        1267
No. of Writes:         452         417        1448

   Block Size:        64b+       128b+       256b+
 No. of Reads:        6097       39042       11111
No. of Writes:        8617      924503      136768

   Block Size:       512b+      1024b+      2048b+
 No. of Reads:      120819       37802       16506
No. of Writes:       64399       35996       24999

   Block Size:      4096b+      8192b+     16384b+
 No. of Reads:       76162       20449       10948
No. of Writes:       41302       37488       14034

   Block Size:     32768b+     65536b+    131072b+
 No. of Reads:        7733        7306       31648
No. of Writes:       20849        6750       12886

%-latency  Avg-latency  Min-Latency  Max-Latency  No. of calls  Fop
---------  -----------  -----------  -----------  ------------  ----
     0.00      0.00 us      0.00 us      0.00 us        231622  FORGET
     0.00      0.00 us      0.00 us      0.00 us       6123626  RELEASE
     0.00      0.00 us      0.00 us      0.00 us      10869781  RELEASEDIR
     0.00    116.00 us    116.00 us    116.00 us             1  XATTROP
     0.00     40.00 us     38.00 us     43.00 us             6  STATFS
     0.01     40.85 us     33.00 us     97.00 us            13  FSTAT
     0.01     46.26 us     29.00 us    100.00 us            23  STAT
     0.04     73.96 us     53.00 us    150.00 us            48  REMOVEXATTR
     0.04     76.85 us     61.00 us     99.00 us            48  SETATTR
     0.04     35.83 us     28.00 us    155.00 us           103  UNLINK
     0.04     36.28 us     15.00 us    142.00 us           112  FINODELK
     0.05     38.68 us     13.00 us    220.00 us           133  FLUSH
     0.09    324.11 us     28.00 us    589.00 us            28  READDIR
     0.12     85.88 us     36.00 us    215.00 us           144  FXATTROP
     0.12    124.55 us     76.00 us    192.00 us           100  WRITE
     0.13     78.76 us     19.00 us    529.00 us           161  GETXATTR
     0.20     78.65 us     43.00 us    384.00 us           261  OPEN
     0.23     54.09 us      2.00 us    260.00 us           426  OPENDIR
     0.60   1261.23 us    174.00 us  10655.00 us            48  CREATE
     0.66     81.15 us     17.00 us   9254.00 us           819  INODELK
     1.21    279.97 us     23.00 us   1587.00 us           434  READDIRP
     8.10     52.61 us     27.00 us   1283.00 us         15496  READLINK
    15.48     34.54 us     10.00 us  13810.00 us         45133  ENTRYLK
    72.84    104.30 us     14.00 us  14613.00 us         70322  LOOKUP

Duration: 6492593 seconds
Data Read: 6308054987 bytes
Data Written: 4579768980 bytes

Interval 1 Stats:
   Block Size:      4096b+      8192b+     16384b+
 No. of Reads:           0           0           0
No. of Writes:           1          50           2

   Block Size:     32768b+
 No. of Reads:           0
No. of Writes:          47

%-latency  Avg-latency  Min-Latency  Max-Latency  No. of calls  Fop
---------  -----------  -----------  -----------  ------------  ----
     0.00      0.00 us      0.00 us      0.00 us            48  FORGET
     0.00      0.00 us      0.00 us      0.00 us           286  RELEASE
     0.00      0.00 us      0.00 us      0.00 us           400  RELEASEDIR
     0.00    116.00 us    116.00 us    116.00 us             1  XATTROP
     0.00     40.40 us     38.00 us     43.00 us             5  STATFS
     0.01     40.85 us     33.00 us     97.00 us            13  FSTAT
     0.01     46.14 us     29.00 us    100.00 us            22  STAT
     0.04     73.96 us     53.00 us    150.00 us            48  REMOVEXATTR
     0.04     76.85 us     61.00 us     99.00 us            48  SETATTR
     0.04     35.83 us     28.00 us    155.00 us           103  UNLINK
     0.05     36.28 us     15.00 us    142.00 us           112  FINODELK
     0.05     39.64 us     13.00 us    220.00 us           117  FLUSH
     0.10    330.28 us     28.00 us    589.00 us            25  READDIR
     0.14     85.88 us     36.00 us    215.00 us           144  FXATTROP
     0.14    124.55 us     76.00 us    192.00 us           100  WRITE
     0.14     79.13 us     19.00 us    529.00 us           159  GETXATTR
     0.22     79.51 us     43.00 us    384.00 us           235  OPEN
     0.25     54.27 us      2.00 us    260.00 us           400  OPENDIR
     0.70   1261.23 us    174.00 us  10655.00 us            48  CREATE
     0.77     81.15 us     17.00 us   9254.00 us           819  INODELK
     1.26    283.31 us     23.00 us   1587.00 us           386  READDIRP
     7.67     52.88 us     27.00 us   1283.00 us         12595  READLINK
    15.47     34.83 us     10.00 us  13810.00 us         38576  ENTRYLK
    72.91    105.05 us     14.00 us  14613.00 us         60295  LOOKUP

Duration: 68 seconds
Data Read: 0 bytes
Data Written: 3347998 bytes

Can anybody tell me how to fix the problem with the high load and these cache files?

Thanks in advance!

Regards
Sebastian
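
Note on the client log quoted in this message: the "Gfid mismatch detected" entries all follow one pattern, so the affected cache files can be listed and counted with a short script instead of reading the log by hand. The following is a minimal Python sketch and not part of the original thread. The function names are hypothetical, the log file path is an assumption supplied on the command line, and the regular expression is derived only from the log lines quoted here, so other GlusterFS versions may need a different pattern.

```python
import re
import sys
from collections import Counter

# Pattern derived from the "Gfid mismatch detected for <parent/name>, A on X
# and B on Y." entries quoted in this thread (assumption: other versions of
# GlusterFS may word this message differently).
MISMATCH_RE = re.compile(
    r"Gfid mismatch detected for "
    r"<(?P<parent>[0-9a-f-]{36})/(?P<name>[^>]+)>, "
    r"(?P<gfid_a>[0-9a-f-]{36}) on (?P<sub_a>\S+) and "
    r"(?P<gfid_b>[0-9a-f-]{36}) on (?P<sub_b>[^\s.]+)"
)


def find_gfid_mismatches(log_text):
    """Return one dict per 'Gfid mismatch detected' entry in log_text."""
    return [m.groupdict() for m in MISMATCH_RE.finditer(log_text)]


def count_by_name(mismatches):
    """Count how often each file name was reported with a mismatch."""
    return Counter(m["name"] for m in mismatches)


if __name__ == "__main__":
    # Hypothetical usage: pass the path of the client log as first argument.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        found = find_gfid_mismatches(handle.read())
    for name, hits in count_by_name(found).most_common():
        print(f"{hits:6d}  {name}")
```

Run against the client log, this prints each reported file name with the number of times it was logged, which gives a quick list of the cache files to inspect on both bricks.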
_______________________________________________ Gluster-users mailing list Gluster-users@xxxxxxxxxxx http://www.gluster.org/mailman/listinfo/gluster-users