Hello,
As Krutika said, I successfully resolved all the split-brains (more than 3,450) that appeared after the first data transfer from one backup server to my new and fresh volume, but…
The next step to validate my new volume was to enable quota on it; and now, more than one day after this activation, all the results are still completely wrong. Example:

# df -h /home/sterpone_team
Filesystem                Size  Used Avail Use% Mounted on
ib-storage1:vol_home.tcp   14T  3,3T   11T  24% /home

# pdsh -w storage[1,3] du -sh /export/brick_home/brick{1,2}/data/sterpone_team
storage3: 2,5T    /export/brick_home/brick1/data/sterpone_team
storage3: 2,4T    /export/brick_home/brick2/data/sterpone_team
storage1: 2,7T    /export/brick_home/brick1/data/sterpone_team
storage1: 2,4T    /export/brick_home/brick2/data/sterpone_team

As you can read, the data for this account amounts to around 10TB, while quota reports only 3.3TB used.
Worse:

# pdsh -w storage[1,3] du -sh /export/brick_home/brick{1,2}/data/baaden_team
storage3: 2,9T    /export/brick_home/brick1/data/baaden_team
storage3: 2,7T    /export/brick_home/brick2/data/baaden_team
storage1: 3,2T    /export/brick_home/brick1/data/baaden_team
storage1: 2,8T    /export/brick_home/brick2/data/baaden_team

# df -h /home/baaden_team/
Filesystem                Size  Used Avail Use% Mounted on
ib-storage1:vol_home.tcp   20T  786G   20T   4% /home

# gluster volume quota vol_home list /baaden_team
                  Path                   Hard-limit Soft-limit   Used  Available  Soft-limit exceeded? Hard-limit exceeded?
---------------------------------------------------------------------------------------------------------------------------
/baaden_team                              20.0TB       80%    785.6GB     19.2TB                   No                   No

This account holds around 11.6TB, yet quota detects only 786GB used…
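For what it's worth, here is roughly how I compare the quota accounting with the real on-disk usage; this is only a sketch, assuming the 3.7 quota marker still stores its accounting in the trusted.glusterfs.quota.* xattrs on each brick directory:

# on each storage node, dump the quota xattrs of the directory on both bricks
# (trusted.glusterfs.quota.size holds the accounted size, as a hex value)
getfattr -d -m . -e hex /export/brick_home/brick{1,2}/data/baaden_team | grep quota

# and compare with the usage the local filesystem reports
du -sh /export/brick_home/brick{1,2}/data/baaden_team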
Can someone help me fix this? Knowing that I previously upgraded GlusterFS from 3.5.3 to 3.7.2 precisely to solve a similar problem…
For information, in the quotad log file:

[2015-07-31 22:13:00.574361] I [MSGID: 114047] [client-handshake.c:1225:client_setvolume_cbk] 0-vol_home-client-7: Server and Client lk-version numbers are not same, reopening the fds
[2015-07-31 22:13:00.574507] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-vol_home-client-7: Server lk version = 1
Is there any causal connection (client/server lk-version conflict)?
Here is what I noticed in my /var/log/glusterfs/quota-mount-vol_home.log file:

… <same kind of lines>
[2015-07-31 21:26:15.247269] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_home-client-5: changing port to 49162 (from 0)
[2015-07-31 21:26:15.250272] E [socket.c:2332:socket_connect_finish] 0-vol_home-client-5: connection to 10.0.4.2:49162 failed (Connexion refusée)
[2015-07-31 21:26:19.250545] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_home-client-5: changing port to 49162 (from 0)
[2015-07-31 21:26:19.253643] E [socket.c:2332:socket_connect_finish] 0-vol_home-client-5: connection to 10.0.4.2:49162 failed (Connexion refusée)
… <same kind of lines>

(« Connexion refusée » = « Connection refused »)
<A few minutes later:> OK, this was due to one brick being down. It's strange: since I updated GlusterFS to 3.7.x, I notice a lot of bricks going down, sometimes a few moments after starting the volume, sometimes after a couple of days or weeks… This never happened with GlusterFS versions 3.3.1 and 3.5.3.
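For the record, when a single brick dies I try to restart only the missing brick process rather than the whole volume; as far as I understand, "force" only respawns bricks that are down and leaves the running ones untouched:

# check which brick processes are online
gluster volume status vol_home

# respawn only the dead brick process(es), without touching the others
gluster volume start vol_home force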
Now I need to stop and start the volume again because I notice more hangs with "gluster volume quota …", "df", etc. Once again, I never noticed this kind of hang with the previous versions of GlusterFS I used; is it "expected"?
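If it can help the developers, I can capture state dumps while the commands are blocked; a minimal sketch, assuming the default statedump location under /var/run/gluster:

# while a "df" or "gluster volume quota … list" is hanging:
gluster volume statedump vol_home

# the dumps are then written on each brick node, e.g.:
ls /var/run/gluster/*.dump.*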
Once more: thank you very much in advance.
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
On Wed, Jul 29, 2015 at 12:44:38AM +0200, Geoffrey Letessier wrote:
OK, thank you Niels for this explanation. Now, this makes sense.
And concerning all the split-brains that appeared during the back-transfer, do you have an idea where this is coming from?
Sorry, no, I don't know how that is happening in your environment. I'll try to find someone who understands more about it and can help you with that.
Niels
Best,
Geoffrey

------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
On 29 Jul 2015, at 00:02, Niels de Vos <ndevos@xxxxxxxxxx> wrote:
On Tue, Jul 28, 2015 at 03:46:37PM +0200, Geoffrey Letessier wrote:
Hi,
In addition to all the split-brains reported, is it normal to notice thousands upon thousands (several tens, even hundreds of thousands) of broken symlinks when browsing the .glusterfs directory on each brick?
Yes, I think it is normal. A symlink points to a particular filename, possibly in a different directory. If the target file is located on a different brick, the symlink points to a non-local file.
Consider this example with two bricks in a distributed volume:
- file: README
- symlink: IMPORTANT -> README
When the distribution algorithm is done, README 'hashes' to brick-A. The symlink 'hashes' to brick-B. This means that README will be located on brick-A, and the symlink with name IMPORTANT will be located on brick-B. Because README is not on the same brick as IMPORTANT, the symlink points to the non-existing file README on brick-B.
However, when a Gluster client reads the target of the symlink IMPORTANT, the Gluster client calculates the location of README and will know that README can be found on brick-A.
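To make that concrete, a small illustration with hypothetical brick paths; inspected directly on the bricks the symlink looks broken, but through a Gluster mount it resolves fine:

# on brick-B, the symlink exists but its target is not local - this is expected:
ls -l /export/brick-B/IMPORTANT
# lrwxrwxrwx ... IMPORTANT -> README      (dangling when inspected locally)

# on brick-A, the actual file lives:
ls -l /export/brick-A/README

# through the mounted volume, everything is consistent:
cat /mnt/gluster/IMPORTANT                # reads the content of README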
I hope that makes sense?
Niels
For the moment, I have just synchronized one remote directory (around 30TB and a few million files) into my new volume. No other operations have yet been performed on files in this volume. How can I fix this? Can I delete these dead symlinks? How can I fix all my split-brains?
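If it helps, these are the policy-based resolution commands that seem to exist in the 3.7 CLI; I have not yet decided which policy is safe for my data, so take this as an illustration with a placeholder file path:

# list the files currently flagged as split-brain
gluster volume heal vol_home info split-brain

# resolve one file by keeping the bigger copy
gluster volume heal vol_home split-brain bigger-file <FILE-PATH>

# or resolve it by declaring one brick as the source
gluster volume heal vol_home split-brain source-brick storage1:/export/brick_home/brick1/data <FILE-PATH>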
Here is an example of an ls:

[root@cl-storage3 ~]# cd /export/brick_home/brick1/data/.glusterfs/7b/d2/
[root@cl-storage3 d2]# ll
total 8,7M
13706 drwx------ 2 root root 8,0K 26 juil. 17:22 .
2147483784 drwx------ 258 root root 8,0K 20 juil. 23:07 ..
2148444137 -rwxrwxrwx 2 baaden baaden_team 173K 22 mai 2008 7bd200dd-1774-4395-9065-605ae30ec18b
1559384 -rw-rw-r-- 2 tarus amyloid_team 4,3K 19 juin 2013 7bd2155c-7a05-4edc-ae77-35ed7e16afbc
287295 lrwxrwxrwx 1 root root 58 20 juil. 23:38 7bd2370a-100b-411e-89a4-d184da9f0f88 -> ../../a7/59/a759de6f-cdf5-43dd-809a-baf81d103bf7/prop-base
2149090201 -rw-rw-r-- 2 tarus amyloid_team 76K 8 mars 2014 7bd2497f-d24b-4b19-a1c5-80a4956e56a1
2148561174 -rw-r--r-- 2 tran derreumaux_team 575 14 févr. 07:54 7bd25db0-67f5-43e5-a56a-52cf8c4c60dd
1303943 -rw-r--r-- 2 tran derreumaux_team 576 10 févr. 06:06 7bd25e97-18be-4faf-b122-5868582b4fd8
1308607 -rw-r--r-- 2 tran derreumaux_team 414K 16 juin 11:05 7bd2618f-950a-4365-a753-723597ef29f5
45745 -rw-r--r-- 2 letessier admin_team 585 5 janv. 2012 7bd265c7-e204-4ee8-8717-e4a0c393fb0f
2148144918 -rw-rw-r-- 2 tarus amyloid_team 107K 28 févr. 2014 7bd26c5b-d48a-481a-9ca6-2dc27768b5ad
13705 -rw-rw-r-- 2 tarus amyloid_team 25K 4 juin 2014 7bd27e4c-46ba-4f21-a766-389bfa52fd78
1633627 -rw-rw-r-- 2 tarus amyloid_team 75K 12 mars 2014 7bd28631-90af-4c16-8ff0-c3d46d5026c6
1329165 -rw-r--r-- 2 tran derreumaux_team 175 15 juin 23:40 7bd2957e-a239-4110-b3d8-b4926c7f060b
797803 lrwxrwxrwx 2 baaden baaden_team 26 2 avril 2007 7bd29933-1c80-4c6b-ae48-e64e4da874cb -> ../divided/a7/2a7o.pdb1.gz
1532463 -rw-rw-rw- 2 baaden baaden_team 1,8M 2 nov. 2009 7bd29d70-aeb4-4eca-ac55-fae2d46ba911
1411112 -rw-r--r-- 2 sterpone sterpone_team 3,1K 2 mai 2012 7bd2a5eb-62a4-47fc-b149-31e10bd3c33d
2148865896 -rw-r--r-- 2 tran derreumaux_team 2,1M 15 juin 23:46 7bd2ae9c-18ca-471f-a54a-6e4aec5aea89
2148762578 -rw-rw-r-- 2 tarus amyloid_team 154K 11 mars 2014 7bd2b7d7-7745-4842-b7b4-400791c1d149
149216 -rw-r--r-- 2 vamparys sacquin_team 241K 17 mai 2013 7bd2ba98-6a42-40ea-87ea-acb607d73cb5
2148977923 -rwxr-xr-x 2 murail baaden_team 23K 18 juin 2012 7bd2cf57-19e7-451c-885d-fd02fd988d43
1176623 -rw-rw-r-- 2 tarus amyloid_team 227K 8 mars 2014 7bd2d92c-7ec8-4af8-9043-49d1908a99dc
1172122 lrwxrwxrwx 2 sterpone sterpone_team 61 17 avril 12:49 7bd2d96e-e925-45f0-a26a-56b95c084122 -> ../../../../../src/libs/ck-libs/ParFUM-Tops-Dev/ParFUM_TOPS.h
1385933 -rw-r--r-- 2 tran derreumaux_team 2,9M 16 juin 05:29 7bd2df54-17d2-4644-96b7-f8925a67ec1e
745899 lrwxrwxrwx 1 root root 58 22 juil. 09:50 7bd2df83-ce58-4a17-aca8-a32b71e953d4 -> ../../5c/39/5c39010f-fa77-49df-8df6-8d72cf74fd64/model_009
2149100186 -rw-rw-r-- 2 tarus amyloid_team 494K 17 mars 2014 7bd2e865-a2f4-4d90-ab29-dccebe2e3440
Best,
Geoffrey

------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
On 27 Jul 2015, at 22:57, Geoffrey Letessier <geoffrey.letessier@xxxxxxx> wrote:
Dear all,
For several weeks now (more than one month), our computing production has been stopped due to several astonishing problems with GlusterFS.
After noticing a big problem with incorrect quota sizes accounted for many, many files, I decided, under the guidance of the Gluster support team, to upgrade my storage cluster from version 3.5.3 to the latest (3.7.2-3), because these bugs are theoretically fixed in that branch. Since I performed this upgrade, it has been an incredible mess and I cannot restart production. Indeed:
1 - the RDMA protocol is not working and hangs my system / shell commands; only the TCP protocol (over InfiniBand) is more or less operational - not a blocking point, but…
2 - read/write performance is relatively low;
3 - thousands of split-brains have appeared.
So, for the moment, I believe GlusterFS 3.7 is not actually production-ready.
Concerning the third point: after destroying all my volumes (RAID re-initialization, new partitions, new GlusterFS volumes, etc.) and recreating the main one, I tried to transfer my data back from the archive/backup server into this new volume, and I noticed a lot of errors in my mount log file, as you can read in this extract:

[2015-07-26 22:35:16.962815] I [afr-self-heal-entry.c:565:afr_selfheal_entry_do] 0-vol_home-replicate-0: performing entry selfheal on 865083fa-984e-44bd-aacf-b8195789d9e0
[2015-07-26 22:35:16.965896] E [afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch] 0-vol_home-replicate-0: Gfid mismatch detected for <865083fa-984e-44bd-aacf-b8195789d9e0/job.pbs>, e944d444-66c5-40a4-9603-7c190ad86013 on vol_home-client-1 and 820f9bcc-a0f6-40e0-bcec-28a76b4195ea on vol_home-client-0. Skipping conservative merge on the file.
[2015-07-26 22:35:16.975206] I [afr-self-heal-entry.c:565:afr_selfheal_entry_do] 0-vol_home-replicate-0: performing entry selfheal on 29382d8d-c507-4d2e-b74d-dbdcb791ca65
[2015-07-26 22:35:28.719935] E [afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch] 0-vol_home-replicate-0: Gfid mismatch detected for <29382d8d-c507-4d2e-b74d-dbdcb791ca65/res_1BVK_r_u_1IBR_l_u_Cond.1IBR_l_u.1BVK_r_u.UB.global.dat.txt>, 951c5ffb-ca38-4630-93f3-8e4119ab0bd8 on vol_home-client-1 and 5ae663ca-e896-4b92-8ec5-5b15422ab861 on vol_home-client-0. Skipping conservative merge on the file.
[2015-07-26 22:35:29.764891] I [afr-self-heal-entry.c:565:afr_selfheal_entry_do] 0-vol_home-replicate-0: performing entry selfheal on 865083fa-984e-44bd-aacf-b8195789d9e0
[2015-07-26 22:35:29.768339] E [afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch] 0-vol_home-replicate-0: Gfid mismatch detected for <865083fa-984e-44bd-aacf-b8195789d9e0/job.pbs>, e944d444-66c5-40a4-9603-7c190ad86013 on vol_home-client-1 and 820f9bcc-a0f6-40e0-bcec-28a76b4195ea on vol_home-client-0. Skipping conservative merge on the file.
[2015-07-26 22:35:29.775037] I [afr-self-heal-entry.c:565:afr_selfheal_entry_do] 0-vol_home-replicate-0: performing entry selfheal on 29382d8d-c507-4d2e-b74d-dbdcb791ca65
[2015-07-26 22:35:29.776857] E [afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch] 0-vol_home-replicate-0: Gfid mismatch detected for <29382d8d-c507-4d2e-b74d-dbdcb791ca65/res_1BVK_r_u_1IBR_l_u_Cond.1IBR_l_u.1BVK_r_u.UB.global.dat.txt>, 951c5ffb-ca38-4630-93f3-8e4119ab0bd8 on vol_home-client-1 and 5ae663ca-e896-4b92-8ec5-5b15422ab861 on vol_home-client-0. Skipping conservative merge on the file.
[2015-07-26 22:35:29.800535] W [MSGID: 108008] [afr-self-heal-name.c:353:afr_selfheal_name_gfid_mismatch_check] 0-vol_home-replicate-0: GFID mismatch for <gfid:29382d8d-c507-4d2e-b74d-dbdcb791ca65>/res_1BVK_r_u_1IBR_l_u_Cond.1IBR_l_u.1BVK_r_u.UB.global.dat.txt 951c5ffb-ca38-4630-93f3-8e4119ab0bd8 on vol_home-client-1 and 5ae663ca-e896-4b92-8ec5-5b15422ab861 on vol_home-client-0
And when I try to browse some folders (still in the mount log file):

[2015-07-27 09:00:19.005763] I [afr-self-heal-entry.c:565:afr_selfheal_entry_do] 0-vol_home-replicate-0: performing entry selfheal on 2ac27442-8be0-4985-b48f-3328a86a6686
[2015-07-27 09:00:22.322316] E [afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch] 0-vol_home-replicate-0: Gfid mismatch detected for <2ac27442-8be0-4985-b48f-3328a86a6686/md0012588.gro>, 9c635868-054b-4a13-b974-0ba562991586 on vol_home-client-1 and 1943175c-b336-4b33-aa1c-74a1c51f17b9 on vol_home-client-0. Skipping conservative merge on the file.
[2015-07-27 09:00:23.008771] I [afr-self-heal-entry.c:565:afr_selfheal_entry_do] 0-vol_home-replicate-0: performing entry selfheal on 2ac27442-8be0-4985-b48f-3328a86a6686
[2015-07-27 08:59:50.359187] W [MSGID: 108008] [afr-self-heal-name.c:353:afr_selfheal_name_gfid_mismatch_check] 0-vol_home-replicate-0: GFID mismatch for <gfid:2ac27442-8be0-4985-b48f-3328a86a6686>/md0012588.gro 9c635868-054b-4a13-b974-0ba562991586 on vol_home-client-1 and 1943175c-b336-4b33-aa1c-74a1c51f17b9 on vol_home-client-0
[2015-07-27 09:00:02.500419] W [MSGID: 108008] [afr-self-heal-name.c:353:afr_selfheal_name_gfid_mismatch_check] 0-vol_home-replicate-0: GFID mismatch for <gfid:2ac27442-8be0-4985-b48f-3328a86a6686>/md0012590.gro b22aec09-2be3-41ea-a976-7b8d0e6f61f0 on vol_home-client-1 and ec100f9e-ec48-4b29-b75e-a50ec6245de6 on vol_home-client-0
[2015-07-27 09:00:02.506925] W [MSGID: 108008] [afr-self-heal-name.c:353:afr_selfheal_name_gfid_mismatch_check] 0-vol_home-replicate-0: GFID mismatch for <gfid:2ac27442-8be0-4985-b48f-3328a86a6686>/md0009059.gro 0485c093-11ca-4829-b705-e259668ebd8c on vol_home-client-1 and e83a492b-7f8c-4b32-a76e-343f984142fe on vol_home-client-0
[2015-07-27 09:00:23.001121] W [MSGID: 108008] [afr-read-txn.c:241:afr_read_txn] 0-vol_home-replicate-0: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)
[2015-07-27 09:00:26.231262] E [afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch] 0-vol_home-replicate-0: Gfid mismatch detected for <2ac27442-8be0-4985-b48f-3328a86a6686/md0012588.gro>, 9c635868-054b-4a13-b974-0ba562991586 on vol_home-client-1 and 1943175c-b336-4b33-aa1c-74a1c51f17b9 on vol_home-client-0. Skipping conservative merge on the file.
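To see which copy is which for such a GFID mismatch, I inspect the two copies of the same name directly on the replica bricks; a sketch with placeholder paths, since the real parent directory first has to be resolved through its GFID link under .glusterfs:

# on each of the two replica bricks, compare the trusted.gfid of the same file
getfattr -n trusted.gfid -e hex /export/brick_home/brickN/data/<PATH>/md0012588.gro

# the two hex values should be identical; in a GFID mismatch they differ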
And, above all, when browsing folders I get a lot of input/output errors.
Currently I have 6.2M inodes and roughly 30TB in my "new" volume.
For the moment, quota is disabled to increase IO performance during the back-transfer…
You can also find in the attachments:
- an "ls" result
- a split-brain search result
- the volume information and status
- a complete volume heal info
Hoping this can help you to help me fix all my problems and reopen computing production.
Thanks in advance, Geoffrey
PS: « Erreur d’Entrée/Sortie » = « Input / Output Error »

------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
<ls_example.txt> <split_brain__20150725.txt> <vol_home_healinfo.txt> <vol_home_info.txt> <vol_home_status.txt>