Hi Ravi, thanks for replying.

I've checked all bricks for their respective gfid files, but either the files don't exist or getfattr produces no output.

I've also found that the gfid lists shown for the 3 bricks that stayed up contain the same entries, albeit not all in the same order. On top of that, which files exist and which are missing is identical across those 3 bricks.

I created a "gfids" file in /root on each brick, containing that brick's <gfid:*> lines as reported by "gluster volume heal callrec info". The output of "md5sum gfids" differed between the 3 "up" bricks, but the output of "sort gfids | md5sum" was the same.

I then wrote this Perl script (gfid-to-fattr.pl):

#!/usr/bin/perl
use strict;
use warnings;

while (<>) {
    # Extract the gfid; $2 and $3 are its first two byte pairs, which
    # form the two directory levels under .glusterfs.
    next unless /gfid:((\w\w)(\w\w)[\w\-]+)/;
    my $path = "/data/brick/callrec/.glusterfs/$2/$3/$1";
    print "$path\t: ";
    if (-f $path) {
        chomp(my @output = `getfattr $path`);
        @output = ('no xattrs') unless @output;
        print map { "$_\n" } @output;
    } else {
        print "file not found\n";
    }
}

The following command then produced the same output on each brick:

sort gfids | perl gfid-to-fattr.pl | md5sum

The full output:

[root@gluster2b-1 ~]# perl gfid-to-fattr.pl < gfids
/data/brick/callrec/.glusterfs/d6/e9/d6e91662-a395-42de-8af3-ba6164eb0f90 : file not found
/data/brick/callrec/.glusterfs/09/87/09872041-bc68-4eee-9c4d-ef11e4137f66 : no xattrs
/data/brick/callrec/.glusterfs/48/85/4885e073-75fd-461a-a70e-1e9578ef99cd : file not found
/data/brick/callrec/.glusterfs/0d/44/0d4431f1-10b1-4339-96a9-d8370bbd77a7 : no xattrs
/data/brick/callrec/.glusterfs/8d/05/8d059d2b-9e98-482b-aeac-e5decd126fe7 : file not found
/data/brick/callrec/.glusterfs/b2/3a/b23a64dd-4522-408c-8465-4096c2977da4 : no xattrs
/data/brick/callrec/.glusterfs/27/63/27632fb9-ab59-45cb-bb7a-5c78697619ea : file not found
/data/brick/callrec/.glusterfs/8d/f8/8df8af4d-38cf-4cbd-b68e-4a6f2d41b308 : no xattrs
/data/brick/callrec/.glusterfs/5c/b5/5cb576bb-8389-4f34-b812-5a770e4c0013 : file not found
/data/brick/callrec/.glusterfs/65/39/6539988f-2fe6-4500-aa7c-a7cd7cfcc5df : no xattrs
/data/brick/callrec/.glusterfs/3d/cc/3dcce45c-8c06-45f6-8101-a2e9737441c5 : file not found
/data/brick/callrec/.glusterfs/de/26/de268a4d-06fd-479d-8efc-954002ff7222 : no xattrs
/data/brick/callrec/.glusterfs/c2/37/c2370b73-7c1c-4f94-afcf-74755b9a68a0 : file not found
/data/brick/callrec/.glusterfs/fa/7b/fa7bbd79-3394-49b1-96c1-0bf7d3b3138d : file not found
/data/brick/callrec/.glusterfs/88/0c/880cd05b-46e5-4cf4-b042-bdb729acdd56 : no xattrs
/data/brick/callrec/.glusterfs/1b/ad/1badcc6d-52e5-42ca-a42a-3877415cf21e : file not found
/data/brick/callrec/.glusterfs/45/93/459313b5-7ac8-43c6-a3b8-3a8e80afcebd : no xattrs
/data/brick/callrec/.glusterfs/db/53/db53921d-ca03-4c97-95cd-9fa4b514d585 : file not found
/data/brick/callrec/.glusterfs/87/81/8781eb74-7409-4b1b-aa69-e30bf9c2387d : no xattrs
/data/brick/callrec/.glusterfs/21/c1/21c1bcc5-4c49-4010-9b4f-e920634764be : file not found
/data/brick/callrec/.glusterfs/70/0e/700e2b0a-e6c0-4888-a5ea-f52cac55e770 : no xattrs
/data/brick/callrec/.glusterfs/7a/1b/7a1b72f5-ca99-4ede-aacd-ba1b867959c0 : file not found
/data/brick/callrec/.glusterfs/ac/75/ac75a8e5-0906-40ea-a2ab-145f3cfcce2e : no xattrs
/data/brick/callrec/.glusterfs/c6/03/c603b955-305b-421e-9056-6a8256c25f88 : file not found
/data/brick/callrec/.glusterfs/81/e9/81e9ab2f-3f36-4a1f-ba2a-86577501c5db : file not found
/data/brick/callrec/.glusterfs/3f/9f/3f9fcead-ffbe-4dd0-9dcf-8e815dfdc5b4 : no xattrs
/data/brick/callrec/.glusterfs/09/12/0912f760-51ce-48bd-93b1-fffbebd965a1 : file not found
/data/brick/callrec/.glusterfs/66/a4/66a4001b-9456-44e2-b547-5036383500d8 : no xattrs
/data/brick/callrec/.glusterfs/8b/ce/8bcece86-49af-4d88-83f8-8672dbdc5ed1 : file not found
/data/brick/callrec/.glusterfs/f6/c7/f6c76172-c73b-4cf0-ad8a-00a6f9c6d7d2 : no xattrs
/data/brick/callrec/.glusterfs/b1/db/b1db87aa-c1fa-409b-aa10-429860c20dbe : no xattrs
/data/brick/callrec/.glusterfs/e9/7c/e97cb502-924d-4ea8-9730-f84eda0b69fd : no xattrs
/data/brick/callrec/.glusterfs/e7/21/e721ce14-9201-4ec4-a027-bbeef43ab401 : file not found
/data/brick/callrec/.glusterfs/a6/02/a6024ac9-fdde-4517-87e8-ede5845f3bb3 : file not found
/data/brick/callrec/.glusterfs/bc/3c/bc3c227c-a68e-4b9e-a81f-ce14aa30a504 : file not found
/data/brick/callrec/.glusterfs/a1/94/a194006c-b0c6-49e9-a2ec-3613403f869f : no xattrs
/data/brick/callrec/.glusterfs/c6/10/c610a4ae-1bbb-464d-8ef3-4c50a64f1110 : file not found
/data/brick/callrec/.glusterfs/66/4f/664f979e-5acc-42f1-93be-9a46d783d430 : no xattrs
/data/brick/callrec/.glusterfs/41/97/4197cd59-1cec-452e-8c3c-c212ea9f17f6 : no xattrs
/data/brick/callrec/.glusterfs/1e/2b/1e2b6e9e-8d19-48d8-9488-4d3f78e38213 : file not found
/data/brick/callrec/.glusterfs/0b/95/0b95c327-d880-4e93-9432-3fd205b46ca4 : file not found
/data/brick/callrec/.glusterfs/ae/44/ae447d7b-04b9-414e-ada6-d1880a5c6555 : no xattrs
/data/brick/callrec/.glusterfs/a2/9c/a29c32b6-e9aa-4e51-8b07-311bb7512d89 : no xattrs
/data/brick/callrec/.glusterfs/a1/a4/a1a4c18d-8807-4a43-b9d2-99cf33a02c03 : file not found
/data/brick/callrec/.glusterfs/93/dc/93dcfe76-318c-43c3-908c-0332201387f4 : file not found
/data/brick/callrec/.glusterfs/4d/6b/4d6be1c5-2282-441f-97e1-71c8c1e42aa1 : file not found
/data/brick/callrec/.glusterfs/e4/ca/e4ca593f-73f5-4d9f-b84f-16168b4f84f1 : file not found
/data/brick/callrec/.glusterfs/9d/80/9d80ae3c-833d-4a7c-81fc-b4ba9e645d5f : no xattrs
/data/brick/callrec/.glusterfs/c7/82/c78213fd-a60f-41fe-8471-6c8a92fdd873 : file not found
/data/brick/callrec/.glusterfs/7c/e8/7ce8062a-37fd-4388-aa71-42e3f1125d20 : no xattrs
/data/brick/callrec/.glusterfs/85/ed/85ed9df0-64e1-4d93-8572-a5506c2c2d01 : file not found
/data/brick/callrec/.glusterfs/3b/c9/3bc95362-be4f-4561-b95f-4cd1260c1781 : no xattrs
/data/brick/callrec/.glusterfs/18/e2/18e2f17d-801f-428d-98c1-c35f7ac8a68d : no xattrs
/data/brick/callrec/.glusterfs/70/ea/70eab476-d53d-4f2b-b56f-13cf12821a24 : no xattrs
/data/brick/callrec/.glusterfs/0b/07/0b07c592-f784-4d08-a3a2-3a0de0df666e : file not found
/data/brick/callrec/.glusterfs/1c/62/1c623eb2-0594-48bf-accb-0c7b0a3a530a : file not found
/data/brick/callrec/.glusterfs/18/81/1881cc27-09ab-453e-b89c-afcce7aab5ea : file not found
/data/brick/callrec/.glusterfs/3a/50/3a50792e-0ce9-49a2-99d5-d59d1fd7e1a9 : no xattrs
/data/brick/callrec/.glusterfs/9c/f4/9cf45591-c304-4348-b6ce-9dc9f8f335b1 : no xattrs
/data/brick/callrec/.glusterfs/f3/4d/f34d746e-1555-4b54-9ebd-2bfe7c96b0ef : no xattrs
/data/brick/callrec/.glusterfs/84/d3/84d3674c-6ec4-43b7-b590-8a125cbfbe43 : no xattrs
/data/brick/callrec/.glusterfs/21/f7/21f767b1-b5f3-4441-8f0e-ac852d2cdd25 : file not found
/data/brick/callrec/.glusterfs/f7/d3/f7d3ef0f-3c01-412e-93fa-704171267d9e : no xattrs
/data/brick/callrec/.glusterfs/33/10/331058cb-2fcf-4198-910f-5c7b78100807 : file not found
/data/brick/callrec/.glusterfs/2f/ea/2feaebdb-6eba-4e01-80ae-0812e5c770c6 : file not found
/data/brick/callrec/.glusterfs/fa/68/fa685010-3cd1-4b18-bc37-7db6c2e543ee : file not found
/data/brick/callrec/.glusterfs/40/3c/403c0233-432f-42d8-8ed0-eeca68c4b3f1 : no xattrs
/data/brick/callrec/.glusterfs/9e/6c/9e6c8760-48e1-4b84-9ad4-e6a33c881b6c : file not found
/data/brick/callrec/.glusterfs/ef/58/ef583ff4-72b3-408d-bf34-88ca5534c71e : no xattrs
/data/brick/callrec/.glusterfs/c3/41/c3419ad6-3e32-4fc5-93da-2b04a6090cfa : no xattrs
/data/brick/callrec/.glusterfs/ea/a4/eaa43674-b1a3-4833-a946-de7b7121bb88 : file not found
/data/brick/callrec/.glusterfs/79/3a/793a81de-73e8-4d84-8ac1-be03c5ac2c47 : no xattrs
/data/brick/callrec/.glusterfs/e1/d9/e1d92bdc-0fe9-4531-9fc0-0c9e227721c0 : file not found
/data/brick/callrec/.glusterfs/d6/c6/d6c6e6b4-d476-4b9b-990d-b79e66c490bf : file not found
/data/brick/callrec/.glusterfs/25/af/25af9c71-aab5-4f39-bd4a-bb4f0dab9342 : no xattrs
/data/brick/callrec/.glusterfs/0d/16/0d1671b4-e31f-4c81-8600-fe63ffc84272 : no xattrs
/data/brick/callrec/.glusterfs/eb/87/eb87548b-a90a-458d-b215-939ad59f5ec0 : no xattrs
/data/brick/callrec/.glusterfs/9a/2c/9a2c64de-a948-4a45-9466-58a0fb556ba4 : file not found
/data/brick/callrec/.glusterfs/26/be/26bef74c-9675-4c74-96d0-327310cb0983 : file not found
/data/brick/callrec/.glusterfs/9c/62/9c6254df-e8dd-40a3-a7d1-760f6c19027a : no xattrs
/data/brick/callrec/.glusterfs/b1/13/b1138fe2-34cf-4d0d-af99-3b9d9aec0317 : no xattrs
/data/brick/callrec/.glusterfs/52/2b/522b0117-7d22-4e94-b767-d62a5c914a62 : file not found
/data/brick/callrec/.glusterfs/4a/3e/4a3ed044-845f-438f-8d8a-cac1c0d853f0 : file not found
/data/brick/callrec/.glusterfs/2b/bb/2bbb8529-4874-440d-959b-6a0745fdfda9 : no xattrs
/data/brick/callrec/.glusterfs/94/b5/94b5924f-65b9-45d6-8a03-b7084f8c4bdb : no xattrs
/data/brick/callrec/.glusterfs/16/fd/16fd1032-3bd5-474a-b30e-85971d69aaa9 : no xattrs
/data/brick/callrec/.glusterfs/ef/bf/efbfc625-c42c-4d5e-a22a-173f84362f24 : no xattrs
/data/brick/callrec/.glusterfs/55/4c/554ce007-a0bd-437b-8004-d70d110b5acc : file not found
/data/brick/callrec/.glusterfs/7a/d4/7ad41386-9d57-4a93-9fc2-3354e40d9927 : no xattrs
/data/brick/callrec/.glusterfs/32/2c/322c7748-459c-437e-bfba-6f9096a938c5 : file not found
/data/brick/callrec/.glusterfs/f4/8b/f48beb09-975f-4ff6-843e-d8906d3b21b3 : no xattrs
/data/brick/callrec/.glusterfs/5c/f3/5cf36470-8d7e-4e26-904c-39a15d773b43 : file not found
/data/brick/callrec/.glusterfs/75/c6/75c6e98b-7700-46f0-8be4-e897f969a5df : no xattrs
/data/brick/callrec/.glusterfs/d3/f5/d3f58efb-d1ac-42ab-957c-d7b684bf0972 : file not found
/data/brick/callrec/.glusterfs/8f/3a/8f3a7de0-173a-4507-9b46-bc9db0a6bc41 : no xattrs
/data/brick/callrec/.glusterfs/ca/5b/ca5b027b-dc2f-468f-8cb0-131f1a9099d8 : file not found
/data/brick/callrec/.glusterfs/19/29/19291239-e44c-4025-9f74-af7431aac6b9 : no xattrs
/data/brick/callrec/.glusterfs/28/07/2807b990-2d3c-4f82-b89b-36637cc7c181 : file not found
/data/brick/callrec/.glusterfs/3d/f8/3df80623-6bf4-47e3-a379-4e5605d0eda6 : no xattrs
/data/brick/callrec/.glusterfs/24/72/247283c9-f427-43ce-869a-895e31a9e891 : file not found
/data/brick/callrec/.glusterfs/a4/74/a4745c5f-a3ad-4bed-a504-5ae5b31200a3 : no xattrs
/data/brick/callrec/.glusterfs/d1/4c/d14c8834-a624-4288-8200-5b42fa165043 : file not found
/data/brick/callrec/.glusterfs/53/75/5375e1e8-287f-4fc5-8b01-ec2f856eebf7 : no xattrs
/data/brick/callrec/.glusterfs/fe/81/fe814468-5a2c-469c-89de-9a96b9aacbc1 : file not found
/data/brick/callrec/.glusterfs/50/78/50787eb7-90ad-41fb-bd5e-3ba058b69c32 : no xattrs
/data/brick/callrec/.glusterfs/34/49/3449d319-e2d9-4444-8823-8f7351c1d45f : file not found
/data/brick/callrec/.glusterfs/9e/a0/9ea01327-d5ed-48bd-8049-cc552555b774 : no xattrs
/data/brick/callrec/.glusterfs/45/ae/45aed765-2ae2-4fc8-9982-8e1b5d6c19d4 : no xattrs
/data/brick/callrec/.glusterfs/70/b7/70b7db21-b70f-4ae0-8660-921c3194f209 : no xattrs
/data/brick/callrec/.glusterfs/9f/ce/9fce8690-a9f7-4cbb-b13a-ba149af652b4 : file not found
/data/brick/callrec/.glusterfs/f1/2b/f12b4d8e-b5e6-4f71-a06c-0c0d067a7eb9 : no xattrs
/data/brick/callrec/.glusterfs/45/5c/455c2911-7a9d-401f-9ded-4380c6cec405 : no xattrs
/data/brick/callrec/.glusterfs/ef/bd/efbd6028-b9f8-45de-a653-b27f75570b81 : no xattrs
/data/brick/callrec/.glusterfs/f7/0b/f70b50fb-3061-4eb9-94f0-85fb2d789d27 : no xattrs
/data/brick/callrec/.glusterfs/33/8b/338baad1-6191-47f3-9737-dca2daf79fd8 : no xattrs
/data/brick/callrec/.glusterfs/a9/e1/a9e1cad9-cd31-48f0-b9f8-322e1f602401 : no xattrs
/data/brick/callrec/.glusterfs/7d/52/7d522a0b-75fa-4b82-a0a2-4a81c98eea03 : file not found
/data/brick/callrec/.glusterfs/1b/a8/1ba8d90a-5f0c-4387-a79a-592427f3d1c5 : no xattrs
/data/brick/callrec/.glusterfs/61/bf/61bf5407-c250-4448-a68f-d3bd82821260 : no xattrs
/data/brick/callrec/.glusterfs/5e/e3/5ee303e4-44fe-44a1-83b5-5cd6a91bc76a : no xattrs
/data/brick/callrec/.glusterfs/68/12/68127ab7-807d-4a99-a609-6569d366d3aa : no xattrs
/data/brick/callrec/.glusterfs/a6/cb/a6cb8d18-f2b3-48d4-ac37-c022603b8f8e : no xattrs
/data/brick/callrec/.glusterfs/4e/86/4e86088b-f975-4228-ad5a-8d08fbd456fe : no xattrs
/data/brick/callrec/.glusterfs/c4/a2/c4a200b9-bb31-46cb-92c6-551fd6ad9ec3 : no xattrs
/data/brick/callrec/.glusterfs/e3/56/e356342a-3dab-4049-b4bc-ea0de4a8ee87 : no xattrs
/data/brick/callrec/.glusterfs/da/d2/dad27361-d38f-46b3-a77c-b6ac5df054d9 : no xattrs
/data/brick/callrec/.glusterfs/a5/be/a5be4523-2d2c-4894-b332-a7e3370658d3 : file not found
/data/brick/callrec/.glusterfs/b0/f4/b0f4e951-63b3-4152-b0b0-7aad8e6ed729 : no xattrs
/data/brick/callrec/.glusterfs/23/e7/23e75764-9794-4efd-b319-6eac717e6f28 : no xattrs

Where do I go from here?

Cheers,
Kingsley.

On Fri, 2016-07-15 at 17:08 +0530, Ravishankar N wrote:
> Can you check the getfattr output of a few of those 129 entries from all
> bricks? You basically need to see if there are non-zero afr xattrs for
> the files in question, which would indicate a pending heal.
>
> -Ravi
>
> On 07/08/2016 03:12 PM, Kingsley wrote:
> > Further to this, I've noticed something that might have been a bit of a
> > red herring in my previous post.
> >
> > We have 3 volumes - gv0, voicemail and callrec. callrec is the only one
> > showing self-heal entries, yet all of the "No such file or directory"
> > errors in glustershd.log appear to refer to gv0. gv0 has no self-heal
> > entries shown by "gluster volume heal gv0 info", and no split-brain
> > entries either.
> >
> > If I de-dupe those log entries, I just get these:
> >
> > [root@gluster1a-1 glusterfs]# grep gfid: glustershd.log | awk -F\] '{print $3}' | sort | uniq
> > 0-gv0-client-0: remote operation failed: No such file or directory. Path: <gfid:08713e43-7bcb-43f3-818a-7b062abd6e95> (08713e43-7bcb-43f3-818a-7b062abd6e95)
> > 0-gv0-client-0: remote operation failed: No such file or directory. Path: <gfid:436dcbec-a12a-4df9-b8ef-bae977c98537> (436dcbec-a12a-4df9-b8ef-bae977c98537)
> > 0-gv0-client-0: remote operation failed: No such file or directory. Path: <gfid:81dc9194-2379-40b5-a949-f7550433b2e0> (81dc9194-2379-40b5-a949-f7550433b2e0)
> > 0-gv0-client-0: remote operation failed: No such file or directory. Path: <gfid:b1e273ad-9eb1-4f97-a41c-39eecb149bd6> (b1e273ad-9eb1-4f97-a41c-39eecb149bd6)
> > 0-gv0-client-1: remote operation failed: No such file or directory. Path: <gfid:08713e43-7bcb-43f3-818a-7b062abd6e95> (08713e43-7bcb-43f3-818a-7b062abd6e95)
> > 0-gv0-client-1: remote operation failed: No such file or directory. Path: <gfid:436dcbec-a12a-4df9-b8ef-bae977c98537> (436dcbec-a12a-4df9-b8ef-bae977c98537)
> > 0-gv0-client-1: remote operation failed: No such file or directory. Path: <gfid:81dc9194-2379-40b5-a949-f7550433b2e0> (81dc9194-2379-40b5-a949-f7550433b2e0)
> > 0-gv0-client-3: remote operation failed: No such file or directory. Path: <gfid:08713e43-7bcb-43f3-818a-7b062abd6e95> (08713e43-7bcb-43f3-818a-7b062abd6e95)
> > 0-gv0-client-3: remote operation failed: No such file or directory. Path: <gfid:81dc9194-2379-40b5-a949-f7550433b2e0> (81dc9194-2379-40b5-a949-f7550433b2e0)
> >
> >
> > There doesn't seem to be anything obvious to me in glustershd.log about the
> > callrec volume.
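(As an aside, the per-file check Ravi suggested can be scripted. Below is a minimal shell sketch: it converts a heal-info "<gfid:...>" entry into the corresponding .glusterfs path and shows where to dump the trusted.afr.* xattrs, whose non-zero values would indicate a pending heal. The brick path is an assumption based on the callrec layout described above; adjust it per volume and run the getfattr step as root on each brick.)

```shell
# Assumed brick root, taken from the callrec volume layout in this thread.
BRICK=/data/brick/callrec

# Map a heal-info entry like "<gfid:UUID>" to its backing file under
# .glusterfs: the first two byte pairs of the UUID are the directory levels.
gfid_to_path() {
    g=$1
    g=${g#<gfid:}                          # strip "<gfid:" prefix
    g=${g%>}                               # strip trailing ">"
    d1=$(printf '%s' "$g" | cut -c1-2)     # first directory level
    d2=$(printf '%s' "$g" | cut -c3-4)     # second directory level
    printf '%s/.glusterfs/%s/%s/%s\n' "$BRICK" "$d1" "$d2" "$g"
}

p=$(gfid_to_path '<gfid:08713e43-7bcb-43f3-818a-7b062abd6e95>')
echo "$p"
# -> /data/brick/callrec/.glusterfs/08/71/08713e43-7bcb-43f3-818a-7b062abd6e95

# On each brick (as root), dump the AFR xattrs in hex; non-zero values in
# trusted.afr.* mean the self-heal daemon still considers the file dirty:
#   getfattr -d -m trusted.afr -e hex "$p"
```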
> > On one of the bricks that stayed up:
> >
> > [root@gluster1a-1 glusterfs]# grep callrec glustershd.log
> > [2016-07-08 08:54:03.424446] I [graph.c:269:gf_add_cmdline_options] 0-callrec-replicate-0: adding option 'node-uuid' for volume 'callrec-replicate-0' with value 'b9d3b1a2-3214-41ba-a1c9-9c7d4b18ff5d'
> > [2016-07-08 08:54:03.429663] I [client.c:2280:notify] 0-callrec-client-0: parent translators are ready, attempting connect on transport
> > [2016-07-08 08:54:03.432198] I [client.c:2280:notify] 0-callrec-client-1: parent translators are ready, attempting connect on transport
> > [2016-07-08 08:54:03.434375] I [client.c:2280:notify] 0-callrec-client-2: parent translators are ready, attempting connect on transport
> > [2016-07-08 08:54:03.436521] I [client.c:2280:notify] 0-callrec-client-3: parent translators are ready, attempting connect on transport
> > 1: volume callrec-client-0
> > 5: option remote-subvolume /data/brick/callrec
> > 11: volume callrec-client-1
> > 15: option remote-subvolume /data/brick/callrec
> > 21: volume callrec-client-2
> > 25: option remote-subvolume /data/brick/callrec
> > 31: volume callrec-client-3
> > 35: option remote-subvolume /data/brick/callrec
> > 41: volume callrec-replicate-0
> > 50: subvolumes callrec-client-0 callrec-client-1 callrec-client-2 callrec-client-3
> > 159: subvolumes callrec-replicate-0 gv0-replicate-0 voicemail-replicate-0
> > [2016-07-08 08:54:03.458708] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-callrec-client-0: changing port to 49153 (from 0)
> > [2016-07-08 08:54:03.465684] I [client-handshake.c:1413:select_server_supported_programs] 0-callrec-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
> > [2016-07-08 08:54:03.465921] I [client-handshake.c:1200:client_setvolume_cbk] 0-callrec-client-0: Connected to callrec-client-0, attached to remote volume '/data/brick/callrec'.
> > [2016-07-08 08:54:03.465927] I [client-handshake.c:1210:client_setvolume_cbk] 0-callrec-client-0: Server and Client lk-version numbers are not same, reopening the fds
> > [2016-07-08 08:54:03.465967] I [MSGID: 108005] [afr-common.c:3669:afr_notify] 0-callrec-replicate-0: Subvolume 'callrec-client-0' came back up; going online.
> > [2016-07-08 08:54:03.466108] I [client-handshake.c:188:client_set_lk_version_cbk] 0-callrec-client-0: Server lk version = 1
> > [2016-07-08 08:54:04.266979] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-callrec-client-1: changing port to 49153 (from 0)
> > [2016-07-08 08:54:04.732625] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-callrec-client-2: changing port to 49153 (from 0)
> > [2016-07-08 08:54:04.738533] I [client-handshake.c:1413:select_server_supported_programs] 0-callrec-client-2: Using Program GlusterFS 3.3, Num (1298437), Version (330)
> > [2016-07-08 08:54:04.738911] I [client-handshake.c:1200:client_setvolume_cbk] 0-callrec-client-2: Connected to callrec-client-2, attached to remote volume '/data/brick/callrec'.
> > [2016-07-08 08:54:04.738921] I [client-handshake.c:1210:client_setvolume_cbk] 0-callrec-client-2: Server and Client lk-version numbers are not same, reopening the fds
> > [2016-07-08 08:54:04.739181] I [client-handshake.c:188:client_set_lk_version_cbk] 0-callrec-client-2: Server lk version = 1
> > [2016-07-08 08:54:05.271388] I [client-handshake.c:1413:select_server_supported_programs] 0-callrec-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
> > [2016-07-08 08:54:05.271858] I [client-handshake.c:1200:client_setvolume_cbk] 0-callrec-client-1: Connected to callrec-client-1, attached to remote volume '/data/brick/callrec'.
> > [2016-07-08 08:54:05.271879] I [client-handshake.c:1210:client_setvolume_cbk] 0-callrec-client-1: Server and Client lk-version numbers are not same, reopening the fds
> > [2016-07-08 08:54:05.272185] I [client-handshake.c:188:client_set_lk_version_cbk] 0-callrec-client-1: Server lk version = 1
> > [2016-07-08 08:54:06.302301] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-callrec-client-3: changing port to 49153 (from 0)
> > [2016-07-08 08:54:06.305473] I [client-handshake.c:1413:select_server_supported_programs] 0-callrec-client-3: Using Program GlusterFS 3.3, Num (1298437), Version (330)
> > [2016-07-08 08:54:06.305915] I [client-handshake.c:1200:client_setvolume_cbk] 0-callrec-client-3: Connected to callrec-client-3, attached to remote volume '/data/brick/callrec'.
> > [2016-07-08 08:54:06.305925] I [client-handshake.c:1210:client_setvolume_cbk] 0-callrec-client-3: Server and Client lk-version numbers are not same, reopening the fds
> > [2016-07-08 08:54:06.306307] I [client-handshake.c:188:client_set_lk_version_cbk] 0-callrec-client-3: Server lk version = 1
> >
> >
> > And on the brick that went offline for a few days:
> >
> > [root@gluster2a-1 glusterfs]# grep callrec glustershd.log
> > [2016-07-08 08:54:06.900964] I [graph.c:269:gf_add_cmdline_options] 0-callrec-replicate-0: adding option 'node-uuid' for volume 'callrec-replicate-0' with value 'e96ae8cd-f38f-4c2a-bb3b-baeb78f88f13'
> > [2016-07-08 08:54:06.906449] I [client.c:2280:notify] 0-callrec-client-0: parent translators are ready, attempting connect on transport
> > [2016-07-08 08:54:06.908851] I [client.c:2280:notify] 0-callrec-client-1: parent translators are ready, attempting connect on transport
> > [2016-07-08 08:54:06.911045] I [client.c:2280:notify] 0-callrec-client-2: parent translators are ready, attempting connect on transport
> > [2016-07-08 08:54:06.913528] I [client.c:2280:notify] 0-callrec-client-3: parent translators are ready, attempting connect on transport
> > 1: volume callrec-client-0
> > 5: option remote-subvolume /data/brick/callrec
> > 11: volume callrec-client-1
> > 15: option remote-subvolume /data/brick/callrec
> > 21: volume callrec-client-2
> > 25: option remote-subvolume /data/brick/callrec
> > 31: volume callrec-client-3
> > 35: option remote-subvolume /data/brick/callrec
> > 41: volume callrec-replicate-0
> > 50: subvolumes callrec-client-0 callrec-client-1 callrec-client-2 callrec-client-3
> > 159: subvolumes callrec-replicate-0 gv0-replicate-0 voicemail-replicate-0
> > [2016-07-08 08:54:06.938769] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-callrec-client-2: changing port to 49153 (from 0)
> > [2016-07-08 08:54:06.948204] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-callrec-client-1: changing port to 49153 (from 0)
> > [2016-07-08 08:54:06.951625] I [client-handshake.c:1413:select_server_supported_programs] 0-callrec-client-2: Using Program GlusterFS 3.3, Num (1298437), Version (330)
> > [2016-07-08 08:54:06.951849] I [client-handshake.c:1200:client_setvolume_cbk] 0-callrec-client-2: Connected to callrec-client-2, attached to remote volume '/data/brick/callrec'.
> > [2016-07-08 08:54:06.951858] I [client-handshake.c:1210:client_setvolume_cbk] 0-callrec-client-2: Server and Client lk-version numbers are not same, reopening the fds
> > [2016-07-08 08:54:06.951906] I [MSGID: 108005] [afr-common.c:3669:afr_notify] 0-callrec-replicate-0: Subvolume 'callrec-client-2' came back up; going online.
> > [2016-07-08 08:54:06.951938] I [client-handshake.c:188:client_set_lk_version_cbk] 0-callrec-client-2: Server lk version = 1
> > [2016-07-08 08:54:07.152217] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-callrec-client-3: changing port to 49153 (from 0)
> > [2016-07-08 08:54:07.167137] I [client-handshake.c:1413:select_server_supported_programs] 0-callrec-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
> > [2016-07-08 08:54:07.167474] I [client-handshake.c:1200:client_setvolume_cbk] 0-callrec-client-1: Connected to callrec-client-1, attached to remote volume '/data/brick/callrec'.
> > [2016-07-08 08:54:07.167483] I [client-handshake.c:1210:client_setvolume_cbk] 0-callrec-client-1: Server and Client lk-version numbers are not same, reopening the fds
> > [2016-07-08 08:54:07.167664] I [client-handshake.c:188:client_set_lk_version_cbk] 0-callrec-client-1: Server lk version = 1
> > [2016-07-08 08:54:07.240249] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-callrec-client-0: changing port to 49153 (from 0)
> > [2016-07-08 08:54:07.243156] I [client-handshake.c:1413:select_server_supported_programs] 0-callrec-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
> > [2016-07-08 08:54:07.243512] I [client-handshake.c:1200:client_setvolume_cbk] 0-callrec-client-0: Connected to callrec-client-0, attached to remote volume '/data/brick/callrec'.
> > [2016-07-08 08:54:07.243520] I [client-handshake.c:1210:client_setvolume_cbk] 0-callrec-client-0: Server and Client lk-version numbers are not same, reopening the fds
> > [2016-07-08 08:54:07.243804] I [client-handshake.c:188:client_set_lk_version_cbk] 0-callrec-client-0: Server lk version = 1
> > [2016-07-08 08:54:07.400188] I [client-handshake.c:1413:select_server_supported_programs] 0-callrec-client-3: Using Program GlusterFS 3.3, Num (1298437), Version (330)
> > [2016-07-08 08:54:07.400574] I [client-handshake.c:1200:client_setvolume_cbk] 0-callrec-client-3: Connected to callrec-client-3, attached to remote volume '/data/brick/callrec'.
> > [2016-07-08 08:54:07.400583] I [client-handshake.c:1210:client_setvolume_cbk] 0-callrec-client-3: Server and Client lk-version numbers are not same, reopening the fds
> > [2016-07-08 08:54:07.400802] I [client-handshake.c:188:client_set_lk_version_cbk] 0-callrec-client-3: Server lk version = 1
> >
> >
> > Cheers,
> > Kingsley.
> >
> > On Fri, 2016-07-08 at 10:08 +0100, Kingsley wrote:
> >> Hi,
> >>
> >> One of our bricks was offline for a few days when it didn't reboot after
> >> a yum update (the gluster version wasn't changed). The volume heal info
> >> is showing the same 129 entries, all of the format
> >> <gfid:08713e43-7bcb-43f3-818a-7b062abd6e95>, on the 3 bricks that
> >> remained up, and no entries on the brick that was offline.
> >>
> >> glustershd.log on the brick that was offline has stuff like this in it:
> >>
> >> [2016-07-08 08:54:07.411486] I [client-handshake.c:1200:client_setvolume_cbk] 0-gv0-client-1: Connected to gv0-client-1, attached to remote volume '/data/brick/gv0'.
> >> [2016-07-08 08:54:07.411493] I [client-handshake.c:1210:client_setvolume_cbk] 0-gv0-client-1: Server and Client lk-version numbers are not same, reopening the fds
> >> [2016-07-08 08:54:07.411678] I [client-handshake.c:188:client_set_lk_version_cbk] 0-gv0-client-1: Server lk version = 1
> >> [2016-07-08 08:54:07.793661] I [client-handshake.c:1200:client_setvolume_cbk] 0-gv0-client-3: Connected to gv0-client-3, attached to remote volume '/data/brick/gv0'.
> >> [2016-07-08 08:54:07.793688] I [client-handshake.c:1210:client_setvolume_cbk] 0-gv0-client-3: Server and Client lk-version numbers are not same, reopening the fds
> >> [2016-07-08 08:54:07.794091] I [client-handshake.c:188:client_set_lk_version_cbk] 0-gv0-client-3: Server lk version = 1
> >>
> >> but glustershd.log on the other 3 bricks has many lines looking like
> >> this:
> >>
> >> [2016-07-08 09:05:17.203017] W [client-rpc-fops.c:2772:client3_3_lookup_cbk] 0-gv0-client-3: remote operation failed: No such file or directory. Path: <gfid:81dc9194-2379-40b5-a949-f7550433b2e0> (81dc9194-2379-40b5-a949-f7550433b2e0)
> >> [2016-07-08 09:05:17.203405] W [client-rpc-fops.c:2772:client3_3_lookup_cbk] 0-gv0-client-0: remote operation failed: No such file or directory. Path: <gfid:b1e273ad-9eb1-4f97-a41c-39eecb149bd6> (b1e273ad-9eb1-4f97-a41c-39eecb149bd6)
> >> [2016-07-08 09:05:17.204035] W [client-rpc-fops.c:2772:client3_3_lookup_cbk] 0-gv0-client-0: remote operation failed: No such file or directory. Path: <gfid:436dcbec-a12a-4df9-b8ef-bae977c98537> (436dcbec-a12a-4df9-b8ef-bae977c98537)
> >> [2016-07-08 09:05:17.204225] W [client-rpc-fops.c:2772:client3_3_lookup_cbk] 0-gv0-client-1: remote operation failed: No such file or directory. Path: <gfid:436dcbec-a12a-4df9-b8ef-bae977c98537> (436dcbec-a12a-4df9-b8ef-bae977c98537)
> >> [2016-07-08 09:05:17.204651] W [client-rpc-fops.c:2772:client3_3_lookup_cbk] 0-gv0-client-0: remote operation failed: No such file or directory. Path: <gfid:08713e43-7bcb-43f3-818a-7b062abd6e95> (08713e43-7bcb-43f3-818a-7b062abd6e95)
> >> [2016-07-08 09:05:17.204879] W [client-rpc-fops.c:2772:client3_3_lookup_cbk] 0-gv0-client-1: remote operation failed: No such file or directory. Path: <gfid:08713e43-7bcb-43f3-818a-7b062abd6e95> (08713e43-7bcb-43f3-818a-7b062abd6e95)
> >> [2016-07-08 09:05:17.205042] W [client-rpc-fops.c:2772:client3_3_lookup_cbk] 0-gv0-client-3: remote operation failed: No such file or directory. Path: <gfid:08713e43-7bcb-43f3-818a-7b062abd6e95> (08713e43-7bcb-43f3-818a-7b062abd6e95)
> >>
> >> How do I fix this? I need to update the other bricks but am reluctant to
> >> do so until the volume is in good shape first.
> >>
> >> We're running Gluster 3.6.3 on CentOS 7. Volume info:
> >>
> >> Volume Name: callrec
> >> Type: Replicate
> >> Volume ID: a39830b7-eddb-4061-b381-39411274131a
> >> Status: Started
> >> Number of Bricks: 1 x 4 = 4
> >> Transport-type: tcp
> >> Bricks:
> >> Brick1: gluster1a-1:/data/brick/callrec
> >> Brick2: gluster1b-1:/data/brick/callrec
> >> Brick3: gluster2a-1:/data/brick/callrec
> >> Brick4: gluster2b-1:/data/brick/callrec
> >> Options Reconfigured:
> >> performance.flush-behind: off
> >>
> >>
> > _______________________________________________
> > Gluster-users mailing list
> > Gluster-users@xxxxxxxxxxx
> > http://www.gluster.org/mailman/listinfo/gluster-users
> >
>
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users