Re: Self-heal still not finished after 2 days


 




On 06/30/2014 09:17 AM, John Gardeniers wrote:
Hi Pranith,

On 30/06/14 13:37, Pranith Kumar Karampuri wrote:
On 06/30/2014 08:48 AM, John Gardeniers wrote:
Hi again Pranith,

On 30/06/14 11:58, Pranith Kumar Karampuri wrote:
Oops, I see you are the same user who posted about VM files self-heal.
Sorry I couldn't get back in time. So you are using 3.4.2.
Could you please post the log files of the mount and the bricks? That
should help us find more information about any issues.

When you say the log for the mount, which log is that? There are none
that I can identify with the mount.

gluster volume heal <volname> info heal-failed records the last 1024
failures. It also prints the timestamp of when the failures occurred.
Even after the heal is successful it keeps showing the errors, so the
timestamp of when the heal failed is important. Because these
commands cause such confusion, we have deprecated them in the
upcoming release (3.6).

So far I've been focusing on the heal-failed count, which I fully, and I
believe understandably, expect to show zero when there are no errors.
Now that I look at the timestamps of those errors I realise they are all
from *before* the slave brick was added back in. May I assume then that
in reality there are no unhealed files? If this is correct, I must point
out that reporting errors when there are none is a massive design flaw.
It means things like Nagios checks, such as the one we use, are useless,
which makes monitoring near enough to impossible.
Exactly, that is why we deprecated it. The goal is to show only the
files that need to be healed, which is achieved in 3.5.1 with just
"gluster volume heal <volname> info". It shows the exact number of
files/directories that need to be healed; once that reaches zero,
we know the healing is complete. But all of this is useful only when
the brick is not erased. We still need to improve monitoring for the
case where the brick is erased and a full volume self-heal is triggered
using "gluster volume heal <volname> full", as you did.

Raised the following bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1114415 to address the same.

Thanks a lot for your input, John. We shall fix these with priority.

If I run "watch gluster volume heal gluster-rhev info" I get
constantly changing output similar to that below, except the numbers and
the files keep changing. I believe this is normal, as it is what I saw
even when everything was running normally (before the problem started).
This is also why the Nagios check uses "gluster volume heal gluster-rhev
info heal-failed". If that command is removed and not replaced with
something else, it removes any possibility of monitoring heal failures.

That is generally good news, but we can still try to make sure that all the heals are complete. Most self-heal failures are intermittent, and subsequent healing attempts succeed.
We found that the only thing the "gluster volume heal <volname> info heal-failed"
command achieved was unnecessary panic among users. That is the main reason we deprecated it.

The main things we need to know are the files which need to be healed and the files which are in split-brain. For the rest we generally need more information for debugging, so logs are preferable. That is going to be the monitoring story going forward. We still need to improve the "heal full" monitoring story, though. I hope things will progressively
improve.
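[Editor's note: as a rough illustration of the monitoring story described above, a hypothetical check could combine the two counts (entries still pending heal, entries in split-brain) into Nagios exit codes. This is only a sketch with made-up thresholds; the availability of a split-brain count depends on your release.]

```python
# Hypothetical Nagios-style mapping of GlusterFS heal counts to exit
# codes: split-brain entries need manual intervention (CRITICAL),
# while pending heals usually resolve on their own (WARNING).
OK, WARNING, CRITICAL = 0, 1, 2

def heal_state(pending_entries: int, split_brain_entries: int) -> int:
    if split_brain_entries > 0:
        return CRITICAL   # needs manual resolution
    if pending_entries > 0:
        return WARNING    # self-heal should clear these over time
    return OK

print(heal_state(0, 0), heal_state(11, 0), heal_state(11, 2))  # 0 1 2
```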

Pranith.

Brick jupiter.om.net:/gluster_brick_1
Number of entries: 11
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/c9de61dc-286a-456a-bc3b-e2755ca5c8b3/ac3f2166-61af-4fc0-99c4-a76e9b63000e
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/0483fad0-2dfc-4dbe-98d4-62dbdbe120f3/1d3812af-caa4-4d99-a598-04dfd78eb6cb
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/bc3b165b-619a-4afa-9d5c-5ea599a99c61/2203cc21-db54-4f69-bbf1-4884a86c05d0
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/dd344276-04ef-48d1-9953-02bee5cc3b87/786b1a84-bc9c-48c7-859b-844a383e47ec
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/3d37825c-c798-421b-96ff-ce07128ee3ad/5119ad56-f0c9-4b3f-8f84-71e5f4b6b693
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/ca843e93-447e-4155-83fc-e0d7586b4b50/215e0914-4def-4230-9f2a-a9ece61f2038
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/51927b5f-2a3c-4c04-a90b-4500be0a526c/d14a842e-5fd9-4f6f-b08e-f5895b8b72fd
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/f06bcb57-4a0a-446d-b71d-773595bb0e2f/4dc55bae-5881-4a04-9815-18cdeb8bcfc8
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/cc11aab2-530e-4ffa-84ee-2989d39efeb8/49b2ff17-096b-45cf-a973-3d1466e16066
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/903c3ef4-feaa-4262-9654-69ef118a43ce/8c6732d3-4ce8-422e-a74e-48151e7f7102
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/66dcc584-36ce-43e6-8ce4-538fc6ff03d1/44192148-0708-4bd3-b8da-30baa85b89bf

Brick nix.om.net:/gluster_brick_1
Number of entries: 11
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/29b3b7cf-7b21-44e7-bede-b86e12d2b69a/7fbce4ad-185b-42d8-a093-168560a3df89
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/903c3ef4-feaa-4262-9654-69ef118a43ce/8c6732d3-4ce8-422e-a74e-48151e7f7102
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/bc3b165b-619a-4afa-9d5c-5ea599a99c61/2203cc21-db54-4f69-bbf1-4884a86c05d0
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/c9de61dc-286a-456a-bc3b-e2755ca5c8b3/ac3f2166-61af-4fc0-99c4-a76e9b63000e
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/3d37825c-c798-421b-96ff-ce07128ee3ad/5119ad56-f0c9-4b3f-8f84-71e5f4b6b693
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/66dcc584-36ce-43e6-8ce4-538fc6ff03d1/44192148-0708-4bd3-b8da-30baa85b89bf
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/0483fad0-2dfc-4dbe-98d4-62dbdbe120f3/1d3812af-caa4-4d99-a598-04dfd78eb6cb
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/dd344276-04ef-48d1-9953-02bee5cc3b87/786b1a84-bc9c-48c7-859b-844a383e47ec
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/ca843e93-447e-4155-83fc-e0d7586b4b50/215e0914-4def-4230-9f2a-a9ece61f2038
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/51927b5f-2a3c-4c04-a90b-4500be0a526c/d14a842e-5fd9-4f6f-b08e-f5895b8b72fd
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/f06bcb57-4a0a-446d-b71d-773595bb0e2f/4dc55bae-5881-4a04-9815-18cdeb8bcfc8
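[Editor's note: output in the shape shown above can be parsed mechanically for monitoring. A small sketch follows; the exact output format of "gluster volume heal <volname> info" may differ between releases, so treat the regular expressions as assumptions based on the sample in this thread.]

```python
# Parse "Brick ..." / "Number of entries: N" pairs from the text
# output of "gluster volume heal <volname> info" (format assumed
# from the sample in this thread).
import re

def parse_heal_info(output):
    """Return {brick: entry_count} from heal-info text output."""
    counts = {}
    brick = None
    for line in output.splitlines():
        m = re.match(r"Brick\s+(\S+)", line)
        if m:
            brick = m.group(1)
            continue
        m = re.match(r"Number of entries:\s*(\d+)", line)
        if m and brick is not None:
            counts[brick] = int(m.group(1))
    return counts

sample = (
    "Brick jupiter.om.net:/gluster_brick_1\n"
    "Number of entries: 11\n"
    "\n"
    "Brick nix.om.net:/gluster_brick_1\n"
    "Number of entries: 11\n"
)
print(parse_heal_info(sample))
# {'jupiter.om.net:/gluster_brick_1': 11, 'nix.om.net:/gluster_brick_1': 11}
```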


Do you think it is possible for you to come to #gluster IRC on freenode?
I'll see what I can do. I've never used IRC before and first need to
find out how. :)

Pranith

This is probably a stupid question but let me ask it anyway. When a
brick's contents are erased from the backend,
we need to make sure of the following two things:
1) The extended attributes on the root of the surviving brick show
pending operations for the brick that was erased
2) Execute "gluster volume heal <volname> full"
1) While gluster was stopped I merely did an rm -rf on both the data
sub-directory and the .gluster sub-directory. How do I show that there
are pending operations?
2) Yes, I did run that.

Did you do the steps above?

Since you are on 3.4.2, I think the best way to check which files are
healed is to inspect the extended attributes on the backend. Could you
please post them again?
I don't quite understand what you're asking for. I understand attributes
as belonging to files and directories, not operations. Please elaborate.
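[Editor's note: what is being asked for here are the trusted.afr.* extended attributes stored on each brick, typically dumped as root with something like `getfattr -d -m trusted.afr -e hex /gluster_brick_1`. In gluster 3.x AFR, each value is documented as three big-endian 32-bit pending-operation counters (data, metadata, entry); non-zero counters mean operations are pending against the other brick. A sketch of decoding one hex value, assuming that layout (verify it for your version):]

```python
# Decode a trusted.afr.<volname>-client-N value as dumped by
# "getfattr -e hex": 12 bytes = three big-endian uint32 counters
# (pending data, metadata, and entry operations). Layout assumed
# from gluster 3.x AFR documentation.
import struct

def decode_afr(hex_value):
    raw = bytes.fromhex(hex_value[2:] if hex_value.startswith("0x") else hex_value)
    data, metadata, entry = struct.unpack(">III", raw[:12])
    return {"data": data, "metadata": metadata, "entry": entry}

# e.g. 5 pending data operations and 2 pending entry operations:
print(decode_afr("0x000000050000000000000002"))
# {'data': 5, 'metadata': 0, 'entry': 2}
```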

Pranith

On 06/30/2014 07:12 AM, Pranith Kumar Karampuri wrote:
On 06/30/2014 04:03 AM, John Gardeniers wrote:
Hi All,

We have 2 servers, each with one 5TB brick, configured as replica 2.
After a series of events caused the 2 bricks to become way out of
step, gluster was turned off on one server and its brick was wiped of
everything, but the attributes were untouched.

This weekend we stopped the client and gluster and made a backup
of the
remaining brick, just to play safe. Gluster was then turned back on,
first on the "master" and then on the "slave". Self-heal kicked in
and
started rebuilding the second brick. However, after 2 full days all
files in the volume are still showing heal failed errors.

The rebuild was, in my opinion at least, very slow, taking most of a
day
even though the system is on a 10Gb LAN. The data is a little under
1.4TB committed, 2TB allocated.
How much more to be healed? 0.6TB?
Once the 2 bricks were very close to having the same amount of space
used things slowed right down. For the last day both bricks show a
very
slow increase in used space, even though there are no changes being
written by the client. By slow I mean just a few KB per minute.
Is I/O still in progress on the mount? In 3.4.x, self-heal does not
happen on files with ongoing I/O from mounts, so that could be the
reason if I/O is going on.
The logs are confusing, to say the least. In
etc-glusterfs-glusterd.vol.log on both servers there are thousands of
entries such as (possibly because I was using watch to monitor
self-heal
progress):

[2014-06-29 21:41:11.289742] I
[glusterd-volume-ops.c:478:__glusterd_handle_cli_heal_volume]
0-management: Received heal vol req for volume gluster-rhev
What version of gluster are you using?
That timestamp is the latest on either server; that's about 9 hours ago
as I type this. I find that a bit disconcerting. I have requested
volume heal-failed info since then.

The brick log on the "master" server (the one from which we are
rebuilding the new brick) contains no entries since before the
rebuild
started.

On the "slave" server the brick log shows a lot of entries such as:

[2014-06-28 08:49:47.887353] E [marker.c:2140:marker_removexattr_cbk]
0-gluster-rhev-marker: Numerical result out of range occurred while
creating symlinks
[2014-06-28 08:49:47.887382] I
[server-rpc-fops.c:745:server_removexattr_cbk] 0-gluster-rhev-server:
10311315: REMOVEXATTR
/44d30b24-1ed7-48a0-b905-818dc0a006a2/images/02d4bd3c-b057-4f04-ada5-838f83d0b761/d962466d-1894-4716-b5d0-3a10979145ec


(1c1f53ac-afe2-420d-8c93-b1eb53ffe8b1) of key  ==> (Numerical result
out
of range)
CC Raghavendra who knows about marker translator.
Those entries are around the time the rebuild was starting. The final
entries in that same log (immediately after those listed above) are:

[2014-06-29 12:47:28.473999] I
[server-rpc-fops.c:243:server_inodelk_cbk] 0-gluster-rhev-server:
2869:
INODELK (null) (c67e9bbe-5956-4c61-b650-2cd5df4c4df0) ==> (No such
file
or directory)
[2014-06-29 12:47:28.489527] I
[server-rpc-fops.c:1572:server_open_cbk]
0-gluster-rhev-server: 2870: OPEN (null)
(c67e9bbe-5956-4c61-b650-2cd5df4c4df0) ==> (No such file or
directory)
These log messages are harmless and were fixed in 3.5, I think. Are
you on 3.4.x?

As I type it's 2014-06-30 08:31.

What do they mean and how can I rectify it?

regards,
John

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users
______________________________________________________________________
This email has been scanned by the Symantec Email Security.cloud
service.
For more information please visit http://www.symanteccloud.com
______________________________________________________________________





