Re: Confusion supreme

Hi

> First step would be to ensure that all clients are connected to all bricks - this will reduce the chance of new problems.

Well, when the disk broke, one brick was obviously offline. But
apart from that, I'm not sure I understand what you mean by
"ensure that all clients are connected to all bricks".

The way I have it is that on one node the local brick is mounted
and its filesystem is used by applications. On the other two
nodes glusterfsd/glusterfs are running, but the bricks are
neither mounted nor used. Which node is in use can vary, but it
is always only one at a time.
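
Concretely (volume name and mountpoint made up for illustration),
the node in use mounts the volume locally with something like

  mount -t glusterfs localhost:/<VOLNAME> /srv/data

and the applications then work under /srv/data.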

> For some reason there are problems with the broken node.

After replacing the broken disk I had the same problem on all
nodes.

> Did you reduce the replica to 2 before reinstalling the broken node and re-adding it to the TSP?

Yes. But even though I said "replica 2", the remove-brick command
refused to run without force. So I had to use force. Maybe that
is the cause of the subsequent inconsistencies.
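
For the record, the command was roughly this (hostname and brick
path illustrative, and quoted from memory):

  gluster volume remove-brick <VOLNAME> replica 2 badnode:/gluster/brick force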

> Try to get the attributes and the blames of a few files.

It's too late now; I fixed the problem, so I can no longer investigate
it.
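
(For next time, though: I take it "attributes and blames" means
something like

  getfattr -d -m . -e hex /path/to/brick/<file>

on each brick, to compare the trusted.gfid and trusted.afr.*
entries?)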

What I found is that the unhealable files existed on all three
bricks, but with different contents, ownerships and permissions.
Something like

-rw-r--r--   2 2004 2004   4074 Jun 12  2006 brick1/.glusterfs/00/01/0001055c-41e1-49da-aa98-9bc0246f70cd
-rw-r--r--   2    0    0      0 Jun 12  2006 brick2/.glusterfs/00/01/0001055c-41e1-49da-aa98-9bc0246f70cd
-rw-r--r--   2    0    0      0 Jun 12  2006 brick3/.glusterfs/00/01/0001055c-41e1-49da-aa98-9bc0246f70cd

where the file in brick 1 is the good one and the root-owned empty
files in bricks 2 and 3 made healing impossible. (The above listings
are illustrative and I don't remember whether the file mtimes matched
or not.)
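
(The link count of 2 means these gfid entries are hardlinks to the
real files, so the corresponding path on a brick can presumably be
found with something like

  find /path/to/brick -samefile \
      /path/to/brick/.glusterfs/00/01/0001055c-41e1-49da-aa98-9bc0246f70cd

with the paths again being illustrative.)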

The solution was to rsync -a the unhealable files from .glusterfs/
on the good brick to .glusterfs/ on the bad bricks and restart
healing. Then shd reported copying the files' metadata and the
volume was healed.
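
Roughly, with paths and hostnames illustrative:

  rsync -a /srv/brick1/.glusterfs/00/01/0001055c-41e1-49da-aa98-9bc0246f70cd \
      badnode:/srv/brick2/.glusterfs/00/01/
  gluster volume heal <VOLNAME>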

It is all very strange and I think I can smell bugs, but I can't
exactly put my finger on them.

Cheers,

Z


--
Слава Україні!
Путлер хуйло!