Re: Brick is not connected (and other funny things)

On storage2 all is ok, but I have a split-brain condition. /export/brick1 on storage1 doesn't contain data...
FYI, that's not split-brain. Split-brain is where two differing copies of a file both think they're the correct one.

I replied to the questions you asked on my blog, but I'll repeat the answers here:

Should I stop the volume with "gluster volume stop"? Can I keep the other volumes (and glusterfs) up?

No need to stop the volume. The brick cannot start without the volume-id, so there's nothing running that could be affected.

Next I have to reassign the volume-id. Is this right?
At this point can I restart the corrupted volume?

Once the volume-id is set you can restart the brick by either restarting glusterd or "gluster volume start $vol force".
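As a reference for the volume-id step, glusterd keeps the expected id in its own info file, so it can be copied onto the replacement brick as an extended attribute. A sketch; the volume name and brick path below are assumptions, adjust them to your setup:

```shell
# Assumed names: volume "cloud_rootdisk", replacement brick /export/brick1.
vol=cloud_rootdisk
brick=/export/brick1

# Pull the volume-id glusterd recorded for this volume and strip the dashes.
id=$(grep volume-id /var/lib/glusterd/vols/$vol/info | cut -d= -f2 | tr -d '-')

# Write it back onto the brick root as the trusted.glusterfs.volume-id xattr.
setfattr -n trusted.glusterfs.volume-id -v 0x$id $brick
```

With the xattr in place, "gluster volume start $vol force" (or a glusterd restart) should bring the brick up.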

Should I start any healing procedure? How?

I would run "gluster volume heal $vol full" to ensure all files are crawled for healing.
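You can also watch the heal progress from the CLI; a sketch, again assuming the volume is named "cloud_rootdisk":

```shell
# Trigger a full heal crawl across all bricks.
gluster volume heal cloud_rootdisk full

# List entries still pending heal (should shrink to zero over time).
gluster volume heal cloud_rootdisk info

# Confirm nothing is actually in split-brain (should stay empty here).
gluster volume heal cloud_rootdisk info split-brain
```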


On 10/08/2014 09:09 AM, Marco Marino wrote:
Can someone help me?
I'd like to restore my /export/brick1 on server1. Currently I have data only on server2.
I think the right steps are:
1) setfattr -n ... on server1 (this is a bug; more info here: http://www.joejulian.name/blog/replacing-a-brick-on-glusterfs-340/ ; I have the same error in my logs)
2) At this point I think I can restart the volume, so I should see an automatic healing procedure
3) All data gets replicated back to server1

Can I have confirmation of this procedure? Are any other volumes affected? Please, I cannot lose my data

Thanks

2014-10-03 20:07 GMT+02:00 Marco Marino <marino.mrc@xxxxxxxxx>:
Hi,
I'm trying to use glusterfs with my OpenStack private cloud for storing ephemeral disks. This way, each compute node mounts glusterfs at /nova and stores instances on a remote glusterfs volume (shared between the compute nodes, so live migration is very fast).
I have 2 storage nodes (storage1 and storage2) with replica 2.
In a first configuration I used NFS on the clients. In /etc/fstab of the compute nodes I have:
storage1:/cloud_rootdisk /nova nfs mountproto=tcp,vers=3 0 0

This creates a single point of failure: if storage1 goes down, I have to remount manually on storage2, and this causes complete disk corruption of all the VMs running on all the compute nodes. Really funny...

In a second configuration I used the gluster native client with "backupvolfile-server=storage2". I've run a few tests, and it seems to work.
What I've tested:
On the compute node I have:
mount -t glusterfs -o backupvolfile-server=server2,fetch-attempts=2,log-level=WARNING,log-file=/var/log/gluster.log server1:/test-volume /gluster_mount
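For completeness, the same native-client mount can go in /etc/fstab so it survives a reboot. A sketch using the names from the mount command above (the option set is an assumption; "_netdev" delays mounting until the network is up):

```shell
# /etc/fstab entry for the gluster native client (single line).
server1:/test-volume /gluster_mount glusterfs defaults,_netdev,backupvolfile-server=server2,log-level=WARNING,log-file=/var/log/gluster.log 0 0
```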

Then I booted a VM and started downloading a large file (1 GB) from inside it (so I'm writing to the ephemeral disk stored via glusterfs). During this download I rebooted storage1, and the VM does not seem to be corrupted (so the VM wrote only to storage2).
Can I have confirmation of this? Is this the right way?

Next question:
When I rebooted storage1, it failed to start, telling me that /dev/sdc1 (the partition I'm using for the test) was corrupted. That could be normal behavior, since the server went down during a write. So I booted storage1 in single-user mode and ran xfs_repair /dev/sdc1, which let storage1 start again. (yuppy)
Glusterfs starts correctly, but now I get "brick1 is not connected", where /export/brick1 is the brick I'm using on storage1 for the test volume.
On storage2 all is ok, but I have a split-brain condition. /export/brick1 on storage1 doesn't contain data...
What do I have to do to restore /export/brick1 on storage1???



P.S. Sorry I couldn't help earlier. I had a very busy week out of town.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users
