Re: Brick is not connected (and other funny things)

Marco Marino <marino.mrc@xxxxxxxxx> · Wed, 8 Oct 2014 18:09:29 +0200

Can someone help me? I'd like to restore my /export/brick1 on server1. At the moment I have data only on server2.
I think the right steps are:
1) setfattr -n ... on server1 (this is a bug; more info here -> http://www.joejulian.name/blog/replacing-a-brick-on-glusterfs-340/   I have the same error in my logs)
2) Now I think I can restart the volume, and I should see an automatic healing procedure
3) All data is replicated to server1
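
Steps 2 and 3 could be checked with something like the sketch below. It is only a sketch under assumptions: the volume name test-volume is taken from the mount example in my earlier message, the run and check_and_heal helpers are hypothetical names of my own, and the setfattr step is left out on purpose because its exact arguments are in the linked post. By default it only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
# Hypothetical helper: print (DRY_RUN=1, the default) or run the gluster
# commands for checking brick status and self-heal on one volume.
VOLUME="${VOLUME:-test-volume}"   # assumption: name from the mount example
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

check_and_heal() {
    # Are all bricks of the volume online?
    run gluster volume status "$VOLUME"
    # Ask for a full self-heal, so files missing on the empty brick are copied.
    run gluster volume heal "$VOLUME" full
    # List entries that still need healing.
    run gluster volume heal "$VOLUME" info
    # List entries that are in split-brain.
    run gluster volume heal "$VOLUME" info split-brain
}
```

After sourcing this, calling check_and_heal prints the four commands; with DRY_RUN=0 on one of the storage nodes it would actually run them.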

Can I get confirmation of this procedure? Are other volumes affected? Please, I cannot lose my data.

Thanks

2014-10-03 20:07 GMT+02:00 Marco Marino <marino.mrc@xxxxxxxxx>:
Hi, I'm trying to use glusterfs with my OpenStack private cloud for storing ephemeral disks. This way, each compute node mounts glusterfs on /nova and saves instances on a remote glusterfs volume (shared between the compute nodes, so live migration is very fast).
I have 2 storage nodes (storage1 and storage2) with replica 2.
In the first configuration I used NFS on the clients. In /etc/fstab on the compute nodes I have:
storage1:/cloud_rootdisk /nova nfs mountproto=tcp,vers=3 0 0

This creates a single point of failure, because if storage1 goes down I have to remount manually on storage2. And this causes complete disk corruption of all the VMs running on all the compute nodes. Really funny...

In the second configuration I used the gluster native client with "backupvolfile-server=storage2". I've run only a few tests, but it seems to work.
What I tested:
On the compute node I have:
mount -t glusterfs -o backupvolfile-server=server2,fetch-attempts=2,log-level=WARNING,log-file=/var/log/gluster.log server1:/test-volume /gluster_mount
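
For reference, the same mount could presumably be made persistent in /etc/fstab, like the NFS entry earlier. This is only a sketch, assuming the glusterfs mount helper accepts the same options from fstab; _netdev is my own addition (not part of the command above) so that the mount waits for the network.

```
server1:/test-volume /gluster_mount glusterfs defaults,_netdev,backupvolfile-server=server2,fetch-attempts=2,log-level=WARNING,log-file=/var/log/gluster.log 0 0
```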

Then I booted a VM and started downloading a large file (1GB) from inside the VM (so I'm writing to the ephemeral disk stored on glusterfs). During the download I rebooted storage1, and the VM does not seem to be corrupted (so the VM writes only to storage2).
Can I get confirmation of this? Is this the right way?

Next question:
When I rebooted storage1, it failed to start. It told me that /dev/sdc1 (the partition I'm using for the test) was corrupted. This could be normal behavior, because the server went down during a write. So I started storage1 in single-user mode and ran xfs_repair /dev/sdc1. This let me start storage1. (yippee)
Glusterfs starts correctly, but now I get "brick1 is not connected", where /export/brick1 is the brick I'm using on storage1 for the test volume.
On storage2 everything is OK, but I have a split-brain condition. /export/brick1 on storage1 doesn't contain any data...
What do I have to do to restore /export/brick1 on storage1?

Thanks
MM

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users