Re: Rsync in place of heal after brick failure

Ashish Pandey <aspandey@xxxxxxxxxx> · Tue, 9 Apr 2019 01:23:24 -0400 (EDT)

From: "Poornima Gurusiddaiah" <pgurusid@xxxxxxxxxx>
To: "Tom Fite" <tomfite@xxxxxxxxx>
Cc: "Gluster-users" <gluster-users@xxxxxxxxxxx>
Sent: Tuesday, April 9, 2019 9:53:02 AM
Subject: Re:  Rsync in place of heal after brick failure

On Mon, Apr 8, 2019, 6:31 PM Tom Fite <tomfite@xxxxxxxxx> wrote:
Thanks for the idea, Poornima. Testing shows that xfsdump and xfsrestore are much faster than rsync because they handle small files much better. I don't have extra space to store the dumps, but I was able to figure out how to pipe xfsdump into xfsrestore over ssh. For anyone else who's interested:
On the source machine, run:

xfsdump -J - /dev/mapper/[vg]-[brick] | ssh root@[destination fqdn]  xfsrestore -J - [/path/to/brick]
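
For anyone who wants to review the pipeline before running it, here is a minimal sketch that wraps the command above in a function with a dry-run default. The function name, the DRY_RUN variable, and the three example values are hypothetical placeholders, not anything from Gluster or xfsdump; substitute your own device, host, and brick path.

```shell
#!/bin/sh
# Sketch only: wraps the xfsdump | ssh xfsrestore pipeline shown above.
# The three values below are hypothetical placeholders.
SRC_DEV="/dev/mapper/vg0-brick1"
DEST_HOST="server2.example.com"
DEST_BRICK="/data/brick1"

# migrate_brick SRC_DEV DEST_HOST DEST_BRICK
# With DRY_RUN=1 (the default) the pipeline is only printed.
# With DRY_RUN=0 it is executed; run that as root on the source machine.
migrate_brick() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "xfsdump -J - $1 | ssh root@$2 xfsrestore -J - $3"
    else
        xfsdump -J - "$1" | ssh "root@$2" xfsrestore -J - "$3"
    fi
}

migrate_brick "$SRC_DEV" "$DEST_HOST" "$DEST_BRICK"
```

Running the script as-is only prints the command. Setting DRY_RUN=0 in the environment starts the real transfer.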

That's great. Is it possible for you to write a short summary of this in your blog or in the Gluster blog [1]? The summary would be very helpful for other users as well. It would also help if you could include details on the approaches you explored and the time each would take for the 65 TB of data. Thanks in advance.

We will also see how we could incorporate this in replace brick/offline migration.

[1] https://gluster.github.io/devblog/write-for-gluster

Thanks,
Poornima

-Tom

On Mon, Apr 1, 2019 at 9:56 PM Poornima Gurusiddaiah <pgurusid@xxxxxxxxxx> wrote:
You could also try xfsdump and xfsrestore if your brick filesystem is XFS and the destination disk can be attached locally. This will be much faster.
Regards,
Poornima

On Tue, Apr 2, 2019, 12:05 AM Tom Fite <tomfite@xxxxxxxxx> wrote:
Hi all,
I have a very large (65 TB) brick in a replica 2 volume that needs to be re-copied from scratch. A heal will take a very long time and degrade performance on the volume, so I investigated using rsync to do the brunt of the work.

The command:

rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 /data/brick1/

Running with -H ensures that the hard links in .glusterfs are preserved, and -X preserves all of Gluster's extended attributes.
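
One way to spot-check both properties after the copy is to look at a file's link count and extended attributes on the new brick. This is only a sketch: the function names are made up for illustration, the example path is hypothetical, and it assumes getfattr (from the attr package) is installed and that you run it as root so the trusted.* attributes are visible.

```shell
#!/bin/sh
# Spot-check helpers for a copied brick (illustrative names, not Gluster tools).

# Print the hard-link count of a file. On a brick, a regular file normally
# has at least 2 links: its own name plus its gfid entry under .glusterfs.
link_count() {
    ls -ldn -- "$1" | awk '{print $2}'
}

# Dump all extended attributes in hex, including trusted.gfid.
# Assumes getfattr is installed; run as root to see trusted.* attributes.
show_xattrs() {
    getfattr -d -m . -e hex -- "$1"
}

# Example use (hypothetical path):
#   link_count  /data/brick1/gv0/some/file
#   show_xattrs /data/brick1/gv0/some/file
```

Comparing the output for the same file on the source and destination bricks should show the same trusted.gfid value if -X did its job.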

I've tested this on my test environment as follows:

1. Stop glusterd and kill procs
2. Move brick volume to backup dir
3. Run rsync
4. Start glusterd
5. Observe gluster status
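
The five steps above can be sketched as a script. Treat this as an outline under stated assumptions rather than a tested procedure: it assumes glusterd is managed by systemd, that "kill procs" means the glusterfsd brick processes (the steps do not say exactly which processes were killed), and the backup directory name is hypothetical. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the five test steps. DRY_RUN=1 (the default) only prints each
# command; set DRY_RUN=0 to execute them as root on the destination server.
DRY_RUN="${DRY_RUN:-1}"
BRICK_PARENT="/data/brick1"                      # from the rsync command above
BRICK_NAME="gv0"                                 # from the rsync command above
SOURCE_HOST="server1"                            # from the rsync command above
BACKUP_DIR="${BRICK_PARENT}/${BRICK_NAME}.bak"   # hypothetical name

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

resync_brick() {
    # 1. Stop glusterd and kill procs (assumption: systemd, brick processes)
    run systemctl stop glusterd
    run pkill glusterfsd
    # 2. Move brick volume to backup dir
    run mv "${BRICK_PARENT}/${BRICK_NAME}" "$BACKUP_DIR"
    # 3. Run rsync, preserving hard links (-H) and extended attributes (-X)
    run rsync -av -H -X --numeric-ids --progress \
        "${SOURCE_HOST}:${BRICK_PARENT}/${BRICK_NAME}" "${BRICK_PARENT}/"
    # 4. Start glusterd
    run systemctl start glusterd
    # 5. Observe gluster status
    run gluster volume status
}

resync_brick
```

The script does not stop on errors, so in a real run it is worth checking the result of each step before moving on to the next.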

I just want to add one step to quickly test this: you can kill the other brick, the one you did not touch, and then try to access your volume. This ensures that all file operations land on the new brick, so you can see whether everything is accessible.
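
Once the untouched brick is down, a quick way to check accessibility is to read the first byte of every file through a client mount. The helper below is a rough sketch: the function name and the example mount point are hypothetical, and because it starts one process per file it is better suited to a test volume or a subdirectory than to the full 65 TB.

```shell
#!/bin/sh
# Count regular files under a directory whose first byte cannot be read.
# Prints 0 when every file found was readable. Illustrative helper only,
# not a Gluster tool.
count_unreadable() {
    find "$1" -type f ! -exec sh -c 'head -c 1 "$1" >/dev/null 2>&1' _ {} \; -print \
        | wc -l | tr -d ' '
}

# Example use (hypothetical mount point of the volume on a client):
#   count_unreadable /mnt/gv0
```

The PID of each brick process is listed by "gluster volume status", and "gluster volume start gv0 force" should bring the killed brick back afterwards (assuming the volume is named gv0, as the brick path suggests).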

All appears to be working correctly. Gluster status reports all bricks online, all data is accessible in the volume, and I don't see any errors in the logs.

Anybody else have experience trying this?

Thanks
-Tom
_______________________________________________
 Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
