Re: What is the recommended backup strategy for GlusterFS?

Aravinda,
 
I was testing glusterfind and wondering if you could provide some feedback.
 
My system is RHEL 7.1 and I am using Gluster 3.7.5. My test setup is a single brick with the parameters shown below.
I was testing glusterfind by copying over my source code (~140,000 files) and then running 'glusterfind pre'. The result was that "glusterfind pre" took over an hour to process these 140,000 files and sat at 100% CPU utilization for the entire run. Is this expected, and is this the expected rate at which "glusterfind pre" processes files?
 
The reason I am asking is that my production Gluster system sees approximately 2 million file changes per day. At this pace, glusterfind cannot process the changes fast enough to keep up.
 
I also went back and tested file deletion by removing this directory. Looking at the /usr/var/lib/misc/glusterfsd/glusterfind/backup/gfs/tmp_output_0 file, it looks like glusterfind is only processing about 1,000 files per hour for deletions.
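Back-of-the-envelope (assuming the pre run took roughly an hour): 140,000 files per hour is about 39 files per second, while 2 million changes per day averages about 23 per second, which leaves very little headroom; and 1,000 deletions per hour is under 0.3 per second, far below what we would need.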
 
 
[root@ff01bkp gfs]# gluster volume info
Volume Name: gfs
Type: Distribute
Volume ID: 7bbdfcf8-1801-4a2a-9233-0a3261cbcba7
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: ffib01bkp:/data/brick01/gfs
Options Reconfigured:
diagnostics.client-log-level: WARNING
diagnostics.brick-log-level: WARNING
server.allow-insecure: on
performance.readdir-ahead: on
storage.build-pgfid: on
changelog.changelog: on
changelog.capture-del-path: on
changelog.rollover-time: 90
changelog.fsync-interval: 30
client.event-threads: 8
server.event-threads: 8
 
------ Original Message ------
From: "Aravinda" <avishwan@xxxxxxxxxx>
To: "Mathieu Chateau" <mathieu.chateau@xxxxxxx>; "M S Vishwanath Bhat" <msvbhat@xxxxxxxxx>
Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
Sent: 9/7/2015 2:02:09 AM
Subject: Re: What is the recommended backup strategy for GlusterFS?
 
We have one more tool. glusterfind!

This tool comes with the Gluster installation if you are using Gluster 3.7. glusterfind enables changelogging (a journal) on the Gluster volume and uses that information to detect the changes that happened in the volume.

1. Create a glusterfind session using: glusterfind create <SESSION_NAME> <VOLUME_NAME>
2. Do a full backup.
3. Run the glusterfind pre command to generate an output file listing the changes that happened in the Gluster volume after glusterfind create. For usage information, see glusterfind pre --help.
4. Consume that output file and back up only the files listed in it.
5. After consuming the output file, run the glusterfind post command (see glusterfind post --help). A sketch of the full cycle follows below.
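A minimal sketch of one incremental-backup cycle (session name, volume name, and output path are placeholders):

    glusterfind create mysession myvol
    # take the initial full backup here
    glusterfind pre mysession myvol /root/myvol-changes.txt
    # back up only the files listed in /root/myvol-changes.txt
    glusterfind post mysession myvol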

Doc link: http://gluster.readthedocs.org/en/latest/GlusterFS%20Tools/glusterfind/index.html

This tool is newly released with Gluster 3.7; please report issues or request features here: https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS

Regards,
Aravinda
On 09/06/2015 12:37 AM, Mathieu Chateau wrote:
Hello,

For my needs, it's about having a simple "photo" of the files as they were 5 days ago, for example.
But I do not want to store the file data twice, as most files didn't change.
Using snapshots is convenient of course, but it's risky, as you lose both the data and the snapshots in case of failure (snapshots only contain delta blocks).
Rsync with hard links is more resilient (the inode stays until the last reference is removed).
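For example, a minimal sketch of that rotation (paths and dates are placeholders); rsync's --link-dest option hard-links files that are unchanged relative to the previous day's copy, so only changed files consume new space:

    # day N: new tree, unchanged files hard-linked against day N-1
    rsync -a --delete --link-dest=/backup/2015-09-04 /mnt/gfs/ /backup/2015-09-05/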

But I'm interested to hear about production setups relying on it.

Regards,
Mathieu CHATEAU
http://www.lotp.fr

2015-09-05 21:03 GMT+02:00 M S Vishwanath Bhat <msvbhat@xxxxxxxxx>:

MS
On 5 Sep 2015 12:57 am, "Mathieu Chateau" <mathieu.chateau@xxxxxxx> wrote:
>
> Hello,
>
> so far I use rsnapshot. This script does rsync with rotation, and most importantly, identical files are stored only once through hard links (inodes). I save space, but rsync still needs to scan all folders to find new files.
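> A minimal sketch of the relevant rsnapshot.conf entries (values are illustrative; fields must be separated by tabs, and older versions use the keyword "interval" instead of "retain"):
>
>     snapshot_root   /backup/snapshots/
>     retain  daily   7
>     retain  weekly  4
>     backup  /mnt/gfs/       gfs/
>
> A cron job running "rsnapshot daily" then does the rsync-with-hard-links rotation.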
>
> I am also interested in solution 1), but the snapshots need to be stored on distinct drives/servers. We can't afford to lose both the data and the snapshots in case of human error or disaster.
>
>
>
> Regards,
> Mathieu CHATEAU
> http://www.lotp.fr
>
> 2015-09-03 13:05 GMT+02:00 Merlin Morgenstern <merlin.morgenstern@xxxxxxxxx>:
>>
>> I have about 1M files in a GlusterFS volume with replica 2 on 3 nodes running gluster 3.7.3.
>>
>> What would be a recommended automated backup strategy for this setup?
>>
>> I already considered the following:

Have you considered GlusterFS geo-replication? It's actually meant for disaster recovery, but it might suit your backup use case as well.
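A rough sketch of the setup (host and volume names are placeholders; the slave volume must already exist and passwordless SSH to the slave must be configured first):

    gluster volume geo-replication myvol backupnode::backupvol create push-pem
    gluster volume geo-replication myvol backupnode::backupvol start
    gluster volume geo-replication myvol backupnode::backupvol status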

My two cents

//MS

>>
>> 1) glusterfs snapshots in combination with dd. This unfortunately has not been possible so far, as I could not find any info on how to make an image file out of the snapshots or how to automate the snapshot procedure.
>>
>> 2) rsync the mounted file share to a second directory, then tar the entire directory after the rsync completes
>>
>> 3) a combination of 1 and 2: take a snapshot that gets mounted automatically, then rsync from there. Problem: how to automate snapshots and how to know the mount path.
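>> (Roughly what I was hoping for, assuming the snapshot CLI supports it: "gluster snapshot create nightly <volname>" to take the snapshot, "gluster snapshot activate <snapname>" to activate it, then "mount -t glusterfs <node>:/snaps/<snapname>/<volname> /mnt/snap" and rsync from there.)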
>>
>> Currently I am only able to do the second option, but the first option seems the most attractive.
>>
>> Thank you for any help on this.
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users@xxxxxxxxxxx
>> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@xxxxxxxxxxx
> http://www.gluster.org/mailman/listinfo/gluster-users




_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users

