Re: Bareos backup from Gluster mount

On Wed, Jul 29, 2015 at 5:17 PM Michael Mol <mikemol@xxxxxxxxx> wrote:
On Mon, Jul 27, 2015 at 5:03 PM Ryan Clough <ryan.clough@xxxxxxxx> wrote:
Hello,

I have cross-posted this question in the bareos-users mailing list.

Wondering if anyone has tried this, because I am unable to back up data that is mounted via Gluster FUSE or Gluster NFS. Basically, I have the Gluster volume mounted on the Bareos Director, which also has the tape changer attached.

Here is some information about versions:
Bareos version 14.2.2
Gluster version 3.7.2
Scientific Linux version 6.6

Our Gluster volume consists of two nodes in distribute only. Here is the configuration of our volume:
[root@hgluster02 ~]# gluster volume info
 
Volume Name: export_volume
Type: Distribute
Volume ID: c74cc970-31e2-4924-a244-4c70d958dadb
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: hgluster01:/gluster_data
Brick2: hgluster02:/gluster_data
Options Reconfigured:
performance.io-thread-count: 24
server.event-threads: 20
client.event-threads: 4
performance.readdir-ahead: on
features.inode-quota: on
features.quota: on
nfs.disable: off
auth.allow: 192.168.10.*,10.0.10.*,10.8.0.*,10.2.0.*,10.0.60.*
server.allow-insecure: on
server.root-squash: on
performance.read-ahead: on
features.quota-deem-statfs: on
diagnostics.brick-log-level: WARNING

When I try to back up a directory from a Gluster FUSE or Gluster NFS mount and monitor the network traffic, I only see data being pulled from the hgluster01 brick. When the job finishes, Bareos thinks that it completed without error, but the messages for the job include lots and lots of permission denied errors like this:
15-Jul 02:03 ripper.red.dsic.com-fd JobId 613:      Cannot open "/export/rclough/psdv-2014-archives-2/scan_111.tar.bak": ERR=Permission denied.
15-Jul 02:03 ripper.red.dsic.com-fd JobId 613:      Cannot open "/export/rclough/psdv-2014-archives-2/run_219.tar.bak": ERR=Permission denied.
15-Jul 02:03 ripper.red.dsic.com-fd JobId 613:      Cannot open "/export/rclough/psdv-2014-archives-2/scan_112.tar.bak": ERR=Permission denied.
15-Jul 02:03 ripper.red.dsic.com-fd JobId 613:      Cannot open "/export/rclough/psdv-2014-archives-2/run_220.tar.bak": ERR=Permission denied.
15-Jul 02:03 ripper.red.dsic.com-fd JobId 613:      Cannot open "/export/rclough/psdv-2014-archives-2/scan_114.tar.bak": ERR=Permission denied.
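To get a feel for how many files were skipped, the job messages can be scanned for lines like the ones above. This is only a minimal sketch in Python, assuming the message format shown here; the function name `denied_paths` is made up for illustration and is not part of Bareos.

```python
import re

# Matches job message lines of the form shown above:
#   ... JobId 613:      Cannot open "<path>": ERR=Permission denied.
DENIED_RE = re.compile(
    r'JobId (\d+):\s+Cannot open "(.+)": ERR=Permission denied'
)


def denied_paths(log_text):
    """Return a list of (job_id, path) for every permission-denied line."""
    results = []
    for line in log_text.splitlines():
        match = DENIED_RE.search(line)
        if match:
            results.append((int(match.group(1)), match.group(2)))
    return results


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pipe the saved job messages into the script.
    hits = denied_paths(sys.stdin.read())
    for job_id, path in hits:
        print(job_id, path)
    print("files skipped:", len(hits))
```

Any non-empty result could then be treated as a failed job by whatever wraps the backup, regardless of the status the job itself reports.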

At first I thought this might be a root-squash problem but, if I try to read/copy a file using the root user from the Bareos server that is trying to do the backup, I can read files just fine.
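The manual read test can be repeated over a whole directory tree. A rough sketch in Python, assuming it is run as the same user the Bareos file daemon runs as; the function name `unreadable_files` is only illustrative.

```python
import os


def unreadable_files(root):
    """Walk root and return (path, error) for each file that cannot be read."""
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    handle.read(1)  # force a real read, not just an open
            except OSError as exc:
                failures.append((path, exc.strerror))
    return failures


if __name__ == "__main__":
    import sys

    # Usage: pass the directory to check, e.g. the mounted backup source.
    for path, error in unreadable_files(sys.argv[1]):
        print(path, error)
```

Note that `os.walk` silently skips directories it cannot list unless an `onerror` callback is supplied, so an empty result only covers the directories that were actually traversed.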

When the job finishes, it reports that it finished "OK -- with warnings" but, again, the log for the job is filled with "ERR=Permission denied" messages. In my opinion, this job did not finish OK and should be marked Failed. Some of the files from the hgluster02 brick are backed up, but none of the ones with permission errors are. When I restore the job, all of the files with permission errors are empty.

Has anyone successfully used Bareos to back up data from Gluster mounts? This is an important use case for us because this is our largest single volume, which we use to prepare large amounts of data for archiving.

Thank you for your time,

How did I not see this earlier? I'm seeing a very similar problem.  I just posted this to the bareos-user list:

Help! I've run out of know-how while trying to fix this myself...

Environment: CentOS 7, x86_64
Bareos version: 14.2.2-46.1.el7 (via http://download.bareos.org/bareos/release/14.2/CentOS_7/ repo)

Symptom: Bareos attempts to mount a volume, and spits back a Permission Denied error, as though it didn't have permission to access the relevant file.

I've been seeing this at least since Gluster version 3.7.2. I updated to that version because I needed to expand my backend storage, and 3.7.1 (which otherwise worked fine) had a bug that broke bricks while rebalancing.

I've verified that the bareos storage daemon is running as the bareos user, and I've also, by way of FUSE mount into the gluster volume, verified ownership of the volume:

# ls -l Email-Incremental-0155 
-rw-r-----. 1 bareos bareos 1073728379 Jun 10 21:04 Email-Incremental-0155

And uid/gid, for reference:

# ls -ln Email-Incremental-0155 
-rw-r-----. 1 997 995 1073728379 Jun 10 21:04 Email-Incremental-0155

And in the gluster volume, the storage owner-{uid,gid}:
# gluster volume info bareos

Volume Name: bareos
Type: Distribute
Volume ID: f4cb7aac-3631-41cc-9afa-f182a514d116
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: backup-stor-1[censored]:/var/gluster/bareos/brick-bareos
Brick2: backup-stor-2[censored]:/var/gluster/bareos/brick-bareos
Options Reconfigured:
server.allow-insecure: on
performance.readdir-ahead: off
nfs.disable: on
performance.cache-size: 128MB
performance.write-behind-window-size: 256MB
performance.cache-refresh-timeout: 10
performance.io-thread-count: 16
performance.cache-max-file-size: 4TB
performance.flush-behind: on
performance.client-io-threads: on
storage.owner-uid: 997
storage.owner-gid: 995
features.bitrot: off
features.scrub: Inactive
features.scrub-freq: daily
features.scrub-throttle: lazy
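The comparison between the volume's storage.owner-{uid,gid} options and the ownership of a volume file can be scripted. A small sketch in Python, assuming the `gluster volume info` output has been captured as text in the "key: value" layout shown above; `parse_volume_info` and `owner_matches` are illustrative names, not Gluster tooling. It only checks ownership as seen through a mount and does not exercise the gluster:// device path the storage daemon uses.

```python
import os


def parse_volume_info(text):
    """Turn 'key: value' lines of `gluster volume info` output into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only; brick values contain colons too.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info


def owner_matches(info, path):
    """True if path's uid/gid equal the volume's storage.owner-uid/gid."""
    st = os.stat(path)
    return (
        str(st.st_uid) == info.get("storage.owner-uid")
        and str(st.st_gid) == info.get("storage.owner-gid")
    )
```

For the volume above, `parse_volume_info(output)["storage.owner-uid"]` gives the string "997", which `owner_matches` compares against the numeric uid that `ls -ln` prints.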

In this run, the storage daemon and the file daemon happen to be on the same node. Here's trace output at level 200, obtained running "tail -f *.trace" in bareos-sd's cwd:

==> backup-director-sd.trace <==
backup-director-sd: fd_cmds.c:219-0 <filed: append open session
backup-director-sd: fd_cmds.c:303-0 Append open session: append open session
backup-director-sd: fd_cmds.c:314-0 >filed: 3000 OK open ticket = 1
backup-director-sd: fd_cmds.c:219-0 <filed: append data 1
backup-director-sd: fd_cmds.c:265-0 Append data: append data 1
backup-director-sd: fd_cmds.c:267-0 <filed: append data 1
backup-director-sd: append.c:69-0 Start append data. res=1
backup-director-sd: acquire.c:369-0 acquire_append device is disk
backup-director-sd: acquire.c:404-0 jid=924 Do mount_next_write_vol
backup-director-sd: mount.c:71-0 Enter mount_next_volume(release=0) dev="GlusterStorage4" (gluster://backup-stor-1[censored]/bareos/bareos)
backup-director-sd: mount.c:84-0 mount_next_vol retry=0
backup-director-sd: mount.c:604-0 No swap_dev set
backup-director-sd: askdir.c:246-0 >dird CatReq Job=server2-email.2015-07-29_16.32.34_09 GetVolInfo VolName=Email-Incremental-0155 write=1
backup-director-sd: askdir.c:175-0 <dird 1000 OK VolName=Email-Incremental-0155 VolJobs=0 VolFiles=0 VolBlocks=0 VolBytes=1 VolMounts=3 VolErrors=0 VolWrites=16646 MaxVolBytes=1073741824 VolCapacityBytes=0 VolStatus=Recycle Slot=0 MaxVolJobs=0 MaxVolFiles=0 InChanger=0 VolReadTime=0 VolWriteTime=8455280 EndFile=0 EndBlock=1073728378 LabelType=0 MediaId=156 EncryptionKey= MinBlocksize=0 MaxBlocksize=0
backup-director-sd: askdir.c:211-0 do_get_volume_info return true slot=0 Volume=Email-Incremental-0155, VolminBlocksize=0 VolMaxBlocksize=0
backup-director-sd: askdir.c:213-0 setting dcr->VolMinBlocksize(0) to vol.VolMinBlocksize(0)
backup-director-sd: askdir.c:215-0 setting dcr->VolMaxBlocksize(0) to vol.VolMaxBlocksize(0)
backup-director-sd: mount.c:122-0 After find_next_append. Vol=Email-Incremental-0155 Slot=0
backup-director-sd: autochanger.c:99-0 Device "GlusterStorage4" (gluster://backup-stor-1[censored]/bareos/bareos) is not an autochanger
backup-director-sd: mount.c:144-0 autoload_dev returns 0
backup-director-sd: mount.c:175-0 want vol=Email-Incremental-0155 devvol= dev="GlusterStorage4" (gluster://backup-stor-1[censored]/bareos/bareos)
backup-director-sd: dev.c:536-0 open dev: type=5 dev_name="GlusterStorage4" (gluster://backup-stor-1[censored]/bareos/bareos) vol=Email-Incremental-0155 mode=OPEN_READ_WRITE
backup-director-sd: dev.c:540-0 call open_device mode=OPEN_READ_WRITE
backup-director-sd: dev.c:941-0 Enter mount
backup-director-sd: dev.c:610-0 open disk: mode=OPEN_READ_WRITE open(gluster://backup-stor-1[censored]/bareos/bareos/Email-Incremental-0155, 0x2, 0640)

==> backup-director-fd.trace <==

==> backup-director-sd.trace <==
backup-director-sd: dev.c:617-0 open failed: dev.c:616 Could not open: gluster://backup-stor-1[censored]/bareos/bareos/Email-Incremental-0155, ERR=Permission denied
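At debug level 200 the trace files grow quickly, so it can help to pull out just the failed opens. A minimal sketch in Python, assuming the "open failed: ... Could not open: <target>, ERR=<reason>" layout of the trace line above; `failed_opens` is a made-up helper name, not a Bareos tool.

```python
import re

# Matches trace lines such as:
#   ... open failed: dev.c:616 Could not open: <target>, ERR=<reason>
OPEN_FAILED_RE = re.compile(
    r"open failed: \S+ Could not open: (\S+), ERR=(.+?)\s*$"
)


def failed_opens(trace_text):
    """Return a list of (target, reason) for each failed open in the trace."""
    results = []
    for line in trace_text.splitlines():
        match = OPEN_FAILED_RE.search(line)
        if match:
            results.append((match.group(1), match.group(2)))
    return results
```

Feeding it the contents of a `.trace` file yields the device URL and the error string for each failure, which makes it easier to line the timestamps up against the brick logs.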

In response to Pranith's suggestion to Ryan to look at logs, I did find this interesting in root-bareos.log when I FUSE-mounted the volume. (Interesting, because everything is running the same version of gluster, at least as far as packages are telling me.)

==> root-bareos.log <==
[2015-07-29 21:26:39.465191] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-bareos-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-07-29 21:26:39.465737] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-bareos-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-07-29 21:26:39.465935] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-bareos-client-1: Connected to bareos-client-1, attached to remote volume '/var/gluster/bareos/brick-bareos'.
[2015-07-29 21:26:39.465999] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-bareos-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2015-07-29 21:26:39.466319] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-bareos-client-0: Connected to bareos-client-0, attached to remote volume '/var/gluster/bareos/brick-bareos'.
[2015-07-29 21:26:39.466344] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-bareos-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2015-07-29 21:26:39.471772] I [fuse-bridge.c:5053:fuse_graph_setup] 0-fuse: switched to graph 0
[2015-07-29 21:26:39.471953] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-bareos-client-1: Server lk version = 1
[2015-07-29 21:26:39.472000] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-bareos-client-0: Server lk version = 1
[2015-07-29 21:26:39.473230] I [fuse-bridge.c:3979:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22

On both bricks, there's this or similar, but the timestamps don't correlate with the bareos errors:

The message "W [MSGID: 101095] [xlator.c:143:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/3.7.3/xlator/features/bitrot.so: cannot open shared object file: No such file or directory" repeated 3 times between [2015-07-29 19:50:34.593333] and [2015-07-29 19:50:34.593486]
