Re: trashcan on dist. repl. volume with geo-replication

Hi Kotresh,

thanks for your response...
answers inside...

best regards
Dietmar


On 13.03.2018 at 06:38, Kotresh Hiremath Ravishankar wrote:
Hi Dietmar,

I am trying to understand the problem and have a few questions.

1. Is the trashcan enabled only on the master volume?
no, the trashcan is also enabled on the slave. the settings are the same as on the master, but the trashcan on the slave is completely empty.
root@gl-node5:~# gluster volume get mvol1 all | grep -i trash
features.trash                          on                                     
features.trash-dir                      .trashcan                              
features.trash-eliminate-path           (null)                                 
features.trash-max-filesize             2GB                                    
features.trash-internal-op              off                                    
root@gl-node5:~#

2. Was the 'rm -rf' done on the master volume synced to the slave?
yes, the entire content of ~/test1/b1/* on the slave has been removed.
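checked by listing the directory on a slave-side client mount, e.g. (the mount point here is just a placeholder):

ls -la /mnt/mvol1-slave/test1/b1/    # empty on the slave as well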
3. If the trashcan is disabled, does the issue go away?

after disabling features.trash on master and slave the issue remains... stopping and restarting the master/slave volumes and the geo-replication has no effect (the sequence used is sketched below, after the status output).
root@gl-node1:~# gluster volume geo-replication mvol1 gl-node5-int::mvol1 status
 
MASTER NODE     MASTER VOL    MASTER BRICK     SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                 
----------------------------------------------------------------------------------------------------------------------------------------------------
gl-node1-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    N/A             Faulty     N/A                N/A                         
gl-node3-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    gl-node7-int    Passive    N/A                N/A                         
gl-node2-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    N/A             Faulty     N/A                N/A                         
gl-node4-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    gl-node8-int    Active     Changelog Crawl    2018-03-12 13:56:28         
root@gl-node1:~#
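
for reference, the disable and restart sequence was roughly the following (a sketch using the standard gluster CLI; features.trash was set on both the master and the slave volume):

gluster volume set mvol1 features.trash off
gluster volume geo-replication mvol1 gl-node5-int::mvol1 stop
gluster volume stop mvol1
gluster volume start mvol1
gluster volume geo-replication mvol1 gl-node5-int::mvol1 start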

The geo-rep error just says that it failed to create the directory "Oracle_VM_VirtualBox_Extension" on the slave.
Usually this would be because of a gfid mismatch, but I don't see that in your case. So I am a little more interested
in the present state of the geo-rep. Is it still throwing the same errors and the same failure to sync the same directory? If
so, does the parent 'test1/b1' exist on the slave?
it is still throwing the same error as shown below.
the directory 'test1/b1' is empty as expected and exists on master and slave.



And doing ls on the trashcan should not affect geo-rep. Is there an easy reproducer for this?
i have run several tests on 3.10.11 and 3.12.6 and i'm pretty sure there was one without activation of the trashcan feature on the slave... with the same / similar problems.
i will come back with a more comprehensive and reproducible description of that issue...
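in the meantime, a minimal reproducer along the lines of what was done here would be (assuming the master volume is mounted at /myvol-1 and a geo-rep session is established; /some/data is just a placeholder):

mkdir -p /myvol-1/test1/b1
cp -r /some/data/* /myvol-1/test1/b1/          # populate with files and subfolders
rm -rf /myvol-1/test1/b1/*                     # delete everything, it lands in the trashcan
ls -la /myvol-1/.trashcan/test1/b1/            # listing the trashcan triggers the problem
gluster volume geo-replication mvol1 gl-node5-int::mvol1 status    # session goes Faulty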



Thanks,
Kotresh HR

On Mon, Mar 12, 2018 at 10:13 PM, Dietmar Putz <dietmar.putz@xxxxxxxxx> wrote:
Hello,

in regard to
https://bugzilla.redhat.com/show_bug.cgi?id=1434066
i have run into another issue when using the trashcan feature on a dist. repl. volume running a geo-replication (gfs 3.12.6 on ubuntu 16.04.4).
e.g. removing an entire directory with subfolders:
tron@gl-node1:/myvol-1/test1/b1$ rm -rf *

afterwards, listing the files in the trashcan:
tron@gl-node1:/myvol-1/test1$ ls -la /myvol-1/.trashcan/test1/b1/

leads to an outage of the geo-replication.
error on master-01 and master-02:

[2018-03-12 13:37:14.827204] I [master(/brick1/mvol1):1385:crawl] _GMaster: slave's time stime=(1520861818, 0)
[2018-03-12 13:37:14.835535] E [master(/brick1/mvol1):784:log_failures] _GMaster: ENTRY FAILED    data=({'uid': 0, 'gfid': 'c38f75e3-194a-4d22-9094-50ac8f8756e7', 'gid': 0, 'mode': 16877, 'entry': '.gfid/5531bd64-ac50-462b-943e-c0bf1c52f52c/Oracle_VM_VirtualBox_Extension', 'op': 'MKDIR'}, 2, {'gfid_mismatch': False, 'dst': False})
[2018-03-12 13:37:14.835911] E [syncdutils(/brick1/mvol1):299:log_raise_exception] <top>: The above directory failed to sync. Please fix it to proceed further.


the gfids of both directories as shown in the log:
brick1/mvol1/.trashcan/test1/b1 0x5531bd64ac50462b943ec0bf1c52f52c
brick1/mvol1/.trashcan/test1/b1/Oracle_VM_VirtualBox_Extension 0xc38f75e3194a4d22909450ac8f8756e7
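
for reference, the gfid of a directory can be read directly on a brick from its trusted.gfid xattr, e.g.:

getfattr -n trusted.gfid -e hex /brick1/mvol1/.trashcan/test1/b1
getfattr -n trusted.gfid -e hex /brick1/mvol1/.trashcan/test1/b1/Oracle_VM_VirtualBox_Extension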

the directory shown contains just one file, which is stored on gl-node3 and gl-node4, while node1 and node2 are in geo-replication error.
since the filesize limitation of the trashcan is obsolete, i'm really interested in using the trashcan feature, but i'm concerned it will interrupt the geo-replication entirely.
has anybody else been faced with this situation... any hints, workarounds...?

best regards
Dietmar Putz


root@gl-node1:~/tmp# gluster volume info mvol1

Volume Name: mvol1
Type: Distributed-Replicate
Volume ID: a1c74931-568c-4f40-8573-dd344553e557
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gl-node1-int:/brick1/mvol1
Brick2: gl-node2-int:/brick1/mvol1
Brick3: gl-node3-int:/brick1/mvol1
Brick4: gl-node4-int:/brick1/mvol1
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
features.trash-max-filesize: 2GB
features.trash: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

root@gl-node1:/myvol-1/test1# gluster volume geo-replication mvol1 gl-node5-int::mvol1 config
special_sync_mode: partial
gluster_log_file: /var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.gluster.log
ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
change_detector: changelog
use_meta_volume: true
session_owner: a1c74931-568c-4f40-8573-dd344553e557
state_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.status
gluster_params: aux-gfid-mount acl
remote_gsyncd: /nonexistent/gsyncd
working_dir: /var/lib/misc/glusterfsd/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1
state_detail_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-detail.status
gluster_command_dir: /usr/sbin/
pid_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.pid
georep_session_working_dir: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/
ssh_command_tar: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem
master.stime_xattr_name: trusted.glusterfs.a1c74931-568c-4f40-8573-dd344553e557.d62bda3a-1396-492a-ad99-7c6238d93c6a.stime
changelog_log_file: /var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-changes.log
socketdir: /var/run/gluster
volume_id: a1c74931-568c-4f40-8573-dd344553e557
ignore_deletes: false
state_socket_unencoded: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.socket
log_file: /var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.log
access_mount: true
root@gl-node1:/myvol-1/test1#




--
Thanks and Regards,
Kotresh H R

-- 
Dietmar Putz
3Q GmbH
Kurfürstendamm 102
D-10711 Berlin
 
Mobile:   +49 171 / 90 160 39
Mail:     dietmar.putz@xxxxxxxxx
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
