I changed the replication to use the root user and re-created it with "create force". Now the files and folders are replicated, and the permission denied and "New folder" errors have disappeared, but the old files are not deleted.
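(For reference, a sketch of how such a root re-creation is typically done; the exact options, e.g. push-pem, are assumptions, and the slave_1 endpoint is taken from the status output below:)

[root@master ~]# gluster volume geo-replication glustervol1 slave_1::glustervol1 create push-pem force
[root@master ~]# gluster volume geo-replication glustervol1 slave_1::glustervol1 start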
Looks like the history crawl is in some kind of a loop:

[root@master ~]# gluster volume geo-replication status

MASTER NODE    MASTER VOL     MASTER BRICK            SLAVE USER    SLAVE                         SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
------------------------------------------------------------------------------------------------------------------------------------------------------
master         glustervol1    /bricks/brick1/brick    root          ssh://slave_1::glustervol1    slave_1       Active    Hybrid Crawl    N/A
master         glustervol1    /bricks/brick1/brick    root          ssh://slave_2::glustervol1    slave_2       Active    Hybrid Crawl    N/A
master         glustervol1    /bricks/brick1/brick    root          ssh://slave_3::glustervol1    slave_3       Active    Hybrid Crawl    N/A

tail -f /var/log/glusterfs/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.log

  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 104, in cl_history_changelog
    raise ChangelogHistoryNotAvailable()
ChangelogHistoryNotAvailable
[2018-09-25 14:10:44.196011] E [repce(worker /bricks/brick1/brick):197:__call__] RepceClient: call failed call=29945:139700517484352:1537884644.19 method=history error=ChangelogHistoryNotAvailable
[2018-09-25 14:10:44.196405] I [resource(worker /bricks/brick1/brick):1295:service_loop] GLUSTER: Changelog history not available, using xsync
[2018-09-25 14:10:44.221385] I [master(worker /bricks/brick1/brick):1623:crawl] _GMaster: starting hybrid crawl stime=(0, 0)
[2018-09-25 14:10:44.223382] I [gsyncdstatus(worker /bricks/brick1/brick):249:set_worker_crawl_status] GeorepStatus: Crawl Status Change status=Hybrid Crawl
[2018-09-25 14:10:46.225296] I [master(worker /bricks/brick1/brick):1634:crawl] _GMaster: processing xsync changelog path=/var/lib/misc/gluster/gsyncd/glustervol1_slave_3_glustervol1/bricks-brick1-brick/xsync/XSYNC-CHANGELOG.1537884644
[2018-09-25 14:13:36.157408] I [gsyncd(config-get):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf
[2018-09-25 14:13:36.377880] I [gsyncd(status):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf
[2018-09-25 14:31:10.145035] I [master(worker /bricks/brick1/brick):1944:syncjob] Syncer: Sync Time Taken duration=1212.5316 num_files=1474 job=2 return_code=11
[2018-09-25 14:31:10.152637] E [syncdutils(worker /bricks/brick1/brick):801:errlog] Popen: command returned error cmd=rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --xattrs --acls . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-gg758Z/caec4d1d03cc28bc1853f692e291164f.sock slave_3:/proc/15919/cwd error=11
[2018-09-25 14:31:10.237371] I [repce(agent /bricks/brick1/brick):80:service_loop] RepceServer: terminating on reaching EOF.
[2018-09-25 14:31:10.430820] I [gsyncdstatus(monitor):244:set_worker_status] GeorepStatus: Worker Status Change status=Faulty
[2018-09-25 14:31:20.541475] I [monitor(monitor):158:monitor] Monitor: starting gsyncd worker brick=/bricks/brick1/brick slave_node=slave_3
[2018-09-25 14:31:20.806518] I [gsyncd(agent /bricks/brick1/brick):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf
[2018-09-25 14:31:20.816536] I [changelogagent(agent /bricks/brick1/brick):72:__init__] ChangelogAgent: Agent listining...
[2018-09-25 14:31:20.821574] I [gsyncd(worker /bricks/brick1/brick):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf
[2018-09-25 14:31:20.882128] I [resource(worker /bricks/brick1/brick):1377:connect_remote] SSH: Initializing SSH connection between master and slave...
[2018-09-25 14:31:24.169857] I [resource(worker /bricks/brick1/brick):1424:connect_remote] SSH: SSH connection between master and slave established. duration=3.2873
[2018-09-25 14:31:24.170401] I [resource(worker /bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume locally...
[2018-09-25 14:31:25.354633] I [resource(worker /bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster volume duration=1.1839
[2018-09-25 14:31:25.355073] I [subcmds(worker /bricks/brick1/brick):70:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2018-09-25 14:31:27.439034] I [master(worker /bricks/brick1/brick):1593:register] _GMaster: Working dir path=/var/lib/misc/gluster/gsyncd/glustervol1_slave_3_glustervol1/bricks-brick1-brick
[2018-09-25 14:31:27.441847] I [resource(worker /bricks/brick1/brick):1282:service_loop] GLUSTER: Register time time=1537885887
[2018-09-25 14:31:27.465053] I [gsyncdstatus(worker /bricks/brick1/brick):277:set_active] GeorepStatus: Worker Status Change status=Active
[2018-09-25 14:31:27.471021] I [gsyncdstatus(worker /bricks/brick1/brick):249:set_worker_crawl_status] GeorepStatus: Crawl Status Change status=History Crawl
[2018-09-25 14:31:27.471484] I [master(worker /bricks/brick1/brick):1507:crawl] _GMaster: starting history crawl turns=1 stime=(0, 0) entry_stime=None etime=1537885887
[2018-09-25 14:31:27.472564] E [repce(agent /bricks/brick1/brick):105:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 101, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 53, in history
    num_parallel)
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 104, in cl_history_changelog
    raise ChangelogHistoryNotAvailable()
ChangelogHistoryNotAvailable
[2018-09-25 14:31:27.480632] E [repce(worker /bricks/brick1/brick):197:__call__] RepceClient: call failed call=31250:140272364406592:1537885887.47 method=history error=ChangelogHistoryNotAvailable
[2018-09-25 14:31:27.480958] I [resource(worker /bricks/brick1/brick):1295:service_loop] GLUSTER: Changelog history not available, using xsync
[2018-09-25 14:31:27.495117] I [master(worker /bricks/brick1/brick):1623:crawl] _GMaster: starting hybrid crawl stime=(0, 0)
[2018-09-25 14:31:27.502083] I [gsyncdstatus(worker /bricks/brick1/brick):249:set_worker_crawl_status] GeorepStatus: Crawl Status Change status=Hybrid Crawl
[2018-09-25 14:31:29.505284] I [master(worker /bricks/brick1/brick):1634:crawl] _GMaster: processing xsync changelog path=/var/lib/misc/gluster/gsyncd/glustervol1_slave_3_glustervol1/bricks-brick1-brick/xsync/XSYNC-CHANGELOG.1537885887

tail -f /var/log/glusterfs/geo-replication-slaves/glustervol1_slave_3_glustervol1/gsyncd.log

[2018-09-25 13:49:24.141303] I [repce(slave master/bricks/brick1/brick):80:service_loop] RepceServer: terminating on reaching EOF.
[2018-09-25 13:49:36.602051] W [gsyncd(slave master/bricks/brick1/brick):293:main] <top>: Session config file not exists, using the default config path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf
[2018-09-25 13:49:36.629415] I [resource(slave master/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume locally...
[2018-09-25 13:49:37.701642] I [resource(slave master/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster volume duration=1.0718
[2018-09-25 13:49:37.704282] I [resource(slave master/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening
[2018-09-25 14:10:27.70952] I [repce(slave master/bricks/brick1/brick):80:service_loop] RepceServer: terminating on reaching EOF.
[2018-09-25 14:10:39.632124] W [gsyncd(slave master/bricks/brick1/brick):293:main] <top>: Session config file not exists, using the default config path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf
[2018-09-25 14:10:39.650958] I [resource(slave master/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume locally...
[2018-09-25 14:10:40.729355] I [resource(slave master/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster volume duration=1.0781
[2018-09-25 14:10:40.730650] I [resource(slave master/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening
[2018-09-25 14:31:10.291064] I [repce(slave master/bricks/brick1/brick):80:service_loop] RepceServer: terminating on reaching EOF.
[2018-09-25 14:31:22.802237] W [gsyncd(slave master/bricks/brick1/brick):293:main] <top>: Session config file not exists, using the default config path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf
[2018-09-25 14:31:22.828418] I [resource(slave master/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume locally...
[2018-09-25 14:31:23.910206] I [resource(slave master/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster volume duration=1.0813
[2018-09-25 14:31:23.913369] I [resource(slave master/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening

Any ideas how to resolve this without re-creating everything again? Can I reset the changelog history?

Regards,
Christian
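(Context on the error above: ChangelogHistoryNotAvailable means gsyncd could not find changelog history on the brick covering the requested stime, here stime=(0, 0) because the session was re-created, so it falls back to the xsync/Hybrid Crawl seen in the log. A quick way to check that changelogging is enabled and producing files is, for example, the following; this is only a sketch and assumes the default changelog location under the brick:)

[root@master ~]# gluster volume get glustervol1 changelog.changelog
[root@master ~]# ls /bricks/brick1/brick/.glusterfs/changelogs/ | head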
From: <gluster-users-bounces@xxxxxxxxxxx> on behalf of "Kotte, Christian (Ext)" <christian.kotte@xxxxxxxxxxxx>

I don't configure the permissions of /bricks/brick1/brick/.glusterfs. I don't even see it on the local GlusterFS mount. Not sure why the permissions are configured with the setgid bit (S) and the AD group…

Regards,
Christian

From: <gluster-users-bounces@xxxxxxxxxxx> on behalf of "Kotte, Christian (Ext)" <christian.kotte@xxxxxxxxxxxx>

Yeah right. I get permission denied.

[geoaccount@slave ~]$ ll /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e
ls: cannot access /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e: Permission denied
[geoaccount@slave ~]$ ll /bricks/brick1/brick/.glusterfs/29/d1/
ls: cannot access /bricks/brick1/brick/.glusterfs/29/d1/: Permission denied
[geoaccount@slave ~]$ ll /bricks/brick1/brick/.glusterfs/29/
ls: cannot access /bricks/brick1/brick/.glusterfs/29/: Permission denied
[geoaccount@slave ~]$ ll /bricks/brick1/brick/.glusterfs/
ls: cannot open directory /bricks/brick1/brick/.glusterfs/: Permission denied

[root@slave ~]# ll /bricks/brick1/brick/.glusterfs/29
total 0
drwx--S---+ 2 root AD+group 50 Sep 10 07:29 16
drwx--S---+ 2 root AD+group 50 Sep 10 07:29 33
drwx--S---+ 2 root AD+group 50 Sep 10 07:29 5e
drwx--S---+ 2 root AD+group 50 Sep 10 07:29 73
drwx--S---+ 2 root AD+group 50 Sep 10 07:29 b2
drwx--S---+ 2 root AD+group 50 Sep 21 09:39 d1
drwx--S---+ 2 root AD+group 50 Sep 10 07:29 d7
drwx--S---+ 2 root AD+group 50 Sep 10 07:29 e6
drwx--S---+ 2 root AD+group 50 Sep 10 07:29 eb
[root@slave ~]#

However, the strange thing is that I could replicate new files and folders before. The replication has been broken since the "New folder" was created.

These are the permissions on a dev/test system:

[root@slave-dev ~]# ll /bricks/brick1/brick/.glusterfs/
total 3136
drwx------. 44 root root 4096 Aug 22 18:19 00
drwx------. 50 root root 4096 Sep 12 13:14 01
drwx------. 54 root root 4096 Sep 13 11:33 02
drwx------. 59 root root 4096 Aug 22 18:21 03
drwx------. 60 root root 4096 Sep 12 13:14 04
drwx------. 68 root root 4096 Aug 24 12:36 05
drwx------. 56 root root 4096 Aug 22 18:21 06
drwx------. 46 root root 4096 Aug 22 18:21 07
drwx------. 51 root root 4096 Aug 22 18:21 08
drwx------. 42 root root 4096 Aug 22 18:21 09
drwx------. 44 root root 4096 Sep 13 11:16 0a

I've configured an AD group, SGID bit, and ACLs via Ansible on the local mount point. Could this be an issue? Should I avoid configuring the permissions on .glusterfs and below?

# ll /mnt/glustervol1/
total 12
drwxrwsr-x+  4 AD+user AD+group 4096 Jul 13 07:46 Scripts
drwxrwxr-x+ 10 AD+user AD+group 4096 Jun 12 12:03 Software
-rw-rw-r--+  1 root    AD+group    0 Aug  8 08:44 test
drwxr-xr-x+  6 AD+user AD+group 4096 Apr 18 10:58 tftp

glusterfs_volumes:
  […]
  permissions:
    mode: "02775"
    owner: root
    group: "AD+group"
    acl_permissions: rw
  […]

# root directory is owned by root.
# set permissions to 'g+s' to automatically set the group to "AD+group"
# permissions of individual files will be set by Samba during creation
- name: Configure volume directory permission 1/2
  tags: glusterfs
  file:
    path: /mnt/{{ item.volume }}
    state: directory
    mode: "{{ item.permissions.mode }}"
    owner: "{{ item.permissions.owner }}"
    group: "{{ item.permissions.group }}"
  with_items: "{{ glusterfs_volumes }}"
  loop_control:
    label: "{{ item.volume }}"
  when: item.permissions is defined

# ACL needs to be set to override the default umask and grant "AD+group" write permissions
- name: Configure volume directory permission 2/2 (ACL)
  tags: glusterfs
  acl:
    path: /mnt/{{ item.volume }}
    default: yes
    entity: "{{ item.permissions.group }}"
    etype: group
    permissions: "{{ item.permissions.acl_permissions }}"
    state: present
  with_items: "{{ glusterfs_volumes }}"
  loop_control:
    label: "{{ item.volume }}"
  when: item.permissions is defined

Regards,
Christian
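(To see which entries the ACL marker "+" on those .glusterfs directories actually carries, getfacl can be run as root on the slave; a sketch, using the d1 directory from the listing above:)

[root@slave ~]# getfacl /bricks/brick1/brick/.glusterfs/29/d1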
From: Kotresh Hiremath Ravishankar <khiremat@xxxxxxxxxx>

I think I get what's happening. The geo-rep session is non-root. Could you do a readlink on the brick path mentioned above, /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e, as the geoaccount user and see if you are getting "Permission Denied" errors?

Thanks,
Kotresh HR

On Mon, Sep 24, 2018 at 7:35 PM Kotte, Christian (Ext) <christian.kotte@xxxxxxxxxxxx> wrote:
Thanks and Regards,
Kotresh H R
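(The check Kotresh is asking for would look something like the following, run on the slave as the non-root geo-rep user; a sketch, reusing the GFID path and the geoaccount user from the listings above. Given that ll on the parent directories already fails for geoaccount, this would be expected to fail with permission denied as well.)

[geoaccount@slave ~]$ readlink /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e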