Re: Permission denied at some directories/files after a split brain


 



On February 10, 2020 3:53:08 PM GMT+02:00, Alberto Bengoa <bengoa@xxxxxxxxx> wrote:
>Hello guys,
>
>We are running GlusterFS 6.6 in Replicate mode (1 x 3). After a split-brain
>and a massive heal process, we noticed that our app started to receive
>thousands of "Permission denied" errors while trying to access files and
>directories.
>
>Example log of a failed access attempt to a specific directory:
>
>[2020-02-10 10:38:17.402080] I [MSGID: 139001]
>[posix-acl.c:263:posix_acl_log_permit_denied]
>0-app_data-access-control:
>client:
>CTX_ID:7d744c50-43a1-4f81-9330-001b5dcaddb7-GRAPH_ID:0-PID:2310-HOST:ast10.local.domain-PC_NAME:app_data-client-1-RECON_NO:-1,
>gfid: 092f1e28-d6a8-4ca9-95d5-75dc8ad1c835,
>req(uid:498,gid:498,perm:4,ngrps:1),
>ctx(uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
>[Permission denied]
>[2020-02-10 10:38:17.402182] E [MSGID: 115056]
>[server-rpc-fops_v2.c:687:server4_opendir_cbk] 0-app_data-server:
>6257941:
>OPENDIR /mailboxes.old/8692/211411002/Old
>(092f1e28-d6a8-4ca9-95d5-75dc8ad1c835), client:
>CTX_ID:7d744c50-43a1-4f81-9330-001b5dcaddb7-GRAPH_ID:0-PID:2310-HOST:ast10.local.domain-PC_NAME:app_data-client-1-RECON_NO:-1,
>error-xlator: app_data-access-control [Permission denied]
>
>The error affects only unprivileged users, even when the unprivileged
>user is the directory owner. The root user is able to access all files,
>and if we "touch" the file/directory as root it *sometimes* fixes the
>problem.
>
>We noticed inconsistent Access/Change dates. Here is the stat output of a
>directory before touching it, showing these inconsistencies:
>
>  File: ‘Old’
>  Size: 4096       Blocks: 8          IO Block: 131072 directory
>Device: 27h/39d Inode: 10388898073370567318  Links: 2
>Access: (2775/drwxrwsr-x)  Uid: (  498/app)   Gid: (  498/app)
>Access: 1970-01-01 01:00:00.000000000 +0100
>Modify: 2020-02-07 13:21:10.365297527 +0000
>Change: 1970-01-01 01:00:00.000000000 +0100
> Birth: -
>
>I think this case is similar to the one reported here [1] and discussed in
>the thread "ACL issue v6.6, v6.7, v7.1, v7.2", despite the fact that we are
>not using libvirt. We do use ACLs, but not in this particular directory.
>
>Any thoughts on this?
>
>[1] - https://bugzilla.redhat.com/show_bug.cgi?id=1797099
>
>Thanks,
>Alberto Bengoa

Hi Alberto,
Unfortunately, you will need to verify whether this is the same issue.
Enable trace logging for the bricks and check whether the errors in your logs match those in the Bugzilla report.
Don't forget to turn trace logging off afterwards, or your log directory will fill up.
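
A minimal sketch of that on/off toggle, assuming the volume is named app_data (inferred from the "0-app_data-server" prefix in the quoted log, so adjust it) and that the brick log level is controlled through the diagnostics.brick-log-level volume option. The GLUSTER variable is only a hypothetical hook so the commands can be dry-run with GLUSTER=echo before touching the cluster.

```shell
# Assumption: the volume name is inferred from the log prefix; change as needed.
VOLUME="${VOLUME:-app_data}"
# Hypothetical override for dry-runs, e.g. GLUSTER=echo prints instead of executing.
GLUSTER="${GLUSTER:-gluster}"

trace_on() {
    # Raise brick log verbosity to TRACE for the volume.
    $GLUSTER volume set "$VOLUME" diagnostics.brick-log-level TRACE
}

trace_off() {
    # Return brick logging to the default INFO level.
    $GLUSTER volume set "$VOLUME" diagnostics.brick-log-level INFO
}
```

Run trace_on, reproduce one failing access as the unprivileged user, then run trace_off right away. The brick logs are usually under /var/log/glusterfs/bricks/ on each server.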

What version of gluster are you using?
In my case, only a downgrade restored the cluster to working order, so you should consider that as an option (the last one, but still an option).

You can also try running find against the FUSE mount to set an ACL entry on every file and directory: 'find /path/to/fuse -exec setfacl -m u:root:rw {} \;'
Maybe that will force gluster to read the ACLs again.
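
The same command wrapped in a small function, as a sketch only, so it can be tried on a single affected directory and dry-run first. /path/to/fuse remains a placeholder for the client-side mount point, and SETFACL is a hypothetical override (for example SETFACL="echo setfacl") that prints the commands instead of running them. Keep in mind that this adds a u:root:rw ACL entry to everything it visits; it is a workaround to try, not a confirmed fix.

```shell
# Placeholder for the FUSE mount point; replace it with the real path.
FUSE_PATH="${FUSE_PATH:-/path/to/fuse}"
# Hypothetical override for dry-runs, e.g. SETFACL="echo setfacl".
SETFACL="${SETFACL:-setfacl}"

reapply_acl() {
    # Set a u:root:rw ACL entry on the directory $1 and everything below it.
    # $SETFACL is left unquoted on purpose so "echo setfacl" splits into two words.
    find "$1" -exec $SETFACL -m u:root:rw {} \;
}
```

For example, SETFACL="echo setfacl" reapply_acl "$FUSE_PATH/some/affected/dir" lists what would be changed; dropping the override applies it.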

Good luck!
If you have the option, join the next gluster meeting and ask for an update (if the issue is actually the same).

Best Regards,
Strahil Nikolov
________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users



