I've run across another problem - this one I'm pretty sure is a
problem with Gluster. I've opened
https://bugzilla.redhat.com/show_bug.cgi?id=955753.

I'm using Oracle DNFS still and it's erroring out on some of its logfiles:

    ARC3: Error 19508 Closing archive log file '/db/flash_recovery_area/ALTUS/archivelog/2013_04_22/o1_mf_1_1093__1366653401581181_.arc'

Gluster is reporting:

    [2013-04-22 13:57:22.073354] W [client3_1-fops.c:707:client3_1_truncate_cbk] 0-gv0-client-9: remote operation failed: Permission denied
    [2013-04-22 13:57:22.073496] W [client3_1-fops.c:707:client3_1_truncate_cbk] 0-gv0-client-8: remote operation failed: Permission denied
    [2013-04-22 13:57:22.073805] W [nfs3.c:889:nfs3svc_truncate_cbk] 0-nfs: 8b534455: /fleming1/db0/ALTUS_flash/archivelog/2013_04_22/.o1_mf_1_1093__1366653401581181_.arc => -1 (Permission denied)
    [2013-04-22 13:57:22.082594] E [nfs3.c:3408:nfs3_remove_resume] 0-nfs-nfsv3: Unable to resolve FH: (192.168.10.3:46391) gv0 : 82c4c5ec-f3ad-4074-ac66-c5a455146d71

Immediately prior to this, that file has attributes:

    Regular File mode:0640 uid:500 gid:1000, size: 476959744

The actual NFS RPC causing this error is [1]. Briefly:

    Remote Procedure Call, Type:Call XID:0x8b534455
    Network File System, SETATTR Call FH:0x5c191ad8
        new_attributes
            mode: value follows
                set_it: value follows (1)
                Mode: 0440, S_IRUSR, S_IRGRP
            size: value follows
                set_it: value follows (1)
                size: 476959744

In other words, a "truncate" and "chmod 440" in the same call.

Gluster is replying with [2]:

    Remote Procedure Call, Type:Reply XID:0x8b534455
    Network File System, SETATTR Reply Error:NFS3ERR_ACCES
        Status: NFS3ERR_ACCES (13)

What's happening is that gluster is processing the mode change before the
truncate, causing the truncate to fail.

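To make the ordering issue concrete, here is a toy model in Python. It is
not Gluster code: ModelFile, setattr_mode_first, setattr_size_first and
try_setattr are made-up names, and the rule that truncate is refused when
the owner write bit is clear is an assumption standing in for whatever
check the brick actually performs. It only shows that the same SETATTR
(mode 0440 plus a size) fails or succeeds depending on which attribute is
applied first.

```python
import stat


class PermissionDenied(Exception):
    """Stands in for NFS3ERR_ACCES / EACCES in this toy model."""


class ModelFile:
    """Hypothetical in-memory file that tracks only a mode and a size."""

    def __init__(self, mode, size):
        self.mode = mode
        self.size = size

    def chmod(self, mode):
        self.mode = mode

    def truncate(self, size):
        # Assumption: truncate is refused unless the owner has write permission.
        if not self.mode & stat.S_IWUSR:
            raise PermissionDenied("truncate needs owner write permission")
        self.size = size


def setattr_mode_first(f, mode, size):
    """Apply the mode change, then the size change (the failing order)."""
    f.chmod(mode)
    f.truncate(size)


def setattr_size_first(f, mode, size):
    """Apply the size change while still writable, then drop permissions."""
    f.truncate(size)
    f.chmod(mode)


def try_setattr(apply, mode=0o440, size=476959744):
    """Run one combined SETATTR against a fresh 0640 file; report the outcome."""
    f = ModelFile(mode=0o640, size=476959744)
    try:
        apply(f, mode, size)
        return "OK"
    except PermissionDenied:
        return "ACCES"


if __name__ == "__main__":
    print("mode first:", try_setattr(setattr_mode_first))
    print("size first:", try_setattr(setattr_size_first))
```

In this model the mode-first order reports ACCES and the size-first order
reports OK, which matches the behaviour seen in the trace.
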
Incidentally, this also causes gluster to think that these files need healing:

    Gathering Heal info on volume gv0 has been successful
    …
    Brick fearless1:/export/bricks/500117310007a7ec/glusterdata
    /fleming1/db0/ALTUS_flash/archivelog/2013_04_22/.o1_mf_1_1093__1366653401581181_.arc
    …
    Brick fearless2:/export/bricks/500117310007a74c/glusterdata
    /fleming1/db0/ALTUS_flash/archivelog/2013_04_22/.o1_mf_1_1093__1366653401581181_.arc

So, arguably gluster should be doing the truncate before the chmod. Perhaps
the Most Correct thing is to always chmod last if removing permissions.
That's a longer discussion :p

[1] Full RPC Call

Remote Procedure Call, Type:Call XID:0x8b534455
    Fragment header: Last fragment, 172 bytes
        1... .... .... .... .... .... .... .... = Last Fragment: Yes
        .000 0000 0000 0000 0000 0000 1010 1100 = Fragment Length: 172
    XID: 0x8b534455 (2337490005)
    Message Type: Call (0)
    RPC Version: 2
    Program: NFS (100003)
    Program Version: 3
    Procedure: SETATTR (2)
    [The reply to this request is in frame 293325]
    Credentials
        Flavor: AUTH_UNIX (1)
        Length: 52
        Stamp: 0xabcdefab
        Machine Name: fleming1.netdirect.ca
            length: 21
            contents: fleming1.netdirect.ca
            fill bytes: opaque data
        UID: 500
        GID: 1000
        Auxiliary GIDs
            GID: 1000
            GID: 1030
    Verifier
        Flavor: AUTH_NULL (0)
        Length: 0
Network File System, SETATTR Call FH:0x5c191ad8
    [Program Version: 3]
    [V3 Procedure: SETATTR (2)]
    object
        length: 36
        [hash (CRC-32): 0x5c191ad8]
        [Name: .o1_mf_1_1093__1366653401581181_.arc]
        [Full Name: 192.168.10.1:/gv0/fleming1/db0/ALTUS_flash/archivelog/2013_04_22/.o1_mf_1_1093__1366653401581181_.arc]
        decode type as: unknown
        filehandle: 3a4f474c20117b487f884f169490a0349afacf71e16a95fc...
    new_attributes
        mode: value follows
            set_it: value follows (1)
            Mode: 0440, S_IRUSR, S_IRGRP
                .... .... .... .... .... 0... .... .... = S_ISUID: No
                .... .... .... .... .... .0.. .... .... = S_ISGID: No
                .... .... .... .... .... ..0. .... .... = S_ISVTX: No
                .... .... .... .... .... ...1 .... .... = S_IRUSR: Yes
                .... .... .... .... .... .... 0... .... = S_IWUSR: No
                .... .... .... .... .... .... .0.. .... = S_IXUSR: No
                .... .... .... .... .... .... ..1. .... = S_IRGRP: Yes
                .... .... .... .... .... .... ...0 .... = S_IWGRP: No
                .... .... .... .... .... .... .... 0... = S_IXGRP: No
                .... .... .... .... .... .... .... .0.. = S_IROTH: No
                .... .... .... .... .... .... .... ..0. = S_IWOTH: No
                .... .... .... .... .... .... .... ...0 = S_IXOTH: No
        uid: no value
            set_it: no value (0)
        gid: no value
            set_it: no value (0)
        size: value follows
            set_it: value follows (1)
            size: 476959744
        atime: don't change
            set_it: don't change (0)
        mtime: don't change
            set_it: don't change (0)
    guard: no value
        check: no value (0)

[2] Full Reply

Ethernet II, Src: Ibm_36:f7:d0 (5c:f3:fc:36:f7:d0), Dst: IntelCor_38:e7:58 (00:1e:67:38:e7:58)
Internet Protocol Version 4, Src: 192.168.10.1 (192.168.10.1), Dst: 192.168.10.3 (192.168.10.3)
Transmission Control Protocol, Src Port: 38467 (38467), Dst Port: 46391 (46391), Seq: 1230671698, Ack: 2230824272, Len: 40
Remote Procedure Call, Type:Reply XID:0x8b534455
    Fragment header: Last fragment, 36 bytes
        1... .... .... .... .... .... .... .... = Last Fragment: Yes
        .000 0000 0000 0000 0000 0000 0010 0100 = Fragment Length: 36
    XID: 0x8b534455 (2337490005)
    Message Type: Reply (1)
    [Program: NFS (100003)]
    [Program Version: 3]
    [Procedure: SETATTR (2)]
    Reply State: accepted (0)
    [This is a reply to a request in frame 293324]
    [Time from request: 0.001547000 seconds]
    Verifier
        Flavor: AUTH_NULL (0)
        Length: 0
    Accept State: RPC executed successfully (0)
Network File System, SETATTR Reply Error:NFS3ERR_ACCES
    [Program Version: 3]
    [V3 Procedure: SETATTR (2)]
    Status: NFS3ERR_ACCES (13)
    obj_wcc
        before
            attributes_follow: no value (0)
        after
            attributes_follow: no value (0)

--
Michael Brown               | `One of the main causes of the fall of
Systems Consultant          | the Roman Empire was that, lacking zero,
Net Direct Inc.             | they had no way to indicate successful
☎: +1 519 883 1172 x5106    | termination of their C programs.' - Firth