Hi all,

I am seeing strange reports in the glusterfs client log files.

My config is gluster 1.3.7 / fuse 2.7.0-glfs5, linux 2.6.16.55. We have 3 clients and 3 servers (1 client and 1 server on each host) on a 100Mb network with a 5ms round trip between clients and servers. The 3 clients replicate with afr on the client side over the 3 servers (see the end of this message for gluster stack details).

I have a cron script that runs every hour and takes the write lock on the file cluster-root/var/lock/fs-stress.lock, located on the gluster file system. The script runs fine and the locking mechanism seems to work, at least as expected. But I get error messages that I don't understand in the glusterfs client logs.

On the client (op1) that actually takes the lock in this example, I have only 2 error messages:

# on client (op1)
2007-11-05 13:15:32 E [afr.c:3245:afr_lk_cbk] tbs-clust-data-afr: (path=/cluster-root/var/lock/fs-stress.lock child=tbs-clust-op1-data) op_ret=-1 op_errno=107
2007-11-05 13:15:32 E [afr.c:3245:afr_lk_cbk] tbs-clust-data-afr: (path=/cluster-root/var/lock/fs-stress.lock child=tbs-clust-or3-data) op_ret=-1 op_errno=107

On the other clients, where the locking attempt normally fails because of the lock already placed on the file by op1, I have 3 error messages:

# on client (or2)
2007-11-05 13:15:37 E [afr.c:3245:afr_lk_cbk] tbs-clust-data-afr: (path=/cluster-root/var/lock/fs-stress.lock child=tbs-clust-or3-data) op_ret=-1 op_errno=107
2007-11-05 13:15:37 E [afr.c:3245:afr_lk_cbk] tbs-clust-data-afr: (path=/cluster-root/var/lock/fs-stress.lock child=tbs-clust-op1-data) op_ret=-1 op_errno=107
2007-11-05 13:15:37 E [afr.c:3245:afr_lk_cbk] tbs-clust-data-afr: (path=/cluster-root/var/lock/fs-stress.lock child=tbs-clust-or2-data) op_ret=-1 op_errno=107

# on client (or3)
2007-11-05 13:16:17 E [afr.c:3245:afr_lk_cbk] tbs-clust-data-afr: (path=/cluster-root/var/lock/fs-stress.lock child=tbs-clust-or2-data) op_ret=-1 op_errno=107
2007-11-05 13:16:17 E [afr.c:3245:afr_lk_cbk] tbs-clust-data-afr: (path=/cluster-root/var/lock/fs-stress.lock child=tbs-clust-op1-data) op_ret=-1 op_errno=107
2007-11-05 13:16:17 E [afr.c:3245:afr_lk_cbk] tbs-clust-data-afr: (path=/cluster-root/var/lock/fs-stress.lock child=tbs-clust-or3-data) op_ret=-1 op_errno=107

Note that on client op1, which takes the lock, the error messages refer only to op1 and or3. It seems the lock is taken on server or2 without any problem?

Following the documentation, I have the lock translator just above storage/posix on the server side. Should I have locking on the client side as well? Is the server side really the appropriate place for the locking translator?

NB: tbs-clust-XXX-data are protocol/client bricks.

On the servers I have this stack:

storage/posix
features/posix-locks
performance/io-threads
protocol/server

On the clients I have this stack:

protocol/client (*3)
cluster/afr
performance/io-threads
performance/io-cache
performance/write-behind

--
Vincent Régnard
vregnard@xxxxxxxxxxxxxxxx
TBS-internet.com
027 630 5902
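
P.S. For anyone who wants to reproduce this, here is a minimal sketch of the kind of locking the hourly cron job performs. It is not the actual script: it assumes POSIX advisory locks taken through fcntl, and the names try_write_lock, release and describe_errno are made up for illustration. The last helper only maps an errno number such as the 107 in the logs to its symbolic name.

```python
import errno
import fcntl
import os


def try_write_lock(path):
    """Try to take an exclusive (write) lock on path without blocking.

    Returns the open file descriptor if the lock was acquired,
    or None if another process already holds a conflicting lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None  # lock is held elsewhere
        raise  # any other errno is a real failure and is propagated
    return fd


def release(fd):
    """Drop the lock and close the descriptor."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)


def describe_errno(code):
    """Return the symbolic name for an errno value on this platform."""
    return errno.errorcode.get(code, "unknown")


if __name__ == "__main__":
    # Hypothetical mount point; adjust to wherever glusterfs is mounted.
    lock_path = "/mnt/glusterfs/cluster-root/var/lock/fs-stress.lock"
    fd = try_write_lock(lock_path)
    if fd is None:
        print("lock already held, skipping this run")
    else:
        try:
            print("lock acquired")
        finally:
            release(fd)
    # On Linux, 107 is ENOTCONN ("Transport endpoint is not connected").
    print("errno 107 is", describe_errno(107))
```

With a script of this shape, the first host to run gets the descriptor back and the other two get None, which matches the behaviour I see; the log entries above appear in both cases.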