(Sorry if this comes through twice, but I sent the original almost 12 hours ago and it hasn't appeared in the archives, even though another mail sent after mine has.)

Hi,

I've only been using glusterfs for a couple of weeks, but I've been having a few issues with it. For one of the issues, I've managed to put together steps to reproduce, so I guess this is a bug report.

The log file on the client that experiences the error shows:

[2011-11-19 18:05:23.619352] W [afr-common.c:1121:afr_conflicting_iattrs] 0-testvol-replicate-0: /testfile: gfid differs on subvolume 1 (3089007a-da1c-41ad-a111-d1a988de2420, 50eb7bf4-0516-4508-808c-909ac0f968f6)
[2011-11-19 18:05:23.619391] W [afr-common.c:1121:afr_conflicting_iattrs] 0-testvol-replicate-0: /testfile: gfid differs on subvolume 1 (3089007a-da1c-41ad-a111-d1a988de2420, 50eb7bf4-0516-4508-808c-909ac0f968f6)
[2011-11-19 18:05:23.619413] W [afr-common.c:882:afr_detect_self_heal_by_iatt] 0-testvol-replicate-0: /testfile: gfid different on subvolume
[2011-11-19 18:05:23.619452] I [afr-common.c:1038:afr_launch_self_heal] 0-testvol-replicate-0: background missing-entry self-heal triggered. path: /testfile
[2011-11-19 18:05:23.624027] I [afr-self-heal-common.c:1858:afr_sh_post_nb_entrylk_conflicting_sh_cbk] 0-testvol-replicate-0: Non blocking entrylks failed.
[2011-11-19 18:05:23.624062] I [afr-self-heal-common.c:963:afr_sh_missing_entries_done] 0-testvol-replicate-0: split brain found, aborting selfheal of /testfile
[2011-11-19 18:05:23.624084] E [afr-self-heal-common.c:2074:afr_self_heal_completion_cbk] 0-testvol-replicate-0: background missing-entry self-heal failed on /testfile
[2011-11-19 18:05:23.624108] W [afr-common.c:1121:afr_conflicting_iattrs] 0-testvol-replicate-0: /testfile: gfid differs on subvolume 1 (3089007a-da1c-41ad-a111-d1a988de2420, 50eb7bf4-0516-4508-808c-909ac0f968f6)
[2011-11-19 18:05:23.624133] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 9142: LOOKUP() /testfile => -1 (Input/output error)

To reproduce, use two glusterfs (v3.2.5) servers with the following volume definition:

Volume Name: testvol
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.104.123.145:/gluster/testvol
Brick2: 10.82.37.136:/gluster/testvol

Run this on one client:

# while true; do touch testfile.tmp; mv testfile.tmp testfile; done

And this on another client:

# while true; do x=$(<testfile); done

I couldn't get the error to occur either when both scripts were run on a single client or when using the glusterfs servers themselves instead of separate clients. Also, it didn't matter whether both clients were mounted from the same glusterfs server or one from each of the servers.

My assumption is that the second client's read is being interleaved with the first client's move operation, giving a differing gfid.

If any further information is needed, please don't hesitate to let me know.

--
Jason Stubbs
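
For anyone who wants to rerun the two loops above without leaving them spinning forever, a bounded variant might look like the sketch below. The function names, the iteration cap, and the failure counter are assumptions added for illustration and were not part of the original test. It also uses `cat` in place of `x=$(<testfile)` so that a failed read has an unambiguous exit status, which may or may not exercise exactly the same path.

```shell
#!/bin/sh
# Hypothetical helper: run_writer / run_reader and the COUNT cap are
# illustrative names, not part of the original report.

# Writer: repeatedly replace testfile via create-then-rename,
# mirroring the "touch testfile.tmp; mv testfile.tmp testfile" loop.
run_writer() (
    dir=$1; count=$2
    cd "$dir" || exit 1
    i=0
    while [ "$i" -lt "$count" ]; do
        touch testfile.tmp && mv testfile.tmp testfile
        i=$((i + 1))
    done
)

# Reader: repeatedly read testfile and print how many reads failed.
run_reader() (
    dir=$1; count=$2
    cd "$dir" || exit 1
    i=0; failures=0
    while [ "$i" -lt "$count" ]; do
        cat testfile >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
)

# Dispatch only when called as: repro.sh writer|reader DIR COUNT
if [ "$#" -eq 3 ]; then
    case "$1" in
        writer) run_writer "$2" "$3" ;;
        reader) run_reader "$2" "$3" ;;
        *) echo "usage: $0 writer|reader DIR COUNT" >&2; exit 2 ;;
    esac
fi
```

Run the writer on one client and the reader on the other, with DIR pointing at the directory where the volume is mounted on that client. A nonzero count from the reader would correspond to failed reads such as the Input/output error in the log above. On a local filesystem, with testfile already in place, the reader should report 0.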