None of the copies were deleted from the backend (we even did a search
and verified). The strange thing is that this doesn't occur on patch-634.
We are also seeing the same thing on a stripe/afr/unify setup (so I'm
thinking afr is the culprit). Once again, with the stripe/afr/unify
example it's only some of the files.
We did try to repair some of the files by deleting all but one copy
directly from the backend servers, but the error persisted.
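For reference, the manual repair we attempted looks roughly like the sketch below. The brick paths here are hypothetical stand-ins (created as temp directories so the script is self-contained); on a real deployment you would use the actual backend export directories, and it may also be worth dumping the extended attributes on each copy first (e.g. getfattr -d -m trusted -e hex <path>) to see what afr recorded:

```shell
#!/bin/sh
# Sketch of the manual split-brain repair: keep the copy on the
# preferred subvolume and delete it from every other brick.
# BRICK1/BRICK2 are temp-dir stand-ins for the real backend mounts.
set -e

BRICK1=$(mktemp -d)   # stands in for the preferred subvolume's backend
BRICK2=$(mktemp -d)   # stands in for a second replica's backend

# Simulate the two conflicting copies of the file:
mkdir -p "$BRICK1/scripts" "$BRICK2/scripts"
echo 'echo hello' > "$BRICK1/scripts/setupMachine.sh"
echo 'echo hello' > "$BRICK2/scripts/setupMachine.sh"

FILE=scripts/setupMachine.sh

# Remove every copy except the one on the preferred brick:
for brick in "$BRICK2"; do
    rm -f "$brick/$FILE"
done

ls -l "$BRICK1/$FILE"   # the single surviving copy
```

After this, a fresh stat/open through the mount point should trigger self heal to recreate the file on the other subvolumes from the surviving copy; in our case the I/O error came back anyway.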
Thanks!
-Mickey Mazarick
Vikas Gorur wrote:
2008/11/21 Mickey Mazarick <mic@xxxxxxxxxxxxxxxxxx>:
This only occurs on about 10% of the files on an afr/unify over ibverbs
mount
I'm seeing the following:
# cat /scripts/setupMachine.sh
cat: /scripts/setupMachine.sh: Input/output error
glusterfs log says:
2008-11-20 20:03:12 W [afr-self-heal-common.c:943:afr_self_heal] afr1:
performing self heal on /scripts/setupMachine.sh (metadata=0 data=1 entry=0)
2008-11-20 20:03:12 E [afr-self-heal-data.c:767:afr_sh_data_fix] afr1:
Unable to resolve conflicting data of /scripts/setupMachine.sh. Please
resolve manually by deleting the file /scripts/setupMachine.sh from all but
the preferred subvolume
2008-11-20 20:03:12 W [afr-self-heal-data.c:70:afr_sh_data_done] afr1: self
heal of /scripts/setupMachine.sh completed
2008-11-20 20:03:12 E [unify.c:928:unify_open_cbk] unify: Open success on
namespace, failed on child node
2008-11-20 20:03:12 E [fuse-bridge.c:605:fuse_fd_cbk] glusterfs-fuse: 297:
OPEN() /scripts/setupMachine.sh => -1 (Input/output error)
There are only 2 copies of the file, on 2 of the 6 storage systems, and
they look identical. I get the same result if I delete one of them. Is
there an easy fix?
Did you delete one of the copies on the backend? If yes, then it looks
like self heal is being spuriously triggered. Thanks for reporting, I
will try to reproduce this behavior.
Vikas
--