Hi,
We have a GlusterFS cluster running version 3.2.7. The volume info is as follows:
Volume Name: gfs1
Type: Distributed-Replicate
Status: Started
Number of Bricks: 94 x 3 = 282
Transport-type: tcp
We mount the volume with the native client on all cluster servers. When we access the file “/XMTEXT/gfs1_000/000/000/095” on one server, it fails with a split-brain error (Input/output error).
However, we can access the same file from another server.
Also, after remounting the volume on the failing server, accessing the same file works again.
Does the glusterfs client cache some information? This has happened more than once.
The client log from the failing server during the split brain is as follows:
[2013-01-07 09:57:29.554505] W [afr-common.c:931:afr_detect_self_heal_by_lookup_status] 0-gfs1-replicate-5: split brain detected during lookup of /XMTEXT/gfs1_000/000/000/095.
[2013-01-07 09:57:29.554566] I [afr-common.c:1039:afr_launch_self_heal] 0-gfs1-replicate-5: background data gfid self-heal triggered. path: /XMTEXT/gfs1_000/000/000/095
[2013-01-07 09:57:29.555299] I [afr-self-heal-common.c:1290:sh_missing_entries_create] 0-gfs1-replicate-5: no missing files - /XMTEXT/gfs1_000/000/000/095. proceeding to metadata check
[2013-01-07 09:57:29.555507] I [afr-self-heal-common.c:1050:afr_sh_missing_entries_done] 0-gfs1-replicate-5: split brain found, aborting selfheal of /XMTEXT/gfs1_000/000/000/095
[2013-01-07 09:57:29.555531] E [afr-self-heal-common.c:2190:afr_self_heal_completion_cbk] 0-gfs1-replicate-5: background data gfid self-heal failed on /XMTEXT/gfs1_000/000/000/095
[2013-01-07 09:57:35.598229] W [afr-common.c:931:afr_detect_self_heal_by_lookup_status] 0-gfs1-replicate-5: split brain detected during lookup of /XMTEXT/gfs1_000/000/000/095.
[2013-01-07 09:57:35.598282] I [afr-common.c:1039:afr_launch_self_heal] 0-gfs1-replicate-5: background data gfid self-heal triggered. path: /XMTEXT/gfs1_000/000/000/095
[2013-01-07 09:57:35.598939] I [afr-self-heal-common.c:1290:sh_missing_entries_create] 0-gfs1-replicate-5: no missing files - /XMTEXT/gfs1_000/000/000/095. proceeding to metadata check
[2013-01-07 09:57:35.599139] I [afr-self-heal-common.c:1050:afr_sh_missing_entries_done] 0-gfs1-replicate-5: split brain found, aborting selfheal of /XMTEXT/gfs1_000/000/000/095
[2013-01-07 09:57:35.599176] E [afr-self-heal-common.c:2190:afr_self_heal_completion_cbk] 0-gfs1-replicate-5: background data gfid self-heal failed on /XMTEXT/gfs1_000/000/000/095
[2013-01-07 09:57:38.192819] W [afr-common.c:931:afr_detect_self_heal_by_lookup_status] 0-gfs1-replicate-5: split brain detected during lookup of /XMTEXT/gfs1_000/000/000/095.
[2013-01-07 09:57:38.192875] I [afr-common.c:1039:afr_launch_self_heal] 0-gfs1-replicate-5: background data gfid self-heal triggered. path: /XMTEXT/gfs1_000/000/000/095
[2013-01-07 09:57:38.193486] I [afr-self-heal-common.c:1290:sh_missing_entries_create] 0-gfs1-replicate-5: no missing files - /XMTEXT/gfs1_000/000/000/095. proceeding to metadata check
[2013-01-07 09:57:38.193708] I [afr-self-heal-common.c:1050:afr_sh_missing_entries_done] 0-gfs1-replicate-5: split brain found, aborting selfheal of /XMTEXT/gfs1_000/000/000/095
[2013-01-07 09:57:38.193731] E [afr-self-heal-common.c:2190:afr_self_heal_completion_cbk] 0-gfs1-replicate-5: background data gfid self-heal failed on /XMTEXT/gfs1_000/000/000/095
[2013-01-07 09:57:38.193937] W [afr-open.c:168:afr_open] 0-gfs1-replicate-5: failed to open as split brain seen, returning EIO
[2013-01-07 09:57:38.194033] W [fuse-bridge.c:693:fuse_fd_cbk] 0-glusterfs-fuse: 3162527: OPEN() /XMTEXT/gfs1_000/000/000/095 => -1 (Input/output error)
[2013-01-07 10:08:12.569821] W [afr-common.c:931:afr_detect_self_heal_by_lookup_status] 0-gfs1-replicate-5: split brain detected during lookup of /XMTEXT/gfs1_000/000/000/095.
[2013-01-07 10:08:12.569891] I [afr-common.c:1039:afr_launch_self_heal] 0-gfs1-replicate-5: background data gfid self-heal triggered. path: /XMTEXT/gfs1_000/000/000/095
[2013-01-07 10:08:12.571538] I [afr-self-heal-common.c:1290:sh_missing_entries_create] 0-gfs1-replicate-5: no missing files - /XMTEXT/gfs1_000/000/000/095. proceeding to metadata check
[2013-01-07 10:08:12.572684] I [afr-self-heal-common.c:1050:afr_sh_missing_entries_done] 0-gfs1-replicate-5: split brain found, aborting selfheal of /XMTEXT/gfs1_000/000/000/095
[2013-01-07 10:08:12.572732] E [afr-self-heal-common.c:2190:afr_self_heal_completion_cbk] 0-gfs1-replicate-5: background data gfid self-heal failed on /XMTEXT/gfs1_000/000/000/095
[2013-01-07 10:08:12.580006] W [afr-open.c:168:afr_open] 0-gfs1-replicate-5: failed to open as split brain seen, returning EIO
[2013-01-07 10:08:12.580103] W [fuse-bridge.c:693:fuse_fd_cbk] 0-glusterfs-fuse: 3164490: OPEN() /XMTEXT/gfs1_000/000/000/095 => -1 (Input/output error)
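In case it helps anyone comparing logs between servers, here is a minimal Python sketch for summarizing a client log like the excerpt above. The line layout in the regex is only inferred from these log lines, and the names LINE_RE and summarize are my own, not part of glusterfs.

```python
import re
from collections import Counter

# Assumed layout of a client log line, inferred only from the excerpt above:
# [timestamp] LEVEL [file.c:line:function] xlator: message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<src>[^:\]]+):(?P<lineno>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<xlator>\S+):\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Count messages per logging function and collect the timestamps
    of OPEN() calls that returned -1."""
    per_func = Counter()
    open_failures = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that does not look like a log line
        per_func[m.group("func")] += 1
        if m.group("func") == "fuse_fd_cbk" and "=> -1" in m.group("msg"):
            open_failures.append(m.group("ts"))
    return per_func, open_failures
```

Feeding it the lines of the client log from each server (for example via `open(path)`) gives a per-function message count and the times at which OPEN() failed, which makes it easier to see whether only one client reports the split brain.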
Thanks!
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
https://lists.nongnu.org/mailman/listinfo/gluster-devel