Snezhana,

One of the servers went down during self-heal. To keep things simple,
AFR returns EIO whenever it encounters any error during self-heal.
Continuing with open() after self-heal has failed is risky, so it is
better to leave it to the admin to look into it and fix it.

Thanks,
Krishna

On Tue, Jun 3, 2008 at 5:25 PM, Snezhana Bekova <dudo@xxxxxxx> wrote:
> I have posted two client logs (from different glusterfs clients) on
> pastebin:
> http://gluster.pastebin.org/40271
> http://gluster.pastebin.org/40272
>
> and one server log:
> http://gluster.pastebin.org/40274
>
> Thanks,
> Snezhana
>
> Quoting Anand Avati <avati@xxxxxxxxxxxxx>:
>
>> Can you send more of the client log? The part you have sent is not
>> sufficient.
>>
>> avati
>>
>> 2008/5/27 Snezhana Bekova <dudo@xxxxxxx>:
>>
>>> Hello,
>>> I'm running glusterfs--mainline--2.5--patch-770 (glusterfs 1.3.9) and
>>> fuse-2.7.3glfs10 on 3 test machines with AFR with client-side
>>> replication.
>>> My test setup is 2 glusterfs servers and 3 glusterfs clients. One
>>> machine is configured as a client only, and two are configured as
>>> both server and client. While running tests with bonnie++ on the
>>> three clients (3 bonnies in parallel), I kill one glusterfs server
>>> process and start it again. After two or three test cycles, one of
>>> the bonnies fails to finish, with one of these errors:
>>> "Can't open file ./Bonnie.17791.001"
>>> "Can't open file 0000267hWNIpRcI"
>>> "Expected 1024 files but only got 1039"
>>>
>>> The glusterfs client log messages are:
>>> 2008-05-27 15:44:28 E [afr.c:1239:afr_selfheal_setxattr_cbk] afr: (path=/Bonnie.1918/00002/0000267hWNIpRcI child=client-wks12) op_ret=-1 op_errno=107
>>> 2008-05-27 15:44:28 W [client-protocol.c:332:client_protocol_xfer] client-wks1: not connected at the moment to submit frame type(1) op(31)
>>> 2008-05-27 15:44:28 E [client-protocol.c:2731:client_utimens_cbk] client-wks1: no proper reply from server, returning ENOTCONN
>>> 2008-05-27 15:44:28 E [afr.c:1279:afr_selfheal_utimens_cbk] afr: (path=/Bonnie.1918/00002/0000267hWNIpRcI child=client-wks1) op_ret=-1 op_errno=107
>>> 2008-05-27 15:44:28 W [client-protocol.c:332:client_protocol_xfer] client-wks1: not connected at the moment to submit frame type(2) op(6)
>>> 2008-05-27 15:44:28 E [client-protocol.c:4270:client_unlock_cbk] client-wks1: no proper reply from server, returning ENOTCONN
>>> 2008-05-27 15:44:28 E [afr.c:1140:afr_selfheal_unlock_cbk] afr: (path=/Bonnie.1918/00002/0000267hWNIpRcI child=client-wks1) op_ret=-1 op_errno=107
>>> 2008-05-27 15:44:28 E [afr.c:2177:afr_open] afr: self heal failed, returning EIO
>>> 2008-05-27 15:44:28 E [fuse-bridge.c:692:fuse_fd_cbk] glusterfs-fuse: 113365: (12) /Bonnie.1918/00002/0000267hWNIpRcI => -1 (5)
>>>
>>> My config files on the two machines that are both client and server
>>> are:
>>> cat /etc/glusterfs/glusterfs-server.vol
>>> volume brick-local
>>>   type storage/posix
>>>   option directory /test
>>> end-volume
>>>
>>> volume p-locks
>>>   type features/posix-locks
>>>   subvolumes brick-local
>>>   option mandatory on
>>> end-volume
>>>
>>> volume server
>>>   type protocol/server
>>>   option transport-type tcp/server
>>>   subvolumes p-locks
>>>   option auth.ip.p-locks.allow *
>>> end-volume
>>>
>>> cat /etc/glusterfs/glusterfs-client.vol
>>> volume client-wks2
>>>   type protocol/client
>>>   option transport-type tcp/client
>>>   option remote-host 10.0.0.2
>>>   option remote-subvolume p-locks
>>>   option transport-timeout 5
>>> end-volume
>>>
>>> volume client-wks1
>>>   type protocol/client
>>>   option transport-type tcp/client
>>>   option remote-host 10.0.0.1
>>>   option remote-subvolume p-locks
>>>   option transport-timeout 5
>>> end-volume
>>>
>>> volume afr
>>>   type cluster/afr
>>>   subvolumes client-wks2 client-wks1
>>> end-volume
>>>
>>> And this is the config file on the third machine, which is a
>>> glusterfs client only:
>>> /etc/glusterfs/glusterfs-client.vol
>>> volume client-wks1
>>>   type protocol/client
>>>   option transport-type tcp/client
>>>   option remote-host 10.0.0.1
>>>   option remote-subvolume p-locks
>>>   option transport-timeout 5
>>> end-volume
>>>
>>> volume client-wks2
>>>   type protocol/client
>>>   option transport-type tcp/client
>>>   option remote-host 10.0.0.2
>>>   option remote-subvolume p-lock
>>>   option transport-timeout 5
>>> end-volume
>>>
>>> volume afr
>>>   type cluster/afr
>>>   subvolumes client-wks1 client-wks2
>>> end-volume
>>>
>>> Can you tell me what is wrong? Please advise, and thanks in advance!
>>>
>>> Thanks,
>>> Snezhana
>>>
>>> _______________________________________________
>>> Gluster-devel mailing list
>>> Gluster-devel@xxxxxxxxxx
>>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>>
>> --
>> If I traveled to the end of the rainbow
>> As Dame Fortune did intend,
>> Murphy would be there to tell me
>> The pot's at the other end.
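
For anyone reading the client log quoted in this thread, here is a minimal sketch (not part of glusterfs) that pulls the callback name, child, and op_errno out of the AFR self-heal error lines and attaches an errno name. The regular expression, the function name parse_selfheal_error, and the two-entry errno table are assumptions based only on the quoted log, where 107 is reported as ENOTCONN and 5 as EIO; other log formats or platforms may differ.

```python
import re

# Errno names as reported in the quoted log (Linux values). Assumption:
# only the two codes that appear in this thread are listed.
ERRNO_NAMES = {5: "EIO", 107: "ENOTCONN"}

# Hypothetical pattern for the AFR self-heal error lines quoted in the thread.
SELFHEAL_RE = re.compile(
    r"\[afr\.c:\d+:(?P<callback>afr_selfheal_\w+)\]\s+afr:\s+"
    r"\(path=(?P<path>\S+)\s+child=(?P<child>[^)\s]+)\)\s+"
    r"op_ret=(?P<op_ret>-?\d+)\s+op_errno=(?P<op_errno>\d+)"
)


def parse_selfheal_error(line):
    """Return a dict describing one self-heal error line, or None."""
    match = SELFHEAL_RE.search(line)
    if match is None:
        return None
    code = int(match.group("op_errno"))
    return {
        "callback": match.group("callback"),
        "path": match.group("path"),
        "child": match.group("child"),
        "op_ret": int(match.group("op_ret")),
        "op_errno": code,
        "errno_name": ERRNO_NAMES.get(code, "unknown"),
    }


if __name__ == "__main__":
    sample = (
        "2008-05-27 15:44:28 E [afr.c:1279:afr_selfheal_utimens_cbk] afr: "
        "(path=/Bonnie.1918/00002/0000267hWNIpRcI child=client-wks1) "
        "op_ret=-1 op_errno=107"
    )
    print(parse_selfheal_error(sample))
```

Run over a whole client log, this makes it easier to see which child was unreachable (ENOTCONN) just before afr_open gave up and returned EIO.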