It seems that the connection gets dropped (or cannot even be established): the
EOFError is pickle.load() in repce.py hitting end-of-stream, i.e. the peer
gsyncd process went away mid-session (a short illustration is appended below
the quoted log). Is the ssh auth set up properly for the second volume?

Csaba

On Thu, Jun 30, 2011 at 4:22 PM, Adrian Carpenter <tac12 at wbic.cam.ac.uk> wrote:
> Hi Csaba,
>
> I'm now seeing consistent errors with a second volume:
>
> [2011-06-30 06:08:48.299174] I [monitor(monitor):19:set_state] Monitor: new state: OK
> [2011-06-30 09:27:46.875745] E [syncdutils:131:exception] <top>: FAIL:
> Traceback (most recent call last):
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 152, in twrap
>     tf(*aa)
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in listen
>     rid, exc, res = recv(self.inf)
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/repce.py", line 42, in recv
>     return pickle.load(inf)
> EOFError
> [2011-06-30 09:27:58.413588] I [monitor(monitor):42:monitor] Monitor: ------------------------------------------------------------
> [2011-06-30 09:27:58.413830] I [monitor(monitor):43:monitor] Monitor: starting gsyncd worker
> [2011-06-30 09:27:58.479687] I [gsyncd:286:main_i] <top>: syncing: gluster://localhost:user-volume -> file:///geo-tank/user-volume
> [2011-06-30 09:28:03.963303] I [master:181:crawl] GMaster: new master is a747062e-1caa-4cb3-9f86-34d03486a842
> [2011-06-30 09:28:03.963587] I [master:187:crawl] GMaster: primary master with volume id a747062e-1caa-4cb3-9f86-34d03486a842 ...
> [2011-06-30 09:34:35.592005] E [syncdutils:131:exception] <top>: FAIL:
> Traceback (most recent call last):
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 152, in twrap
>     tf(*aa)
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in listen
>     rid, exc, res = recv(self.inf)
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/repce.py", line 42, in recv
>     return pickle.load(inf)
> EOFError
> [2011-06-30 09:34:45.595258] I [monitor(monitor):42:monitor] Monitor: ------------------------------------------------------------
> [2011-06-30 09:34:45.595668] I [monitor(monitor):43:monitor] Monitor: starting gsyncd worker
> [2011-06-30 09:34:45.661334] I [gsyncd:286:main_i] <top>: syncing: gluster://localhost:user-volume -> file:///geo-tank/user-volume
> [2011-06-30 09:34:51.145607] I [master:181:crawl] GMaster: new master is a747062e-1caa-4cb3-9f86-34d03486a842
> [2011-06-30 09:34:51.145898] I [master:187:crawl] GMaster: primary master with volume id a747062e-1caa-4cb3-9f86-34d03486a842 ...
> [2011-06-30 12:35:54.394453] E [syncdutils:131:exception] <top>: FAIL:
> Traceback (most recent call last):
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 152, in twrap
>     tf(*aa)
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in listen
>     rid, exc, res = recv(self.inf)
>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/repce.py", line 42, in recv
>     return pickle.load(inf)
> UnpicklingError: invalid load key, '???'.
> [2011-06-30 12:36:05.839510] I [monitor(monitor):42:monitor] Monitor: ------------------------------------------------------------
> [2011-06-30 12:36:05.839916] I [monitor(monitor):43:monitor] Monitor: starting gsyncd worker
> [2011-06-30 12:36:05.905232] I [gsyncd:286:main_i] <top>: syncing: gluster://localhost:user-volume -> file:///geo-tank/user-volume
> [2011-06-30 12:36:11.413764] I [master:181:crawl] GMaster: new master is a747062e-1caa-4cb3-9f86-34d03486a842
> [2011-06-30 12:36:11.414047] I [master:187:crawl] GMaster: primary master with volume id a747062e-1caa-4cb3-9f86-34d03486a842 ...
>
>
> Adrian
>
> On 28 Jun 2011, at 11:16, Csaba Henk wrote:
>
>> Hi Adrian,
>>
>> On Tue, Jun 28, 2011 at 12:04 PM, Adrian Carpenter <tac12 at wbic.cam.ac.uk> wrote:
>>> Thanks Csaba,
>>>
>>> So far as I am aware nothing tampered with the xattrs, and all the bricks
>>> etc. are time synchronised. Anyway, I did as you suggested; now for one
>>> volume (I have three being geo-rep'd) I consistently get this:
>>>
>>> OSError: [Errno 12] Cannot allocate memory
>>
>> Do you get this consistently, or randomly-but-recurring, or spotted
>> once/a few times and then gone?
>>
>>>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 26, in _query_xattr
>>>     cls.raise_oserr()
>>>   File "/opt/glusterfs/3.2.1/local/libexec/glusterfs/python/syncdaemon/libcxattr.py", line 16, in raise_oserr
>>>     raise OSError(errn, os.strerror(errn))
>>> OSError: [Errno 12] Cannot allocate memory
>>
>> If seen more than once, how much does the stack trace vary? Exactly the
>> same, or not exactly but crashing in the same function (just on a
>> different code path), or not exactly but at least in the libcxattr
>> module, or quite different?
>>
>> What Python version do you use? If you use Python 2.4.* with external
>> ctypes, then what source have you taken ctypes from, and what version?
>>
>> Thanks,
>> Csaba
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>
>
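
P.S. To illustrate why a dropped connection shows up this way in your log:
repce.py's recv() is simply pickle.load() on the stream connecting the gsyncd
processes, so end-of-file on that stream surfaces as EOFError, and non-pickle
bytes arriving on it surface as UnpicklingError ("invalid load key"). A
minimal standalone sketch, plain Python and nothing gsyncd-specific:

    import io
    import pickle

    # Peer closed the stream before a complete message arrived: EOFError,
    # the same failure mode as repce.py's recv() on a dropped connection.
    try:
        pickle.load(io.BytesIO(b""))
    except EOFError as e:
        print("EOFError:", e)

    # Bytes that are not pickle data (stray output on the transport, for
    # example) come out as UnpicklingError with an "invalid load key".
    try:
        pickle.load(io.BytesIO(b"\xffgarbage"))
    except pickle.UnpicklingError as e:
        print("UnpicklingError:", e)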
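
For context on the earlier ENOMEM trace: libcxattr queries xattrs via ctypes
and converts a failing libc call into OSError(errno), which is why the ctypes
version matters. The following is only a rough sketch of that general pattern,
not the actual libcxattr.py code; it assumes a ctypes with get_errno() (Python
2.6+ semantics), and the function name and buffer size are illustrative:

    import ctypes
    import ctypes.util
    import os

    # Load libc so that errno set by the call is captured by ctypes.
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)

    def query_xattr(path, name, bufsize=4096):
        buf = ctypes.create_string_buffer(bufsize)
        # ssize_t lgetxattr(const char *path, const char *name,
        #                   void *value, size_t size)
        ret = libc.lgetxattr(path.encode(), name.encode(), buf, bufsize)
        if ret == -1:
            errn = ctypes.get_errno()
            # If errno is not captured correctly (as can happen with an odd
            # external ctypes build), the errno reported here, e.g. ENOMEM,
            # may not reflect the real failure.
            raise OSError(errn, os.strerror(errn))
        return buf.raw[:ret]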