Krishna,
Is that to say that it's a bug, or am I just using it wrong? Or do I just
have a knack for finding dodgy edge cases?
Is there a workaround?
I have just reconfigured my servers to do 2-process client-side AFR (i.e.
the traditional approach), and that works fine. But single-process
server-side AFR would be more efficient and would simplify my config somewhat.
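For reference, the client-side setup that works is along these lines (a
trimmed sketch rather than my exact file; the addresses and volume names
mirror the server configs quoted below, with each server simply exporting
its posix volume):

volume home1
type protocol/client
option transport-type tcp/client
option remote-host 192.168.0.1
option remote-subvolume home1
end-volume

volume home2
type protocol/client
option transport-type tcp/client
option remote-host 192.168.3.1
option remote-subvolume home2
end-volume

volume home
type cluster/afr
subvolumes home1 home2
end-volume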
Thanks.
Gordan
On Tue, 20 May 2008, Krishna Srinivas wrote:
In this setup, home1 sends its CHILD_UP event to the "server" xlator
instead of to the "home" afr xlator (and home2 is not up). This makes afr
think that none of its subvolumes are up. We will fix it to handle this
situation.
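To put that in terms of the config below: home1 is a direct child of both
the "home" afr volume and the "server" volume. Assuming the CHILD_UP
notification only reaches one parent, it is "server" rather than "home"
that hears about it:

volume home
type cluster/afr
# afr waits for CHILD_UP from home1 here...
subvolumes home1 home2
end-volume

volume server
type protocol/server
# ...but home1's CHILD_UP is delivered here instead
subvolumes home home1
end-volume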
Thanks
Krishna
On Tue, May 20, 2008 at 2:00 PM, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
This is with release 1.3.9.
Not much more that seems relevant turns up in the logs with -L DEBUG: just
DNS chatter and mentions that the 2nd server isn't talking (glusterfs is
switched off on it, because having it up causes the lock-up).
This gets logged when I try to cat ~/.bashrc:
2008-05-20 09:14:08 D [fuse-bridge.c:375:fuse_entry_cbk] glusterfs-fuse: 39: (34) /gordan/.bashrc => 60166157
2008-05-20 09:14:08 D [inode.c:577:__create_inode] fuse/inode: create inode(60166157)
2008-05-20 09:14:08 D [inode.c:367:__active_inode] fuse/inode: activating inode(60166157), lru=7/1024
2008-05-20 09:14:08 D [inode.c:367:__active_inode] fuse/inode: activating inode(60166157), lru=7/1024
2008-05-20 09:14:08 D [fuse-bridge.c:1517:fuse_open] glusterfs-fuse: 40: OPEN /gordan/.bashrc
2008-05-20 09:14:08 E [afr.c:1985:afr_selfheal] home: none of the children are up for locking, returning EIO
2008-05-20 09:14:08 E [fuse-bridge.c:692:fuse_fd_cbk] glusterfs-fuse: 40: (12) /gordan/.bashrc => -1 (5)
On the command line, I get back "Input/output error". I can ls the files,
but I cannot actually read them. This is with only the first server up. The
same happens whether I mount home.vol via fstab or via something like:
glusterfs -f /etc/glusterfs/home.vol /home
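The fstab equivalent, if I have the 1.3-era syntax right, being a line
along the lines of:

/etc/glusterfs/home.vol  /home  glusterfs  defaults  0  0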
I have also reduced the config (single-process, intended for servers) to a
bare minimum (removed the posix-locks layer) to get to the bottom of it,
but I still cannot get any reads to work:
volume home1
type storage/posix
option directory /gluster/home
end-volume

volume home2
type protocol/client
option transport-type tcp/client
option remote-host 192.168.3.1
option remote-subvolume home2
end-volume

volume home
type cluster/afr
option read-subvolume home1
subvolumes home1 home2
end-volume

volume server
type protocol/server
option transport-type tcp/server
subvolumes home home1
option auth.ip.home.allow 127.0.0.1,192.168.*
option auth.ip.home1.allow 127.0.0.1,192.168.*
end-volume
On a related note: if single-process mode is used, how does GlusterFS know
which volume to mount? For example, if it tries to mount the
protocol/client volume (home2), then obviously that won't work because the
2nd server is not up. If it mounts the protocol/server volume, is it trying
to mount home or home1? Or does it mount the outermost volume that _isn't_
a protocol/[client|server] (which is "home" in this case)?
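My guess is that it defaults to the top-most volume in the spec file. If
1.3.9 supports the --volume-name option (I believe it does, though
glusterfs --help would confirm), pinning the volume explicitly should at
least remove the ambiguity:

glusterfs -f /etc/glusterfs/home.vol --volume-name=home /home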
Thanks.
Gordan
On Tue, 20 May 2008 13:18:07 +0530, Krishna Srinivas
<krishna@xxxxxxxxxxxxx> wrote:
> Gordan,
>
> Which patch set is this? Can you run glusterfs server side with
> "-L DEBUG" and send the logs?
>
> Thanks
> Krishna
>
> On Tue, May 20, 2008 at 1:56 AM, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
>> Hi,
>>
>> I'm having rather major problems getting single-process AFR to work
>> between two servers. When both servers come up, the GlusterFS on both
>> locks up pretty solid. The processes that try to access the FS
>> (including ls) seem to get nowhere for a few minutes, and then
>> complete. But something gets stuck, and glusterfs cannot be killed
>> even with -9!
>>
>> Another worrying thing is that the fuse kernel module ends up holding
>> a reference count even after the glusterfs process gets killed
>> (sometimes killing the remote process that isn't locked up on its host
>> can break the locked-up operations and allow the local glusterfs
>> process to be killed). So fuse then cannot be unloaded.
>>
>> This error seems to come up in the logs all the time:
>> 2008-05-19 20:57:17 E [afr.c:1985:afr_selfheal] home: none of the children are up for locking, returning EIO
>> 2008-05-19 20:57:17 E [fuse-bridge.c:692:fuse_fd_cbk] glusterfs-fuse: 63: (12) /test => -1 (5)
>>
>> This implies some kind of locking issue, but the same error and
>> conditions also arise when the posix locking module is removed.
>>
>> The configs for the two servers are included below. They are almost
>> identical to the examples on the glusterfs wiki:
>>
>> http://www.gluster.org/docs/index.php/AFR_single_process
>>
>> What am I doing wrong? Have I run into another bug?
>>
>> Gordan
>>
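>> # config for server 1 (192.168.0.1, judging by the peer's remote-host):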
>> volume home1-store
>> type storage/posix
>> option directory /gluster/home
>> end-volume
>>
>> volume home1
>> type features/posix-locks
>> subvolumes home1-store
>> end-volume
>>
>> volume home2
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.3.1
>> option remote-subvolume home2
>> end-volume
>>
>> volume home
>> type cluster/afr
>> option read-subvolume home1
>> subvolumes home1 home2
>> end-volume
>>
>> volume server
>> type protocol/server
>> option transport-type tcp/server
>> subvolumes home home1
>> option auth.ip.home.allow 127.0.0.1
>> option auth.ip.home1.allow 192.168.*
>> end-volume
>>
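>> # config for server 2 (192.168.3.1, judging by the peer's remote-host):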
>> volume home2-store
>> type storage/posix
>> option directory /gluster/home
>> end-volume
>>
>> volume home2
>> type features/posix-locks
>> subvolumes home2-store
>> end-volume
>>
>> volume home1
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.0.1
>> option remote-subvolume home1
>> end-volume
>>
>> volume home
>> type cluster/afr
>> option read-subvolume home2
>> subvolumes home1 home2
>> end-volume
>>
>> volume server
>> type protocol/server
>> option transport-type tcp/server
>> subvolumes home home2
>> option auth.ip.home.allow 127.0.0.1
>> option auth.ip.home2.allow 192.168.*
>> end-volume
>>
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel