Re: about afr


Nicolas,

I do not completely understand the scenario.
But consider this case, where 2 servers are replicated with AFR under one client:
* While both servers are up, you open a file (and don't close it).
* You bring one of the servers down. The open fd now loses its context
for the downed server, i.e. no further operations go there, even if the
server comes back up.
* If the second server now goes down, further operations on the fd
fail completely and the application gets an error.

This scenario is expected.
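
To make the sequence above concrete, here is a rough toy model in Python. It is only an illustration of the behaviour described in this mail, not glusterfs code: the class ReplicatedFd, its methods and the server names are all made up for the example.

```python
class ReplicatedFd:
    """Toy model (hypothetical, not glusterfs code) of an fd opened
    on a volume replicated across two servers."""

    def __init__(self, servers):
        # The fd holds a context on every server that was up at open() time.
        self.contexts = set(servers)

    def server_down(self, server):
        # When a server goes down, the fd loses its context for that server.
        self.contexts.discard(server)

    def server_up(self, server):
        # A restarted server does not get the context of an already-open
        # fd back, so nothing changes for this fd.
        pass

    def write(self, data):
        # An operation goes to every server that still holds a context.
        if not self.contexts:
            raise OSError("no server holds a context for this fd")
        return sorted(self.contexts)


def run_scenario():
    fd = ReplicatedFd(["server1", "server2"])
    events = [fd.write(b"a")]      # both servers up
    fd.server_down("server1")
    fd.server_up("server1")        # comes back, but the fd context is gone
    events.append(fd.write(b"b"))  # only server2 is used
    fd.server_down("server2")
    try:
        fd.write(b"c")
    except OSError:
        events.append("error")     # the application sees an error, not a hang
    return events
```

In this model the last write fails with an error right away, which is the expected outcome; it never blocks.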

But glusterfs should not hang under any scenario, which is what you
say happens sometimes. You also say that glusterfs hangs even
without any performance translators. When it hangs, can you attach gdb
to the glusterfs client and send us the backtrace?

>> gdb -p <pid of glusterfs>
>> type "bt" at the gdb command prompt.
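
If typing at the gdb prompt is inconvenient, the same backtrace can probably be captured non-interactively. The sketch below is only a suggestion: the function names are invented for this example, and it assumes gdb is installed and that you have permission to attach to the process. It uses gdb's -batch and -ex options to run "bt" and exit.

```python
import subprocess


def gdb_backtrace_command(pid):
    # Build the gdb command line: attach to the pid, run "bt", then exit.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "bt"]


def capture_backtrace(pid):
    # Run gdb and return whatever it printed, ready to paste into a mail.
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Call capture_backtrace() with the pid of the hung glusterfs client and mail the returned text.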

You say that glusterfs hangs when you use only qemu on the glusterfs setup.
Does any other application hang?

Can you explain the steps to reproduce the glusterfs hang once again?
In the steps, clearly mention whether the server process is stopped or
the server machine is hard powered off.

Thanks
Krishna

On Tue, Feb 3, 2009 at 8:45 PM, nicolas prochazka
<prochazka.nicolas@xxxxxxxxx> wrote:
> Without performance translators, the result is the same.
> I will try with gdb as soon as possible.
> You say EBADFD is fine and AFR will try the operation on the other server. OK,
> I understand, but if I then stop that other server, gluster cannot fall back to
> the first one, which is in EBADFD.
> A lot of my problems come from here, I think, because with my two servers,
> I stop the first, restart it, wait, then stop the second and restart it, and
> everything is KO.
> If I only stop the first and test, then all is OK.
> Nicolas
>
> On Tue, Feb 3, 2009 at 3:50 PM, Krishna Srinivas <krishna@xxxxxxxxxxxxx>
> wrote:
>>
>> Nicolas,
>>
>> When you restart the server, log entries indicating EBADFD are fine; AFR
>> will try the operation on the other server. When you have the situation
>> where the glusterfs client hangs, can you attach gdb to glusterfs
>> and mail us the backtrace?
>>
>> gdb -p <pid of glusterfs>
>> type "bt" at the gdb command prompt.
>>
>> Just want to confirm that glusterfs has not blocked at a system call.
>> (as we have non blocking io now)
>>
>> Can you see if removing the performance translators helps? we can
>> narrow down to the problem translator in such case.
>>
>> Krishna
>>
>> On Tue, Feb 3, 2009 at 5:18 PM, nicolas prochazka
>> <prochazka.nicolas@xxxxxxxxx> wrote:
>> > ok,
>> > So now I know there are a few bugs:
>> >
>> > 1 - When I stop and restart a server, I get the EBADFD bug.
>> > 2 - When I stop a server:
>> >       - with --disable-direct-io-mode: my big image file becomes corrupt
>> >         (missing data ...)
>> >       - without --disable-direct-io-mode: my process hangs and the CPU load
>> >         grows a lot (per process)
>> >
>> > any ideas ?
>> >
>> > Regards,
>> > Nicolas Prochazka
>> >
>> >  On Tue, Feb 3, 2009 at 5:42 AM, Raghavendra G
>> > <raghavendra@xxxxxxxxxxxxx>
>> > wrote:
>> >>
>> >> Hi Nicolas,
>> >>
>> >> On Tue, Feb 3, 2009 at 12:01 AM, nicolas prochazka
>> >> <prochazka.nicolas@xxxxxxxxx> wrote:
>> >>>
>> >>> I inspected the log and found something interesting.
>> >>> Everything was OK,
>> >>> then I stopped 10.98.98.2 and restarted it:
>> >>>
>> >>> 2009-02-02 15:00:32 D [client-protocol.c:6498:notify]
>> >>> brick_10.98.98.2:
>> >>> got GF_EVENT_CHILD_UP
>> >>> 2009-02-02 15:00:32 D [socket.c:924:socket_connect] brick_10.98.98.2:
>> >>> connect () called on transport already connected
>> >>> 2009-02-02 15:00:32 N [client-protocol.c:5786:client_setvolume_cbk]
>> >>> brick_10.98.98.2: connection and handshake succeeded
>> >>> 2009-02-02 15:00:40 D [fuse-bridge.c:1945:fuse_statfs] glusterfs-fuse:
>> >>> 17399: STATFS
>> >>> 2009-02-02 15:00:40 D [fuse-bridge.c:368:fuse_entry_cbk]
>> >>> glusterfs-fuse:
>
>



