Re: Re: Timeout settings and self-healing ? (WAS: HA failover test unsuccessful (inaccessible mountpoint))


 



Guido,
Can you paste the server and client spec files again?
(They have been deleted from the pastebin.)
Make sure you are using unify on the client side and have set transport-timeout
to 10 seconds.
If possible, try to reproduce the problem you are seeing with a minimal
spec file.
Thanks
Krishna

On Sat, Apr 26, 2008 at 4:36 AM, Amar S. Tumballi <amar@xxxxxxxxxxxxx> wrote:
>
>
> On Wed, Apr 23, 2008 at 3:47 AM, Guido Smit <guido@xxxxxxxxx> wrote:
> > Krishna,
> >
> > I did the test. I killed glusterfsd on one server.
> > All tests (ls, df, cp) worked as they should. I didn't even notice any
> > difference. Unplugging the cable, however, blocked all operations, and
> > finally, after a few minutes,
> > the transport endpoint message appeared.
> >
> >
> >
> >
> The problem with TCP/IP is that when you unplug the cable, no message is
> delivered to the application's poll() on the network socket. The driver
> internally tries to reconnect, and only after a long time (around 10+ minutes
> when we tested) do we get a message saying there is no route to host. But when
> an application dies on the server, or there is a shutdown, the connected nodes
> get a notification, so everything goes smoothly. Hence the delay when the
> network cable is unplugged.
>
> We came up with a workaround for managing this delay: the
> 'transport-timeout' option, which times out each request after a certain time.
> The default is now 108 seconds. We kept it this high because a few
> applications that use mandatory locks (which block the write until a lock is
> freed) can easily take more than a minute to release the locks. Users can
> set 'transport-timeout' in the protocol/client volume, so they
> can tune it to the I/O time of their applications.
>
> In our test setups, requests timed out exactly after the configured
> transport-timeout, every time. So we could not reproduce the issue of
> freezing indefinitely.
>
>
> Regards,
> Amar
>
>
>
> --
> Amar Tumballi
>  Gluster/GlusterFS Hacker
> [bulde on #gluster/irc.gnu.org]
> http://www.zresearch.com - Commoditizing Super Storage!
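
The per-request timeout idea described above can be illustrated with a small
sketch. This is not GlusterFS code: it is a plain Python socket example, and
the names silent_server, request_with_timeout and demo are made up for
illustration. A local server accepts a connection and then stays silent, which
only approximates a peer behind an unplugged cable (nothing arrives and no
close notification is seen). The client bounds its wait with a socket timeout
instead of blocking until the kernel gives up.

```python
import socket
import threading
import time


def silent_server(listener, stop):
    """Accept one connection and never reply (stand-in for an unreachable peer)."""
    conn, _ = listener.accept()
    stop.wait()   # keep the connection open without sending anything
    conn.close()


def request_with_timeout(address, payload, timeout_s):
    """Send payload and wait at most timeout_s seconds for a reply.

    Returns the reply bytes, or None if nothing arrived in time.
    """
    with socket.create_connection(address, timeout=timeout_s) as sock:
        sock.sendall(payload)
        try:
            return sock.recv(4096)
        except socket.timeout:
            # No data and no close notification: give up on this request.
            return None


def demo(timeout_s=0.5):
    """Run one request against the silent server; return (reply, seconds waited)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    stop = threading.Event()
    worker = threading.Thread(target=silent_server, args=(listener, stop))
    worker.start()

    started = time.monotonic()
    reply = request_with_timeout(listener.getsockname(), b"ping", timeout_s)
    elapsed = time.monotonic() - started

    stop.set()
    worker.join()
    listener.close()
    return reply, elapsed


if __name__ == "__main__":
    reply, elapsed = demo(0.5)
    print("reply:", reply, "waited: %.2fs" % elapsed)
```

Without the timeout, the recv() call would simply block for as long as the
peer stays silent. Choosing the value is the same trade-off as with
transport-timeout: too short and slow but healthy requests get cut off, too
long and a dead link stalls the caller.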



