Re: Re: If I have 5 GNBD server?

Benjamin Marzinski wrote:

On Tue, Aug 30, 2005 at 08:41:12AM +0700, Fajar A. Nugraha wrote:
Benjamin Marzinski wrote:

If the gnbds are exported uncached (the default), the client will fail back IO if it can no longer talk to the server after a specified timeout.
What is the default timeout anyway, and how can I set it?
The last time I tested the gnbd_import timeout was on a development version (DEVEL.1104982050), and after more than 30 minutes the client was still trying to reconnect.

The default timeout is 1 minute. It is tunable with the -t option (see the
gnbd man page). However, you only time out if you export the device in
uncached mode.
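For concreteness, here is a sketch of an export/import pair with an explicit timeout, based on the option syntax in the man page excerpts quoted below. These commands need a running GNBD cluster, so they can't be run standalone; the device path and export name are hypothetical.

```shell
# On the server: export a device uncached (the default) with an
# explicit 120-second timeout via -t. /dev/vg0/lv0 and the export
# name "shared0" are placeholders.
gnbd_export -d /dev/vg0/lv0 -e shared0 -t 120

# On the client: import all devices exported by the server, then
# list them to see what was picked up.
gnbd_import -i server1
gnbd_import -l
```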

I found something interesting:
gnbd_import man page (no mention of a timeout):
      -t server
             Fence from Server. Specify a server for the IO fence (only
             used with the -s option).

gnbd_export man page:
      -t [seconds]
             Timeout. Set the exported GNBD to timeout mode. This option
             is used with -p. This is the default for uncached GNBDs.

Isn't it the client that has to determine whether it's in wait mode or timeout mode? How is the parameter from gnbd_export passed to gnbd_import?

I tested it today with gnbd 1.00.00 by adding an extra IP address to the server:

1. gnbd_export on the server (IP address 192.168.17.193, cluster member, no extra parameters, so it should be exported as an uncached gnbd in timeout mode)
2. gnbd_import on the client (a member of a different cluster)
3. mount the imported gnbd
4. remove the IP address 192.168.17.193 from the server
5. run df -k on the client

and I got these in the client's syslog:

Aug 31 09:55:58 node1 gnbd_recvd[9792]: client lost connection with 192.168.17.193 : Interrupted system call
Aug 31 09:55:58 node1 gnbd_recvd[9792]: reconnecting
Aug 31 09:55:58 node1 kernel: gnbd (pid 9792: gnbd_recvd) got signal 1
Aug 31 09:55:58 node1 kernel: gnbd2: Receive control failed (result -4)
Aug 31 09:55:58 node1 kernel: gnbd2: shutting down socket
Aug 31 09:55:58 node1 kernel: exitting GNBD_DO_IT ioctl
Aug 31 09:56:03 node1 gnbd_monitor[9781]: ERROR [gnbd_monitor.c:486] server Dè¯ is not a cluster member, cannot fence.
Aug 31 09:56:08 node1 gnbd_monitor[9781]: ERROR [gnbd_monitor.c:486] server Dè¯ is not a cluster member, cannot fence.
Aug 31 09:56:08 node1 gnbd_recvd[9792]: ERROR [gnbd_recvd.c:213] cannot connect to server 192.168.17.193 (-1) : Interrupted system call
Aug 31 09:56:08 node1 gnbd_recvd[9792]: reconnecting
Aug 31 09:56:13 node1 gnbd_monitor[9781]: ERROR [gnbd_monitor.c:486] server Dè¯ is not a cluster member, cannot fence.
Aug 31 09:56:13 node1 gnbd_recvd[9792]: ERROR [gnbd_recvd.c:213] cannot connect to server 192.168.17.193 (-1) : Interrupted system call
Aug 31 09:56:13 node1 gnbd_recvd[9792]: reconnecting

And it goes on, and on, and on :) After ten minutes I added the IP address back to the server, and these appeared in syslog:
Aug 31 10:06:13 node1 gnbd_recvd[9792]: reconnecting
Aug 31 10:06:16 node1 kernel: resending requests

So it looks like gnbd runs in wait mode by default, and after it reconnects the kernel automatically resends the pending requests without needing dm-multipath.
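For reference, the failure injection in the test above can be reproduced roughly as follows. The interface name (eth0), the /24 prefix, the device path, and the mount point are assumptions from my setup; the commands need a real GNBD server and client, so this is only a sketch.

```shell
# Server: bring up the extra address and export a device
# (no -c/-t flags, so it should default to uncached/timeout mode).
ip addr add 192.168.17.193/24 dev eth0
gnbd_export -d /dev/vg0/lv0 -e shared0

# Client: import from the extra address and mount.
gnbd_import -i 192.168.17.193
mount /dev/gnbd/shared0 /mnt/gnbd

# Simulate server failure: drop the address, then trigger IO on the
# client and watch its syslog for the reconnect messages.
ip addr del 192.168.17.193/24 dev eth0     # on the server
df -k /mnt/gnbd                            # on the client

# Recovery: add the address back; the client should log
# "reconnecting" followed by "resending requests".
ip addr add 192.168.17.193/24 dev eth0     # on the server
```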

Is my setup incorrect, or is this how it's supposed to work?

Regards,

Fajar

--

Linux-cluster@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/linux-cluster
