James,
It is planned for the later releases of 1.4. Let us wait for Avati's
reply regarding the timeframe.
Krishna

On Thu, Aug 28, 2008 at 7:03 PM, James E Warner <jwarner6@xxxxxxx> wrote:
> Thanks for the prompt reply. One final question: is the HA translator
> still planned for the upcoming 1.4 release, and if not, do you have a
> rough idea of which release it will go into?
>
> Thanks again,
>
> James Warner
> Computer Sciences Corporation
>
> From:    "Krishna Srinivas" <krishna@zresearch.com>
>          (sent by krishna.srinivas@gmail.com)
> To:      James E Warner/DEF/CSC@CSC
> cc:      gluster-devel@xxxxxxxxxx
> Date:    08/28/2008 01:03 AM
> Subject: Re: Server Side AFR gets transport endpoint is not connected
>
> On Thu, Aug 28, 2008 at 12:45 AM, James E Warner <jwarner6@xxxxxxx> wrote:
>>
>> Hi,
>>
>> I'm currently testing gluster to see if I can make it work for our HA
>> filesystem needs. In initial testing things seem to be very good,
>> especially with client-side AFR performing replication to our server
>> nodes. However, we would like to keep our client network free of
>> replication traffic, so I set up server-side AFR with three storage
>> bricks replicating data between themselves and round-robin DNS for
>> node failover. The round-robin DNS is working and the failover between
>> the nodes is kind of working, but if I pull the network cable on the
>> currently active server (the host that the glusterfs client is
>> connected to), the next filesystem operation (such as ls
>> /mnt/glusterfs) fails with a "transport endpoint is not connected"
>> error. Similarly, if I have a large copy operation in progress, the
>> copy will exit with a failure. All of the operations after that work
>> fine, and netstat shows that the client has failed over to the next
>> server in the list, but by that point the current filesystem operation
>> has already failed. Anyway, this leads me to a few questions:
>>
>> 0. Do my config files look OK, or does it look like I've configured
>> this thing incorrectly? :)
>>
>> 1. Is this the expected behavior or is this a bug? From reading the
>> mailing list I had the impression that on failure the operation would
>> be retried on the remaining IPs cached in the client's list, so I was
>> surprised that the operation failed. I think it is probably a bug, but
>> I could see an argument for how this might be considered normal
>> operation.
>
> That is the expected behavior.
>
>> 2. If this is expected behavior, is there any plan to change it in the
>> future, or is server-side AFR always expected to work this way?
>> I've seen references to round-robin DNS being an interim measure on
>> the mailing list, so I'm not sure if there is another translator in
>> the works or not. If there is something in the works, is it available
>> in the current glusterfs 1.4 snapshot releases, or is it planned for a
>> much later version?
>
> Yes, we plan to bring in an HA translator which will make this work
> fine.
>
>> 3. Can you think of any option that I might have missed that would
>> correct the problem and allow the currently running file operation to
>> succeed during a failover?
>>
>> 4. Once again, if this is as designed, can you explain the reason that
>> it works this way? As I said, I really expected it to fail over
>> transparently in much the same way that client-side AFR seems to, so I
>> was surprised that it didn't.
>
> If AFR is on the client side, it maintains connections to its
> subvolumes separately, so if one node fails it still has connections to
> the other subvolumes. However, if AFR is on the server side and that
> server goes down, the client cannot do anything about it. Once we bring
> the HA xlator into the picture, it sits on the client and can take care
> of seamless failover when the connection fails.
>
>> Since I hope that this is a bug, the configuration files and the
>> relevant sections of the client log are below. I have used this
>> configuration on the gluster 1.3.11 version and the latest snapshot
>> from August 27, 2008.
>>
>> Client Log Snippet:
>> ================
>>
>> 2008-08-27 12:53:34 D [fuse-bridge.c:839:fuse_err_cbk] glusterfs-fuse: 62:


_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel
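
For readers comparing the two layouts Krishna describes, a minimal
client-side AFR volume file might look like the sketch below. This is an
illustration only: the hostnames server1/server2 and the export name
"brick" are assumptions, not James's actual configuration (which is
truncated from the archive above).

    # client.vol -- sketch of client-side AFR, 1.3-style volfile syntax
    # (hostnames and export names are assumed for illustration)

    volume remote1
      type protocol/client
      option transport-type tcp/client
      # first storage brick
      option remote-host server1
      option remote-subvolume brick
    end-volume

    volume remote2
      type protocol/client
      option transport-type tcp/client
      # second storage brick
      option remote-host server2
      option remote-subvolume brick
    end-volume

    # AFR on the client holds a separate connection to each brick
    volume replicate
      type cluster/afr
      subvolumes remote1 remote2
    end-volume

With this layout the client keeps one protocol/client connection per
brick, so losing a server does not fail the operation in flight. With
server-side AFR the client volume file has only a single protocol/client
pointing at the round-robin DNS name, so whatever operation is in flight
when that one connection drops fails before the reconnect, which is the
behavior James observed and the gap the HA translator is meant to close.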