Re: Two annoying bugs in 3.2.5

On 12/17/2011 12:40 AM, Emmanuel Dreyfus wrote:
Pranith Kumar K<pranithk@xxxxxxxxxxx>  wrote:

2) When using AFR, if a peer goes down, processes that have I/O pending
will see an error. Just retrying the same operation works, but that is
a bit frustrating.

Emmanuel,
      Could you give the test case for the AFR issue?
It is quite bold, I have not yet narrowed it down to something simple.
My test case is building NetBSD. You can grab the tarballs here:
http://ftp.fr.netbsd.org/pub/NetBSD/NetBSD-5.1/source/sets/src.tgz
http://ftp.fr.netbsd.org/pub/NetBSD/NetBSD-5.1/source/sets/sharesrc.tgz
http://ftp.fr.netbsd.org/pub/NetBSD/NetBSD-5.1/source/sets/syssrc.tgz
http://ftp.fr.netbsd.org/pub/NetBSD/NetBSD-5.1/source/sets/gnusrc.tgz

Unpack, then cd usr/src && ./build.sh -U release
I wait for the build to actually start, then pkill glusterfsd on one
replica, and the build stops because of an I/O error.

Emmanuel,
I think I will get a chance to take a closer look at this issue next week, if you are willing to wait. I suspect the following code. Could you make the change below and check whether this is the issue you are hitting? Could you also send me the client logs from the next run?

         struct iatt     *buf = NULL;
         struct iatt     *postparent = NULL;
         dict_t          **xattr = NULL;
+        afr_private_t   *priv = NULL;

         GF_ASSERT (local);
+        priv = this->private;

         buf = &local->cont.lookup.buf;
         postparent = &local->cont.lookup.postparent;
@@ -787,6 +789,9 @@ afr_lookup_build_response_params (afr_local_t *local, xlator_t *this)
         read_child = afr_read_child (this, local->cont.lookup.inode);
         gf_log (this->name, GF_LOG_DEBUG, "Building lookup response from %d",
                 read_child);
+        GF_ASSERT (afr_is_child_present (local->cont.lookup.child_success,
+                                         priv->child_count, read_child));
+        GF_ASSERT (local->cont.lookup.sources[read_child]);
         //honor the xattr set by data-self-heal
         if (!*xattr)
                 *xattr = dict_ref (local->cont.lookup.xattrs[read_child]);
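
To spell out what the two new asserts are meant to catch: the lookup response should only be built from a child that actually answered the lookup and that is marked as a source. Below is a minimal standalone sketch of that check, not the real AFR code. The names child_is_present and read_child_is_usable are hypothetical, and the layout of the arrays (a success list padded with -1, a sources array of 0/1 flags indexed by child) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the check the patch calls
 * afr_is_child_present (): success_children is assumed to hold the
 * indices of the children that answered, padded with -1 after the
 * last valid entry. */
bool
child_is_present (const int32_t *success_children, int32_t child_count,
                  int32_t child)
{
        int32_t i = 0;

        assert (child >= 0 && child < child_count);

        for (i = 0; i < child_count; i++) {
                if (success_children[i] == -1)
                        break;          /* end of the success list */
                if (success_children[i] == child)
                        return true;
        }
        return false;
}

/* Both conditions the new asserts demand of read_child: it answered
 * the lookup, and it is flagged as a source (assumed 0/1 flags). */
bool
read_child_is_usable (const int32_t *success_children,
                      const int32_t *sources, int32_t child_count,
                      int32_t read_child)
{
        return child_is_present (success_children, child_count, read_child)
               && sources[read_child] != 0;
}
```

For a two-way replica where child 0 has just been killed, the success list would be {1, -1} under these assumptions. If read_child is still 0 at that point, read_child_is_usable () returns false, which is the situation the asserts in the patch would flag.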

Pranith


