Re: Bitrot stub forget()

FNU Raghavendra Manjunath <rabhat@xxxxxxxxxx> · Tue, 16 Feb 2016 23:38:24 -0500

Venky,
Yes. You are right. We should not remove the quarantine entry in forget. 

We have to remove it upon getting -ve lookups in bit-rot-stub and upon getting an unlink.

I have attached a patch for it. 

Unfortunately rfc.sh is failing for me with the below error.


ssh: connect to host git.gluster.com port 22: Connection timed out
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists."


Regards,
Raghavendra


On Tue, Feb 16, 2016 at 10:53 AM, Venky Shankar <vshankar@xxxxxxxxxx> wrote:
Hey Raghu,



Bitrot stub inode forget implementation (br_stub_forget()) deletes the bad object

marker (under quarantine directory) if present. This looks incorrect as ->forget()

can be trigerred when inode table LRU size exceeeds configured limit - check bug

#1308961 which tracks this issue. I recall that protocol/server calls inode_forget()

on negative lookup (that might not invoke ->forget() though) and that's the reason

why br_stub_forget() has this code.



So, would it make sense to purge bad object marker just in lookup()? There might be

a need to do the same in unlink() in case the object was removed by the client.



Thoughts?



Thanks,



                Venky



From a0cc49172df24e263e0db25c53b57f58c19d2cab Mon Sep 17 00:00:00 2001
From: Raghavendra Bhat <raghavendra@xxxxxxxxxx>
Date: Tue, 16 Feb 2016 20:22:36 -0500
Subject: [PATCH] features/bitrot: do not remove the quarantine handle in
 forget

If an object is marked as bad, then an entry is corresponding to the
bad object is created in the .glusterfs/quarantine directory to help
scrub status. The entry name is the gfid of the corrupted object.
The quarantine handle is removed in below 2 cases.

1) When protocol/server revceives the -ve lookup on an entry whose inode
   is there in the inode table (it can happen when the corrupted object
   is deleted directly from the backend for recovery purpose) it sends a
   forget on the inode and bit-rot-stub removes the quarantine handle in
   upon getting the forget.
   refer to the below commit
   f853ed9c61bf65cb39f859470a8ffe8973818868:
   http://review.gluster.org/12743)

2) When bit-rot-stub itself realizes that lookup on a corrupted object
   has failed with ENOENT.

But with step1, there is a problem when the bit-rot-stub receives forget
due to lru limit exceeding in the inode table. In such cases, though the
corrupted object is not deleted (either from the mount point or from the
backend), the handle in the quarantine directory is removed and that object
is not shown in the bad objects list in the scrub status command.

So it is better to follow only 2nd step (i.e. bit-rot-stub removing the handle
from the quarantine directory in -ve lookups). Also the handle has to be removed
when a corrupted object is unlinked from the mount point itself.

Change-Id: Ibc3bbaf4bc8a5f8986085e87b729ab912cbf8cf9
Signed-off-by: Raghavendra Bhat <raghavendra@xxxxxxxxxx>
---
 xlators/features/bit-rot/src/stub/bit-rot-stub.c | 103 +++++++++++++++++++++--
 1 file changed, 95 insertions(+), 8 deletions(-)

diff --git a/xlators/features/bit-rot/src/stub/bit-rot-stub.c b/xlators/features/bit-rot/src/stub/bit-rot-stub.c
index 9c0f622..920aa7c 100644
--- a/xlators/features/bit-rot/src/stub/bit-rot-stub.c
+++ b/xlators/features/bit-rot/src/stub/bit-rot-stub.c
@@ -2841,19 +2841,46 @@ unwind:
 
 /** {{{ */
 
-/* forget() */
+/* unlink() */
 
 int
-br_stub_forget (xlator_t *this, inode_t *inode)
+br_stub_unlink_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
+                   int32_t op_ret, int32_t op_errno, struct iatt *preparent,
+                   struct iatt *postparent, dict_t *xdata)
 {
-        uint64_t ctx_addr = 0;
-        br_stub_inode_ctx_t *ctx = NULL;
+        br_stub_local_t     *local    = NULL;
+        inode_t             *inode    = NULL;
+        uint64_t             ctx_addr = 0;
+        br_stub_inode_ctx_t *ctx      = NULL;
+        int32_t              ret      = -1;
 
-        inode_ctx_del (inode, this, &ctx_addr);
-        if (!ctx_addr)
-                return 0;
+        if (op_ret < 0)
+                goto unwind;
 
-        ctx = (br_stub_inode_ctx_t *) (long) ctx_addr;
+        local = frame->local;
+        frame->local = NULL;
+
+        inode = local->u.context.inode;
+
+        ret = br_stub_get_inode_ctx (this, inode, &ctx_addr);
+        if (ret) {
+                /**
+                 * If the inode is bad AND context is not there, then there
+                 * is a possibility of the gfid of the object being listed
+                 * in the quarantine directory and will be shown in the
+                 * bad objects list. So continuing with the fop with a
+                 * warning log. The entry from the quarantine directory
+                 * has to be removed manually. Its not a good idea to fail
+                 * the fop, as the object has already been deleted.
+                 */
+                gf_msg (this->name, GF_LOG_WARNING, 0,
+                        BRS_MSG_GET_INODE_CONTEXT_FAILED,
+                        "failed to get the context for the inode %s",
+                        uuid_utoa (inode->gfid));
+                goto unwind;
+        }
+
+        ctx = (br_stub_inode_ctx_t *)(long)ctx_addr;
 
         LOCK (&inode->lock);
         {
@@ -2868,6 +2895,65 @@ br_stub_forget (xlator_t *this, inode_t *inode)
         }
         UNLOCK (&inode->lock);
 
+unwind:
+        STACK_UNWIND_STRICT (unlink, frame, op_ret, op_errno, preparent, postparent, xdata);
+        br_stub_cleanup_local (local);
+        br_stub_dealloc_local (local);
+        return 0;
+}
+
+int
+br_stub_unlink (call_frame_t *frame, xlator_t *this, loc_t *loc, int flag,
+                dict_t *xdata)
+{
+        br_stub_local_t *local = NULL;
+        int32_t          op_ret = -1;
+        int32_t          op_errno = 0;
+
+        local = br_stub_alloc_local (this);
+        if (!local) {
+                op_ret = -1;
+                op_errno = ENOMEM;
+                gf_msg (this->name, GF_LOG_ERROR, ENOMEM, BRS_MSG_NO_MEMORY,
+                        "failed to allocate memory for local (path: %s, gfid: %s)",
+                        loc->path, uuid_utoa (loc->inode->gfid));
+                goto unwind;
+        }
+
+        br_stub_fill_local (local, NULL, NULL, loc->inode,
+                            loc->inode->gfid,
+                            BR_STUB_NO_VERSIONING, 0);
+
+        frame->local = local;
+
+        STACK_WIND (frame, br_stub_unlink_cbk, FIRST_CHILD (this),
+                    FIRST_CHILD (this)->fops->unlink, loc, flag, xdata);
+        return 0;
+
+unwind:
+        STACK_UNWIND_STRICT (unlink, frame, op_ret, op_errno, NULL, NULL, NULL);
+        return 0;
+}
+
+
+/** }}} */
+
+/** {{{ */
+
+/* forget() */
+
+int
+br_stub_forget (xlator_t *this, inode_t *inode)
+{
+        uint64_t ctx_addr = 0;
+        br_stub_inode_ctx_t *ctx = NULL;
+
+        inode_ctx_del (inode, this, &ctx_addr);
+        if (!ctx_addr)
+                return 0;
+
+        ctx = (br_stub_inode_ctx_t *) (long) ctx_addr;
+
         GF_FREE (ctx);
 
         return 0;
@@ -3133,6 +3219,7 @@ struct xlator_fops fops = {
         .setxattr  = br_stub_setxattr,
         .opendir   = br_stub_opendir,
         .readdir   = br_stub_readdir,
+        .unlink    = br_stub_unlink,
 };
 
 struct xlator_cbks cbks = {
-- 
2.5.0

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel