Re: Any review is appreciated. Analysis of gluster server_connection_cleanup leaking file flocks under frequent network disconnection

Hi all,

Finally, I figured out Gerrit and submitted a bug and patch.
I hope I have followed the correct procedure :-)

Bug description:
https://bugzilla.redhat.com/show_bug.cgi?id=1144672

Patch review request:
http://review.gluster.org/#/c/8787/

Please review this patch when you have time.
Thanks very much!

On Friday, September 19, 2014, Vijay Bellur <vbellur@xxxxxxxxxx> wrote:
Hi Jaden,

Thanks for your effort on this one.

A lot of GlusterFS developers, including me, are busy with various things over the next 2 weeks, so please expect some delays in our responses and reviews until then.

Cheers,
Vijay

On 09/19/2014 12:44 PM, Jaden Liang wrote:
Hi all,

Here is a patch for the file-flock leak on unclean disconnect in
gluster-3.4.5. I am totally new to the gluster development workflow and
still trying to understand how to submit this patch to Gerrit, so I want
to paste the patch here first to let the devel team know, and submit it
after I figure out Gerrit :-).

The major modification adds an id to distinguish different TCP connections
between the same client/server pair, so that a stale connection socket that
has not yet closed cannot be confused with a new connection.

diff --git a/rpc/rpc-lib/src/rpc-clnt.h b/rpc/rpc-lib/src/rpc-clnt.h
index 263d5f7..718308d 100644
--- a/rpc/rpc-lib/src/rpc-clnt.h
+++ b/rpc/rpc-lib/src/rpc-clnt.h
@@ -143,6 +143,7 @@ struct rpc_clnt_connection {
         struct timeval           last_sent;
         struct timeval           last_received;
         int32_t                  ping_started;
+        uint32_t                 clnt_conn_id;
 };
 typedef struct rpc_clnt_connection rpc_clnt_connection_t;

diff --git a/xlators/protocol/client/src/client-handshake.c b/xlators/protocol/client/src/client-handshake.c
index d2083e6..1c2fc2f 100644
--- a/xlators/protocol/client/src/client-handshake.c
+++ b/xlators/protocol/client/src/client-handshake.c
@@ -471,9 +471,10 @@ client_set_lk_version (xlator_t *this)
         conf = (clnt_conf_t *) this->private;
         req.lk_ver = client_get_lk_ver (conf);

-        ret = gf_asprintf (&req.uid, "%s-%s-%d",
+        ret = gf_asprintf (&req.uid, "%s-%s-%d-%u",
                            this->ctx->process_uuid, this->name,
-                           this->graph->id);
+                           this->graph->id,
+                           (conf->rpc) ? conf->rpc->conn.clnt_conn_id : 0);
         if (ret == -1)
                 goto err;

@@ -1549,13 +1550,22 @@ client_setvolume (xlator_t *this, struct rpc_clnt *rpc)
                 }
         }

+        /* For different connections between the same client/server pair, we
+         * use a different clnt_conn_id to identify each one. Otherwise, there
+         * is a chance that flocks are not released after an unclean
+         * disconnection. */
+        if (conf->rpc) {
+                conf->rpc->conn.clnt_conn_id = conf->clnt_conn_id++;
+        }
+
         /* With multiple graphs possible in the same process, we need a
            field to bring the uniqueness. Graph-ID should be enough to get the
            job done
         */
-        ret = gf_asprintf (&process_uuid_xl, "%s-%s-%d",
+        ret = gf_asprintf (&process_uuid_xl, "%s-%s-%d-%u",
                            this->ctx->process_uuid, this->name,
-                           this->graph->id);
+                           this->graph->id,
+                           (conf->rpc) ? conf->rpc->conn.clnt_conn_id : 0);
         if (-1 == ret) {
                 gf_log (this->name, GF_LOG_ERROR,
                         "asprintf failed while setting process_uuid");

diff --git a/xlators/protocol/client/src/client.c b/xlators/protocol/client/src/client.c
index ad95574..35fef49 100644
--- a/xlators/protocol/client/src/client.c
+++ b/xlators/protocol/client/src/client.c
@@ -2437,6 +2437,7 @@ init (xlator_t *this)
         conf->lk_version         = 1;
         conf->grace_timer        = NULL;
         conf->grace_timer_needed = _gf_true;
+        conf->clnt_conn_id       = 0;

         ret = client_init_grace_timer (this, this->options, conf);
         if (ret)

diff --git a/xlators/protocol/client/src/client.h b/xlators/protocol/client/src/client.h
index 0a27c09..dea90d1 100644
--- a/xlators/protocol/client/src/client.h
+++ b/xlators/protocol/client/src/client.h
@@ -116,6 +116,9 @@ typedef struct clnt_conf {
                                                 */
         gf_boolean_t           filter_o_direct; /* if set, filter O_DIRECT from
                                                    the flags list of open() */
+        uint32_t               clnt_conn_id; /* connection id for each connection
+                                                in process_uuid; starts at 0,
+                                                increased per new connection */
 } clnt_conf_t;

 typedef struct _client_fd_ctx {

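To illustrate what the patch changes, here is a small stand-alone sketch
(plain C, with made-up values for process_uuid, translator name, and graph
id) of how the identifier now differs between two successive connections
from the same client process:

#include <stdio.h>
#include <stdint.h>

int main (void)
{
        /* Made-up example values; in glusterfs these come from
         * this->ctx->process_uuid, this->name and this->graph->id. */
        const char *process_uuid = "host-000c29e93d20-8661-2014/09/13-11:02:26:995090";
        const char *name         = "vs_vol_rep2-client-2";
        int         graph_id     = 0;
        uint32_t    clnt_conn_id;
        char        uid[512];

        /* Each reconnect bumps clnt_conn_id, so the server sees a new id
         * and creates a fresh connection entry instead of increasing
         * bind_ref on the stale one. */
        for (clnt_conn_id = 0; clnt_conn_id < 2; clnt_conn_id++) {
                snprintf (uid, sizeof (uid), "%s-%s-%d-%u",
                          process_uuid, name, graph_id,
                          (unsigned int) clnt_conn_id);
                printf ("%s\n", uid);
        }
        return 0;
}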

On Wednesday, September 17, 2014, Jaden Liang <jaden1q84@xxxxxxxxx> wrote:

    Hi all,

    After several days of tracking, we finally pinpointed why glusterfs
    uncleanly detaches file flocks under frequent network disconnection.
    We are now working on a patch to submit. Here are the details of the
    issue. Any suggestions will be appreciated!

    First of all, as I mentioned in
    http://supercolony.gluster.org/pipermail/gluster-devel/2014-September/042233.html,
    this issue happens under frequent network disconnection.

    According to the sources, the server cleanup job is done in
    server_connection_cleanup(). When RPCSVC_EVENT_DISCONNECT happens, it
    comes here:

    int
    server_rpc_notify ()
    {
    ......
            case RPCSVC_EVENT_DISCONNECT:
    ......
                     if (!conf->lk_heal) {
                             server_conn_ref (conn);
                             server_connection_put (this, conn, &detached);
                             if (detached)
                                     server_connection_cleanup (this, conn,
                                                                INTERNAL_LOCKS |
                                                                POSIX_LOCKS);
                             server_conn_unref (conn);
    ......
    }

    server_connection_cleanup() is called only when the variable
    'detached' is true, and 'detached' is set by server_connection_put():
    server_connection_t*
    server_connection_put (xlator_t *this, server_connection_t *conn,
                            gf_boolean_t *detached)
    {
             server_conf_t       *conf = NULL;
             gf_boolean_t        unref = _gf_false;

             if (detached)
                     *detached = _gf_false;
             conf = this->private;
             pthread_mutex_lock (&conf->mutex);
             {
                     conn->bind_ref--;
                     if (!conn->bind_ref) {
                             list_del_init (&conn->list);
                             unref = _gf_true;
                     }
             }
             pthread_mutex_unlock (&conf->mutex);
             if (unref) {
                     gf_log (this->name, GF_LOG_INFO,
                             "Shutting down connection %s", conn->id);
                     if (detached)
                             *detached = _gf_true;
                     server_conn_unref (conn);
                     conn = NULL;
             }
             return conn;
    }

    'detached' is only set to _gf_true when 'conn->bind_ref' drops to 0.
    'conn->bind_ref' is managed in server_connection_get(): it is set to 1
    for a new connection, or increased when an existing entry with the
    same id is found.

    server_connection_t *
    server_connection_get (xlator_t *this, const char *id)
    {
    ......
                     list_for_each_entry (trav, &conf->conns, list) {
                             if (!strcmp (trav->id, id)) {
                                     conn = trav;
                                     conn->bind_ref++;
                                     goto unlock;
                             }
                     }
    ......
    }

    When the connection id is the same, 'conn->bind_ref' is increased.
    Therefore, the problem should be a mismatched reference increase and
    decrease. We added some logs to verify our guess.
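    To make the suspected mismatch concrete before reading the logs, here
    is a tiny stand-alone model of the refcounting (simplified,
    hypothetical types, not the real gluster structures):

    #include <stdio.h>

    /* Simplified model of a server-side connection table entry. */
    struct conn {
            int bind_ref;
    };

    /* A reconnect that reuses a known id only bumps the ref. */
    static void get (struct conn *c) { c->bind_ref++; }

    /* put() reports "detached" only when the ref drops to zero, and
     * only the detached path triggers lock cleanup. */
    static int put (struct conn *c)
    {
            return (--c->bind_ref == 0);
    }

    int main (void)
    {
            struct conn c = { .bind_ref = 1 };  /* 1st connection */

            get (&c);  /* reconnection reuses the same id: ref = 2 */

            /* Old server-side socket finally resets: ref drops to 1,
             * detached stays false, and the flocks are never cleaned. */
            int detached = put (&c);
            printf ("detached = %d, bind_ref = %d\n", detached, c.bind_ref);
            return 0;
    }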

    // 1st connection comes in, and there is no id
    'host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0'
    in the connection table, so 'conn->bind_ref' is set to 1.
    [2014-09-17 04:42:28.950693] D
    [server-helpers.c:712:server_connection_get] 0-vs_vol_rep2-server:
    server connection id:
    host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0,
    conn->bind_ref:1, found:0
    [2014-09-17 04:42:28.950717] D
    [server-handshake.c:430:server_setvolume] 0-vs_vol_rep2-server:
    Connected to
    host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
    [2014-09-17 04:42:28.950758] I
    [server-handshake.c:567:server_setvolume] 0-vs_vol_rep2-server:
    accepted client from
    host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
    (version: 3.4.5) (peer: host-000c29e93d20:1015)
    ......
    // Keep running several minutes.......
    ......
    // Network disconnects here. The TCP socket on the client side is
    closed by timeout, but the server-side socket stays connected. AT THIS
    MOMENT, the network is restored. The client side opens a new TCP
    connection JUST BEFORE the old socket on the server side is reset.
    Note that at this point there are 2 valid sockets on the server side.
    The new connection uses the same conn id
    'host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0'
    to look up in the connection table and increases 'conn->bind_ref' to 2.

    [2014-09-17 04:46:16.135066] D
    [server-helpers.c:712:server_connection_get] 0-vs_vol_rep2-server:
    server connection id:
    host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0,
    conn->bind_ref:2, found:1 // HERE IT IS, ref increase to 2!!!
    [2014-09-17 04:46:16.135113] D
    [server-handshake.c:430:server_setvolume] 0-vs_vol_rep2-server:
    Connected to
    host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
    [2014-09-17 04:46:16.135157] I
    [server-handshake.c:567:server_setvolume] 0-vs_vol_rep2-server:
    accepted client from
    host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
    (version: 3.4.5) (peer: host-000c29e93d20:1018)

    // After 13 seconds, the old connection is reset, decreasing
    'conn->bind_ref' to 1.

    [2014-09-17 04:46:28.688780] W
    [socket.c:2121:__socket_proto_state_machine]
    0-tcp.vs_vol_rep2-server: ret = -1, error: Connection reset by peer,
    peer (host-000c29e93d20:1015)
    [2014-09-17 04:46:28.688790] I [socket.c:2274:socket_event_handler]
    0-transport: socket_event_poll_in failed, ret=-1.
    [2014-09-17 04:46:28.688797] D [socket.c:2281:socket_event_handler]
    0-transport: disconnecting now
    [2014-09-17 04:46:28.688831] I [server.c:762:server_rpc_notify]
    0-vs_vol_rep2-server: disconnecting connectionfrom
    host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0(host-000c29e93d20:1015)
    [2014-09-17 04:46:28.688861] D
    [server-helpers.c:744:server_connection_put] 0-vs_vol_rep2-server:
    conn->bind_ref:1

    In our production environment, some flocks were held on the 1st
    connection. According to the logs, there is no way to clean up the
    flocks of the 1st connection. And on the 2nd, new connection, the
    client side cannot take the flocks again.
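    As a side note, what the application sees is just an ordinary held
    lock. A minimal sketch of that view (plain POSIX flock on an assumed
    mount path, nothing gluster-specific):

    #include <stdio.h>
    #include <string.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <sys/file.h>

    int main (void)
    {
            /* Hypothetical path on a glusterfs mount. */
            int fd = open ("/mnt/gluster/testfile", O_RDWR | O_CREAT, 0644);
            if (fd < 0) {
                    perror ("open");
                    return 1;
            }
            /* If the server still accounts the old connection's lock,
             * this fails exactly as if another process held it. */
            if (flock (fd, LOCK_EX | LOCK_NB) < 0 && errno == EWOULDBLOCK)
                    fprintf (stderr, "flock: %s (lock held by leaked "
                             "connection?)\n", strerror (errno));
            return 0;
    }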

    Therefore, we think the root cause is that different connections use
    the same conn id. The conn id is assembled in client_setvolume():

    ret = gf_asprintf (&process_uuid_xl, "%s-%s-%d",
                                this->ctx->process_uuid, this->name,
                                this->graph->id);

    The conn id contains 3 parts (a breakdown of a real id from the logs
    follows this list):
    this->ctx->process_uuid: hostname + pid + startup timestamp
    this->name: translator name
    this->graph->id: graph id
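
    For example, the id seen in the logs above breaks down (by our
    reading) as:

    host-000c29e93d20-8661-2014/09/13-11:02:26:995090  -> process_uuid
    vs_vol_rep2-client-2                               -> translator name
    0                                                  -> graph id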

    Apparently the conn id stays the same unless the client side restarts.
    So when the network disconnects, there is some chance that the socket
    on the client side times out while the one on the server side stays
    alive. If the network is restored at that moment and the client
    reconnects before the server's old socket is reset, the file flocks
    of the old connection are left uncleaned.

    That is our full analysis of this flock leak issue. We are now working
    on the patch. We hope someone can review it when it is finished.

    Any other comments are appreciated. Thank you!



_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


