Re: Fedora upgrade to f24 installed 3.8.0 client and broke mounting

On 06/25/2016 01:19 AM, Vijay Bellur wrote:
On 06/24/2016 02:12 PM, Alastair Neil wrote:
I upgraded my Fedora 23 system to F24 a couple of days ago, and now I am
unable to mount my Gluster cluster.

The update installed:

glusterfs-3.8.0-1.fc24.x86_64
glusterfs-libs-3.8.0-1.fc24.x86_64
glusterfs-fuse-3.8.0-1.fc24.x86_64
glusterfs-client-xlators-3.8.0-1.fc24.x86_64

The Gluster servers are running 3.7.11.

The volume is replica 3

I see these errors in the mount log:

    [2016-06-24 17:55:34.016462] I [MSGID: 100030]
    [glusterfsd.c:2408:main] 0-/usr/sbin/glusterfs: Started running
    /usr/sbin/glusterfs version 3.8.0 (args: /usr/sbin/glusterfs
    --volfile-server=gluster1 --volfile-id=homes /mnt/homes)
    [2016-06-24 17:55:34.094345] I [MSGID: 101190]
    [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
    thread with index 1
    [2016-06-24 17:55:34.240135] I [MSGID: 101190]
    [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
    thread with index 2
    [2016-06-24 17:55:34.240130] I [MSGID: 101190]
    [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
    thread with index 4
    [2016-06-24 17:55:34.240130] I [MSGID: 101190]
    [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
    thread with index 3
    [2016-06-24 17:55:34.241499] I [MSGID: 114020]
    [client.c:2356:notify] 0-homes-client-2: parent translators are
    ready, attempting connect on transport
    [2016-06-24 17:55:34.249172] I [MSGID: 114020]
    [client.c:2356:notify] 0-homes-client-5: parent translators are
    ready, attempting connect on transport
    [2016-06-24 17:55:34.250186] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
    0-homes-client-2: changing port to 49171 (from 0)
    [2016-06-24 17:55:34.253347] I [MSGID: 114020]
    [client.c:2356:notify] 0-homes-client-6: parent translators are
    ready, attempting connect on transport
    [2016-06-24 17:55:34.254213] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
    0-homes-client-5: changing port to 49154 (from 0)
    [2016-06-24 17:55:34.255115] I [MSGID: 114057]
    [client-handshake.c:1441:select_server_supported_programs]
    0-homes-client-2: Using Program GlusterFS 3.3, Num (1298437),
    Version (330)
    [2016-06-24 17:55:34.255861] W [MSGID: 114007]
    [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-2:
    failed to find key 'child_up' in the options
    [2016-06-24 17:55:34.259097] I [MSGID: 114057]
    [client-handshake.c:1441:select_server_supported_programs]
    0-homes-client-5: Using Program GlusterFS 3.3, Num (1298437),
    Version (330)
    Final graph:
+------------------------------------------------------------------------------+
      1: volume homes-client-2
      2:     type protocol/client
      3:     option clnt-lk-version 1
      4:     option volfile-checksum 0
      5:     option volfile-key homes
      6:     option client-version 3.8.0
      7:     option process-uuid
    Island-29185-2016/06/24-17:55:34:10054-homes-client-2-0-0
      8:     option fops-version 1298437
      9:     option ping-timeout 20
     10:     option remote-host gluster-2
     11:     option remote-subvolume /export/brick2/home
     12:     option transport-type socket
     13:     option event-threads 4
     14:     option send-gids true
     15: end-volume
     16:
     17: volume homes-client-5
     18:     type protocol/client
     19:     option clnt-lk-version 1
     20:     option volfile-checksum 0
     21:     option volfile-key homes
     22:     option client-version 3.8.0
     23:     option process-uuid
    Island-29185-2016/06/24-17:55:34:10054-homes-client-5-0-0
     24:     option fops-version 1298437
     25:     option ping-timeout 20
     26:     option remote-host gluster1.vsnet.gmu.edu
     27:     option remote-subvolume /export/brick2/home
     28:     option transport-type socket
     29:     option event-threads 4
     30:     option send-gids true
     31: end-volume
     32:
     33: volume homes-client-6
     34:     type protocol/client
     35:     option ping-timeout 20
     36:     option remote-host gluster0
     37:     option remote-subvolume /export/brick2/home
     38:     option transport-type socket
     39:     option event-threads 4
     40:     option send-gids true
     41: end-volume
     42:
     43: volume homes-replicate-0
     44:     type cluster/replicate
     45:     option background-self-heal-count 20
     46:     option metadata-self-heal on
     47:     option data-self-heal off
     48:     option entry-self-heal on
     49:     option data-self-heal-window-size 8
     50:     option data-self-heal-algorithm diff
     51:     option eager-lock on
     52:     option quorum-type auto
     53:     option self-heal-readdir-size 64KB
     54:     subvolumes homes-client-2 homes-client-5 homes-client-6
     55: end-volume
     56:
     57: volume homes-dht
     58:     type cluster/distribute
     59:     option min-free-disk 5%
     60:     option rebalance-stats on
     61:     option readdir-optimize on
     62:     subvolumes homes-replicate-0
     63: end-volume
     64:
     65: volume homes-read-ahead
     66:     type performance/read-ahead
     67:     subvolumes homes-dht
     68: end-volume
     69:
     70: volume homes-io-cache
     71:     type performance/io-cache
     72:     subvolumes homes-read-ahead
     73: end-volume
     74:
     75: volume homes-quick-read
     76:     type performance/quick-read
     77:     subvolumes homes-io-cache
     78: end-volume
     79:
     80: volume homes-open-behind
     81:     type performance/open-behind
     82:     subvolumes homes-quick-read
     83: end-volume
     84:
     85: volume homes-md-cache
     86:     type performance/md-cache
     87:     subvolumes homes-open-behind
     88: end-volume
     89:
     90: volume homes
     91:     type debug/io-stats
     92:     option log-level INFO
     93:     option latency-measurement off
     94:     option count-fop-hits on
     95:     subvolumes homes-md-cache
     96: end-volume
     97:
     98: volume meta-autoload
     99:     type meta
    100:     subvolumes homes
    101: end-volume
    102:
+------------------------------------------------------------------------------+
    [2016-06-24 17:55:34.261219] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
    0-homes-client-6: changing port to 49153 (from 0)
    [2016-06-24 17:55:34.266096] I [MSGID: 114057]
    [client-handshake.c:1441:select_server_supported_programs]
    0-homes-client-6: Using Program GlusterFS 3.3, Num (1298437),
    Version (330)
    [2016-06-24 17:55:34.266905] W [MSGID: 114007]
    [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-6:
    failed to find key 'child_up' in the options
    [2016-06-24 17:55:34.273618] W [MSGID: 114007]
    [client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-5:
    failed to find key 'child_up' in the options




I checked the release notes for 3.8.0 but I did not see any caveats or
compatibility warnings.

Anyone else seeing issues with 3.8 clients mounting 3.7 volumes?


Seems like it is due to this commit:

commit 2bfdc30e0e7fba6f97d8829b2618a1c5907dc404
Author: Avra Sengupta
Date:   Mon Feb 29 14:43:58 2016 +0530

    protocol client/server: Fix client-server handshake

This commit introduced a new check to determine the existence of a key in the dictionary that gets exchanged between clients and servers during a handshake. Upon not finding the key, the clients bail out.
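
The behaviour described above can be pictured with a small Python model. This is only a sketch for illustration, not the GlusterFS C source: the function name `handshake_accepts` and the `process-uuid` entry in the reply are made up here, and only the key name 'child_up' comes from the mount log.

```python
def handshake_accepts(reply):
    """Return True if a strict client would continue after the handshake.

    `reply` models the dictionary the server sends back. Only the
    'child_up' key name is taken from the log; the rest is an
    assumption made for this sketch.
    """
    if "child_up" not in reply:
        # A 3.7 server never sets this key, so a strict client
        # gives up at this point.
        return False
    return bool(reply["child_up"])


# Hypothetical replies: one without the key (3.7-style), one with it.
reply_without_key = {"process-uuid": "example"}
reply_with_key = {"process-uuid": "example", "child_up": 1}

print(handshake_accepts(reply_without_key))  # False
print(handshake_accepts(reply_with_key))     # True
```

With a check of this shape, an older server is rejected simply because it omits the key, even though its bricks are up.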

Avra - would it be possible to avoid a hard check of 'child_up' during a handshake?
Yes Vijay, this particular failure occurs because the client expects a 'child_up' key from the server during the handshake, so that it can tell whether all children on the server are up rather than merely that the handshake succeeded. Although this is how the handshake should ideally work, it currently breaks backward compatibility with 3.7 volumes, because those servers do not send the key that the newer client expects.

I would prefer not to bypass this check in the client, but rather enforce this check only for connections coming from servers running 3.8.
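
A rough Python sketch of that idea, for illustration only. This thread does not say how the client would learn the server's version, so the `server_version` argument is purely an assumption, as is the function name `child_up_check`.

```python
def child_up_check(reply, server_version):
    """Decide whether the handshake may proceed.

    reply          -- dict modelling the server's handshake reply
    server_version -- (major, minor) tuple; a hypothetical input, since
                      the thread does not say how the client obtains it
    """
    if server_version >= (3, 8):
        # Newer servers are expected to send the key, so enforce it:
        # a missing key counts as a failure.
        return bool(reply.get("child_up", False))
    # Older servers never send 'child_up'; do not fail the handshake
    # just because the key is missing.
    return bool(reply.get("child_up", True))


print(child_up_check({}, (3, 7)))               # True
print(child_up_check({}, (3, 8)))               # False
print(child_up_check({"child_up": 1}, (3, 8)))  # True
```

The strict behaviour is kept for 3.8 servers, while a reply from a 3.7 server is accepted even though the key is absent.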

+ Adding Raghavendra Gowdappa

Raghavendra,

Would it be possible to keep this check in the client specific to servers running 3.8 and beyond?

Note that if servers are upgraded ahead of the clients, this problem should not be seen.
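
As a hedged illustration of that ordering rule, the Python sketch below compares a client package name with a server version string and reports whether the client is on a newer major.minor release. The helper names `parse_version` and `client_is_ahead` are invented for this example, and the inputs are the version strings quoted earlier in this thread.

```python
import re


def parse_version(text):
    """Extract (major, minor, patch) from strings such as
    'glusterfs-3.8.0-1.fc24.x86_64' or '3.7.11'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version found in %r" % text)
    return tuple(int(part) for part in match.groups())


def client_is_ahead(client, server):
    """True when the client's major.minor is newer than the server's."""
    return parse_version(client)[:2] > parse_version(server)[:2]


print(parse_version("glusterfs-fuse-3.8.0-1.fc24.x86_64"))  # (3, 8, 0)
print(client_is_ahead("glusterfs-fuse-3.8.0-1.fc24.x86_64", "3.7.11"))  # True
```

A result of True matches the situation reported here: a 3.8.0 client talking to 3.7.11 servers.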

Thanks,
Vijay


