Re: Error after crash of Virtual Machine during migration

On 12/10/2013 02:59 AM, Mariusz Sobisiak wrote:
Greetings,

Legend:
storage-gfs-3-prd - the first gluster.
What's a "gluster"?
storage-1-saas - the new gluster to which "the first gluster" was to be
migrated.
storage-gfs-4-prd - the second gluster (which was to be migrated later).
What do you mean "migrated"?
I started the replace-brick command:
'gluster volume replace-brick sa_bookshelf storage-gfs-3-prd:/ydp/shared
storage-1-saas:/ydp/shared start'

During that operation the virtual machine (Xen) crashed. Now I can't abort
the migration and start it again.
I don't know what state that leaves your files in. I think the original brick, "storage-gfs-3-prd:/ydp/shared", should still have all the data.
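If you want to confirm that before touching anything, a rough sanity check is to compare the original brick with its replica partner, since Brick1 and Brick2 form a replica pair in this 2 x 2 volume. This is only a sketch: the hostnames and brick path come from your volume info, the rest is my assumption.

# On storage-gfs-3-prd: count regular files on the original brick,
# skipping gluster's internal .glusterfs directory.
find /ydp/shared -name .glusterfs -prune -o -type f -print | wc -l

# On storage-gfs-4-prd: run the same count on the replica partner and
# compare the totals; they should match closely if the data is intact.
find /ydp/shared -name .glusterfs -prune -o -type f -print | wc -l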

The rest of the problem has to do with settings in /var/lib/glusterd/sa_bookshelf/info. Make a backup of that file and edit it, removing anything to do with replace-brick or rebalance. Feel free to put the info file on fpaste.org and ping me on IRC if you need help with that. Stop the volume and glusterd. Copy that same edited info file to the same path on both servers. Start glusterd again. That should clear the replace-brick status so you can try again with 3.4.2.
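For reference, the whole sequence would look roughly like the sketch below. It is only a sketch: the backup file name, the "rb_" key names, the service command, and the scp target are my assumptions for a 3.3-era install, so adapt them to what you actually see.

# Back up the info file on every server before editing anything.
cp /var/lib/glusterd/sa_bookshelf/info /var/lib/glusterd/sa_bookshelf/info.bak

# Stop the volume and the management daemon.
gluster volume stop sa_bookshelf
service glusterd stop

# Edit the info file and delete any replace-brick or rebalance related lines
# (in 3.3 these are typically keys starting with "rb_", but check your file),
# then copy the same edited file to the same path on the other servers, e.g.:
scp /var/lib/glusterd/sa_bookshelf/info storage-gfs-4-prd:/var/lib/glusterd/sa_bookshelf/info

# Start glusterd again wherever you stopped it, then start the volume.
service glusterd start
gluster volume start sa_bookshelf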

When I try:
'# gluster volume replace-brick sa_bookshelf
storage-gfs-3-prd:/ydp/shared storage-1-saas:/ydp/shared abort'
The command runs for about 5 minutes and then finishes with no result. Apart
from that, after that command Gluster starts behaving very strangely.
For example, I can't run '# gluster volume heal sa_bookshelf info' because
it also takes about 5 minutes and returns a blank screen (the same as the abort).

Then I restart the Gluster server and Gluster returns to normal operation,
except for the replace-brick commands. When I run:
'# gluster volume replace-brick sa_bookshelf
storage-gfs-3-prd:/ydp/shared storage-1-saas:/ydp/shared status'
I get:
Number of files migrated = 0       Current file=
I can run 'volume heal info' and similar commands until I call:
'# gluster volume replace-brick sa_bookshelf
storage-gfs-3-prd:/ydp/shared storage-1-saas:/ydp/shared abort'.



# gluster --version
glusterfs 3.3.1 built on Oct 22 2012 07:54:24
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

Brick (/ydp/shared) logs (the same entries repeat constantly, about every three seconds):
[2013-12-06 11:29:44.790299] W [dict.c:995:data_to_str] (-->/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_connect+0xab) [0x7ff4a5d35fcb] (-->/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x15d) [0x7ff4a5d3d64d] (-->/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(client_fill_address_family+0x2bb) [0x7ff4a5d3d4ab]))) 0-dict: data is NULL
[2013-12-06 11:29:44.790402] W [dict.c:995:data_to_str] (-->/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_connect+0xab) [0x7ff4a5d35fcb] (-->/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x15d) [0x7ff4a5d3d64d] (-->/usr/lib/glusterfs/3.3.1/rpc-transport/socket.so(client_fill_address_family+0x2c6) [0x7ff4a5d3d4b6]))) 0-dict: data is NULL
[2013-12-06 11:29:44.790465] E [name.c:141:client_fill_address_family] 0-sa_bookshelf-replace-brick: transport.address-family not specified. Could not guess default value from (remote-host:(null) or transport.unix.connect-path:(null)) options
(the same three entries repeat at 11:29:47, 11:29:50, and so on)


# gluster volume info

Volume Name: sa_bookshelf
Type: Distributed-Replicate
Volume ID: 74512f52-72ec-4538-9a54-4e50c4691722
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: storage-gfs-3-prd:/ydp/shared
Brick2: storage-gfs-4-prd:/ydp/shared
Brick3: storage-gfs-3-prd:/ydp/shared2
Brick4: storage-gfs-4-prd:/ydp/shared2


# gluster volume status
Status of volume: sa_bookshelf
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick storage-gfs-3-prd:/ydp/shared                     24009   Y       758
Brick storage-gfs-4-prd:/ydp/shared                     24009   Y       730
Brick storage-gfs-3-prd:/ydp/shared2                    24010   Y       764
Brick storage-gfs-4-prd:/ydp/shared2                    24010   Y       4578
NFS Server on localhost                                 38467   Y       770
Self-heal Daemon on localhost                           N/A     Y       776
NFS Server on storage-1-saas                            38467   Y       840
Self-heal Daemon on storage-1-saas                      N/A     Y       846
NFS Server on storage-gfs-4-prd                         38467   Y       4584
Self-heal Daemon on storage-gfs-4-prd                   N/A     Y       4590

storage-gfs-3-prd:~# gluster peer status
Number of Peers: 2

Hostname: storage-1-saas
Uuid: 37b9d881-ce24-4550-b9de-6b304d7e9d07
State: Peer in Cluster (Connected)

Hostname: storage-gfs-4-prd
Uuid: 4c384f45-873b-4c12-9683-903059132c56
State: Peer in Cluster (Connected)


(from storage-1-saas)# gluster peer status
Number of Peers: 2

Hostname: 172.16.3.60
Uuid: 1441a7b0-09d2-4a40-a3ac-0d0e546f6884
State: Peer in Cluster (Connected)

Hostname: storage-gfs-4-prd
Uuid: 4c384f45-873b-4c12-9683-903059132c56
State: Peer in Cluster (Connected)



Clients are working properly.
I googled this and found a similar bug, but that was in version 3.3.0. How
can I repair this and continue my migration? Thank you for any help.

BTW: I moved the Gluster server by following the "Gluster 3.4: Brick
Restoration - Replace Crashed Server" how-to.

Regards,
Mariusz
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users
