----- Original Message -----
| From: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
| To: "Roman" <romeo.r@xxxxxxxxx>
| Cc: gluster-users@xxxxxxxxxxx, "Niels de Vos" <ndevos@xxxxxxxxxx>, "Humble Chirammal" <hchiramm@xxxxxxxxxx>
| Sent: Wednesday, August 6, 2014 12:09:57 PM
| Subject: Re: libgfapi failover problem on replica bricks
|
| Roman,
| The file went into split-brain. I think we should do these tests with 3.5.2, where monitoring the heals is easier. Let me also come up with a document about how to do the testing you are trying to do.
|
| Humble/Niels,
| Do we have debs available for 3.5.2? In 3.5.1 there was a packaging issue where /usr/bin/glfsheal was not packaged along with the deb. I think that should be fixed now as well?

Pranith,

The 3.5.2 packages for Debian are not available yet. We are coordinating internally to get them processed. I will update the list once they are available.

--Humble

| On 08/06/2014 11:52 AM, Roman wrote:
| > Good morning,
| >
| > root@stor1:~# getfattr -d -m. -e hex /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| > getfattr: Removing leading '/' from absolute path names
| > # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| > trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
| > trusted.afr.HA-fast-150G-PVE1-client-1=0x000001320000000000000000
| > trusted.gfid=0x23c79523075a4158bea38078da570449
| >
| > getfattr: Removing leading '/' from absolute path names
| > # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| > trusted.afr.HA-fast-150G-PVE1-client-0=0x000000040000000000000000
| > trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
| > trusted.gfid=0x23c79523075a4158bea38078da570449
| >
| >
| > 2014-08-06 9:20 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
| >
| > On 08/06/2014 11:30 AM, Roman wrote:
| >> Also, this time the files are not the same!
| >>
| >> root@stor1:~# md5sum /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >> 32411360c53116b96a059f17306caeda  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >>
| >> root@stor2:~# md5sum /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >> 65b8a6031bcb6f5fb3a11cb1e8b1c9c9  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| > What is the getfattr output?
| >
| > Pranith
| >>
| >> 2014-08-05 16:33 GMT+03:00 Roman <romeo.r@xxxxxxxxx>:
| >>
| >> Nope, it is not working. But this time it went a bit differently.
| >>
| >> root@gluster-client:~# dmesg
| >> Segmentation fault
| >>
| >> I was not even able to start the VM after I had done the tests:
| >>
| >> Could not read qcow2 header: Operation not permitted
| >>
| >> And it seems it never starts to sync the files after the first disconnect. The VM survives the first disconnect, but not the second (I waited around 30 minutes). Also, I've got network.ping-timeout: 2 in the volume settings, but the logs reacted to the first disconnect in around 30 seconds. The second was faster, 2 seconds.
| >>
| >> The reaction was also different.
| >>
| >> The slower one:
| >> [2014-08-05 13:26:19.558435] W [socket.c:514:__socket_rwv] 0-glusterfs: readv failed (Connection timed out)
| >> [2014-08-05 13:26:19.558485] W [socket.c:1962:__socket_proto_state_machine] 0-glusterfs: reading from socket failed. Error (Connection timed out), peer (10.250.0.1:24007)
| >> [2014-08-05 13:26:21.281426] W [socket.c:514:__socket_rwv] 0-HA-fast-150G-PVE1-client-0: readv failed (Connection timed out)
| >> [2014-08-05 13:26:21.281474] W [socket.c:1962:__socket_proto_state_machine] 0-HA-fast-150G-PVE1-client-0: reading from socket failed. Error (Connection timed out), peer (10.250.0.1:49153)
| >> [2014-08-05 13:26:21.281507] I [client.c:2098:client_rpc_notify] 0-HA-fast-150G-PVE1-client-0: disconnected
| >>
| >> The fast one:
| >> [2014-08-05 12:52:44.607389] C [client-handshake.c:127:rpc_client_ping_timer_expired] 0-HA-fast-150G-PVE1-client-1: server 10.250.0.2:49153 has not responded in the last 2 seconds, disconnecting.
| >> [2014-08-05 12:52:44.607491] W [socket.c:514:__socket_rwv] 0-HA-fast-150G-PVE1-client-1: readv failed (No data available)
| >> [2014-08-05 12:52:44.607585] E [rpc-clnt.c:368:saved_frames_unwind] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8) [0x7fcb1b4b0558] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3) [0x7fcb1b4aea63] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2014-08-05 12:52:42.463881 (xid=0x381883x)
| >> [2014-08-05 12:52:44.607604] W [client-rpc-fops.c:2624:client3_3_lookup_cbk] 0-HA-fast-150G-PVE1-client-1: remote operation failed: Transport endpoint is not connected. Path: / (00000000-0000-0000-0000-000000000001)
| >> [2014-08-05 12:52:44.607736] E [rpc-clnt.c:368:saved_frames_unwind] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8) [0x7fcb1b4b0558] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3) [0x7fcb1b4aea63] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2014-08-05 12:52:42.463891 (xid=0x381884x)
| >> [2014-08-05 12:52:44.607753] W [client-handshake.c:276:client_ping_cbk] 0-HA-fast-150G-PVE1-client-1: timer must have expired
| >> [2014-08-05 12:52:44.607776] I [client.c:2098:client_rpc_notify] 0-HA-fast-150G-PVE1-client-1: disconnected
| >>
| >> I've got SSD disks (just for information).
| >> Should I go and give 3.5.2 a try?
| >>
| >> 2014-08-05 13:06 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
| >>
| >> Reply along with gluster-users please :-). Maybe you are hitting 'reply' instead of 'reply all'?
| >>
| >> Pranith
| >>
| >> On 08/05/2014 03:35 PM, Roman wrote:
| >>> To make sure and keep things clean, I've created another VM with raw format and am going to repeat those steps. So now I've got two VMs, one with qcow2 format and the other with raw format. I will send another e-mail shortly.
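For reference, the split-brain reported at the top of this thread can be read directly from the trusted.afr changelog xattrs Roman posted. A minimal sketch of the check, assuming the brick path used in this thread; each 12-byte trusted.afr value is normally read as three 4-byte counters (data, metadata, entry pending operations), in hex:

    # run on each storage server against the file on the brick, not on the mount
    getfattr -d -m. -e hex /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2

    # In the output above:
    #   on stor1: trusted.afr.HA-fast-150G-PVE1-client-1=0x00000132...  -> stor1 records 0x132 (306) pending data ops against stor2
    #   on stor2: trusted.afr.HA-fast-150G-PVE1-client-0=0x00000004...  -> stor2 records 4 pending data ops against stor1
    # Each copy accuses the other of missing writes, so AFR cannot pick a source
    # automatically; that mutual accusation is what gets reported as split-brain.
    # The "Pending matrix: [ [ 0 60 ] [ 11 0 ] ]" printed later in the thread encodes
    # the same situation: roughly, row i holds the counts brick i has recorded against each brick.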
| >>>
| >>> 2014-08-05 13:01 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
| >>>
| >>> On 08/05/2014 03:07 PM, Roman wrote:
| >>>> Really, it seems like the same file:
| >>>>
| >>>> stor1:
| >>>> a951641c5230472929836f9fcede6b04  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >>>>
| >>>> stor2:
| >>>> a951641c5230472929836f9fcede6b04  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >>>>
| >>>> One thing I've seen from the logs: somehow Proxmox VE is connecting to the servers with the wrong version?
| >>>> [2014-08-05 09:23:45.218550] I [client-handshake.c:1659:select_server_supported_programs] 0-HA-fast-150G-PVE1-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
| >>> It is the RPC (over-the-network data structures) version, which has not changed at all since 3.3, so that's not a problem. So what is the conclusion? Is your test case working now or not?
| >>>
| >>> Pranith
| >>>
| >>>> But if I issue:
| >>>> root@pve1:~# glusterfs -V
| >>>> glusterfs 3.4.4 built on Jun 28 2014 03:44:57
| >>>> it seems ok.
| >>>>
| >>>> The servers use 3.4.4 meanwhile:
| >>>> [2014-08-05 09:23:45.117875] I [server-handshake.c:567:server_setvolume] 0-HA-fast-150G-PVE1-server: accepted client from stor1-9004-2014/08/05-09:23:45:93538-HA-fast-150G-PVE1-client-1-0 (version: 3.4.4)
| >>>> [2014-08-05 09:23:49.103035] I [server-handshake.c:567:server_setvolume] 0-HA-fast-150G-PVE1-server: accepted client from stor1-8998-2014/08/05-09:23:45:89883-HA-fast-150G-PVE1-client-0-0 (version: 3.4.4)
| >>>>
| >>>> If this could be the reason, of course. I did restart the Proxmox VE yesterday (just for information).
| >>>>
| >>>> 2014-08-05 12:30 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
| >>>>
| >>>> On 08/05/2014 02:33 PM, Roman wrote:
| >>>>> Waited long enough for now, still different sizes and no logs about healing :(
| >>>>>
| >>>>> stor1
| >>>>> # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >>>>> trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
| >>>>> trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
| >>>>> trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
| >>>>>
| >>>>> root@stor1:~# du -sh /exports/fast-test/150G/images/127/
| >>>>> 1.2G /exports/fast-test/150G/images/127/
| >>>>>
| >>>>> stor2
| >>>>> # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
| >>>>> trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
| >>>>> trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
| >>>>> trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
| >>>>>
| >>>>> root@stor2:~# du -sh /exports/fast-test/150G/images/127/
| >>>>> 1.4G /exports/fast-test/150G/images/127/
| >>>> According to the changelogs, the file doesn't need any healing. Could you stop the operations on the VMs and take an md5sum on both these machines?
| >>>>
| >>>> Pranith
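For reference, a minimal sketch of the consistency check requested above, using the paths from this thread; the md5sums are only comparable while nothing is writing to the image, so the VM should be stopped first (the qm command and VM id 127 are assumptions about the Proxmox side):

    # on the Proxmox host: quiesce the image (hypothetical, Proxmox qm CLI, VM id 127)
    qm stop 127

    # then on each storage server, against the brick copy (not the mount):
    md5sum /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
    getfattr -d -m. -e hex /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2

    # matching md5sums plus all-zero trusted.afr.* values on both bricks mean the
    # replicas are in sync; differing contents with all-zero changelogs would point
    # to a change that was never recorded, which is what this part of the thread is probing.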
| >>>>>
| >>>>> 2014-08-05 11:49 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
| >>>>>
| >>>>> On 08/05/2014 02:06 PM, Roman wrote:
| >>>>>> Well, it seems like it doesn't see the changes that were made to the volume? I created two files, 200 and 100 MB (from /dev/zero), after I disconnected the first brick. Then I connected it back and got these logs:
| >>>>>>
| >>>>>> [2014-08-05 08:30:37.830150] I [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
| >>>>>> [2014-08-05 08:30:37.830207] I [rpc-clnt.c:1676:rpc_clnt_reconfig] 0-HA-fast-150G-PVE1-client-0: changing port to 49153 (from 0)
| >>>>>> [2014-08-05 08:30:37.830239] W [socket.c:514:__socket_rwv] 0-HA-fast-150G-PVE1-client-0: readv failed (No data available)
| >>>>>> [2014-08-05 08:30:37.831024] I [client-handshake.c:1659:select_server_supported_programs] 0-HA-fast-150G-PVE1-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
| >>>>>> [2014-08-05 08:30:37.831375] I [client-handshake.c:1456:client_setvolume_cbk] 0-HA-fast-150G-PVE1-client-0: Connected to 10.250.0.1:49153, attached to remote volume '/exports/fast-test/150G'.
| >>>>>> [2014-08-05 08:30:37.831394] I [client-handshake.c:1468:client_setvolume_cbk] 0-HA-fast-150G-PVE1-client-0: Server and Client lk-version numbers are not same, reopening the fds
| >>>>>> [2014-08-05 08:30:37.831566] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-fast-150G-PVE1-client-0: Server lk version = 1
| >>>>>>
| >>>>>> [2014-08-05 08:30:37.830150] I [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
| >>>>>> This line seems weird to me, tbh.
| >>>>>> I do not see any traffic on the switch interfaces between the gluster servers, which means there is no syncing between them. I tried to ls -l the files on the client and servers to trigger the healing, but it seems there was no success. Should I wait longer?
| >>>>> Yes, it should take around 10-15 minutes. Could you provide 'getfattr -d -m. -e hex <file-on-brick>' from both the bricks?
| >>>>>
| >>>>> Pranith
| >>>>>>
| >>>>>> 2014-08-05 11:25 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
| >>>>>>
| >>>>>> On 08/05/2014 01:10 PM, Roman wrote:
| >>>>>>> Aha! For some reason I was not able to start the VM anymore; Proxmox VE told me that it was not able to read the qcow2 header because permission was denied for some reason. So I just deleted that file and created a new VM. And the next message I got was this:
| >>>>>> Seems like these are the messages from when you took down the bricks before the self-heal completed. Could you restart the run, waiting for self-heals to complete before taking down the next brick?
| >>>>>>
| >>>>>> Pranith
| >>>>>>>
| >>>>>>> [2014-08-05 07:31:25.663412] E [afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-HA-fast-150G-PVE1-replicate-0: Unable to self-heal contents of '/images/124/vm-124-disk-1.qcow2' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 60 ] [ 11 0 ] ]
| >>>>>>> [2014-08-05 07:31:25.663955] E [afr-self-heal-common.c:2262:afr_self_heal_completion_cbk] 0-HA-fast-150G-PVE1-replicate-0: background data self-heal failed on /images/124/vm-124-disk-1.qcow2
| >>>>>>>
| >>>>>>> 2014-08-05 10:13 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
| >>>>>>>
| >>>>>>> I just responded to your earlier mail about how the log looks. The log appears in the mount's logfile.
| >>>>>>>
| >>>>>>> Pranith
| >>>>>>>
| >>>>>>> On 08/05/2014 12:41 PM, Roman wrote:
| >>>>>>>> OK, so I've waited long enough, I think. There was no traffic at all on the switch ports between the servers, and I could not find any suitable log message about a completed self-heal (waited about 30 minutes). I unplugged the other server's UTP cable this time and ended up in the same situation:
| >>>>>>>> root@gluster-test1:~# cat /var/log/dmesg
| >>>>>>>> -bash: /bin/cat: Input/output error
| >>>>>>>>
| >>>>>>>> Brick logs:
| >>>>>>>> [2014-08-05 07:09:03.005474] I [server.c:762:server_rpc_notify] 0-HA-fast-150G-PVE1-server: disconnecting connectionfrom pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
| >>>>>>>> [2014-08-05 07:09:03.005530] I [server-helpers.c:729:server_connection_put] 0-HA-fast-150G-PVE1-server: Shutting down connection pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
| >>>>>>>> [2014-08-05 07:09:03.005560] I [server-helpers.c:463:do_fd_cleanup] 0-HA-fast-150G-PVE1-server: fd cleanup on /images/124/vm-124-disk-1.qcow2
| >>>>>>>> [2014-08-05 07:09:03.005797] I [server-helpers.c:617:server_connection_destroy] 0-HA-fast-150G-PVE1-server: destroyed connection of pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
| >>>>>>>>
| >>>>>>>> 2014-08-05 9:53 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
| >>>>>>>>
| >>>>>>>> Do you think it is possible for you to do these tests on the latest version, 3.5.2? 'gluster volume heal <volname> info' would give you that information in versions > 3.5.1. Otherwise you will have to check it either from the logs (there will be a self-heal completed message in the mount logs) or by observing 'getfattr -d -m. -e hex <image-file-on-bricks>'.
| >>>>>>>>
| >>>>>>>> Pranith
| >>>>>>>>
| >>>>>>>> On 08/05/2014 12:09 PM, Roman wrote:
| >>>>>>>>> OK, I understand. I will try this shortly. How can I be sure that the healing process is done if I am not able to see its status?
| >>>>>>>>>
| >>>>>>>>> 2014-08-05 9:30 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
| >>>>>>>>>
| >>>>>>>>> Mounts will do the healing, not the self-heal-daemon. The problem, I feel, is that whichever process does the healing should have the latest information about the good bricks in this use case. Since for the VM use case the mounts should have the latest information, we should let the mounts do the healing. If the mount accesses the VM image, either by someone doing operations inside the VM or by an explicit stat on the file, it should do the healing.
| >>>>>>>>>
| >>>>>>>>> Pranith.
| >>>>>>>>>
| >>>>>>>>> On 08/05/2014 10:39 AM, Roman wrote:
| >>>>>>>>>> Hmmm, you told me to turn it off. Did I understand something wrong? After I issued the command you sent me, I was not able to watch the healing process; it said it won't be healed because it's turned off.
| >>>>>>>>>>
| >>>>>>>>>> 2014-08-05 5:39 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
| >>>>>>>>>>
| >>>>>>>>>> You didn't mention anything about self-healing. Did you wait until the self-heal was complete?
| >>>>>>>>>>
| >>>>>>>>>> Pranith
| >>>>>>>>>>
| >>>>>>>>>> On 08/04/2014 05:49 PM, Roman wrote:
| >>>>>>>>>>> Hi!
| >>>>>>>>>>> The result is pretty much the same. I set the switch port down for the 1st server; that was OK. Then I brought it back up and set the other server's port off, and it triggered an I/O error on two virtual machines: one with a local root FS but network-mounted storage, and the other with a network root FS. The 1st gave an error on copying to or from the mounted network disk; the other gave me an error even for reading log files.
| >>>>>>>>>>>
| >>>>>>>>>>> cat: /var/log/alternatives.log: Input/output error
| >>>>>>>>>>>
| >>>>>>>>>>> Then I reset the KVM VM and it told me there is no boot device. Next I virtually powered it off and then back on, and it booted.
| >>>>>>>>>>>
| >>>>>>>>>>> By the way, did I have to start/stop the volume?
| >>>>>>>>>>>
| >>>>>>>>>>> >> Could you do the following and test it again?
| >>>>>>>>>>> >> gluster volume set <volname> cluster.self-heal-daemon off
| >>>>>>>>>>>
| >>>>>>>>>>> >> Pranith
| >>>>>>>>>>>
| >>>>>>>>>>> 2014-08-04 14:10 GMT+03:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:
| >>>>>>>>>>>
| >>>>>>>>>>> On 08/04/2014 03:33 PM, Roman wrote:
| >>>>>>>>>>>> Hello!
| >>>>>>>>>>>>
| >>>>>>>>>>>> I'm facing the same problem as mentioned here:
| >>>>>>>>>>>> http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html
| >>>>>>>>>>>>
| >>>>>>>>>>>> My setup is up and running, so I'm ready to help you back with feedback.
| >>>>>>>>>>>>
| >>>>>>>>>>>> Setup:
| >>>>>>>>>>>> Proxmox server as the client
| >>>>>>>>>>>> 2 physical gluster servers
| >>>>>>>>>>>>
| >>>>>>>>>>>> The server side and the client side are both running glusterfs 3.4.4 atm, from the gluster repo.
| >>>>>>>>>>>>
| >>>>>>>>>>>> The problem is:
| >>>>>>>>>>>>
| >>>>>>>>>>>> 1. Created the replica bricks.
| >>>>>>>>>>>> 2. Mounted in Proxmox (tried both Proxmox ways: via the GUI and via fstab (with the backup volume line); btw, while mounting via fstab I'm unable to launch a VM without cache, even though direct-io-mode is enabled in the fstab line).
| >>>>>>>>>>>> 3. Installed a VM.
| >>>>>>>>>>>> 4. Brought one brick down - OK.
| >>>>>>>>>>>> 5. Brought it back up, waited for the sync to be done.
| >>>>>>>>>>>> 6. Brought the other brick down - got I/O errors on the VM guest and was not able to restore the VM after I reset the VM via the host. It says (no bootable media). After I shut it down (forced) and brought it back up, it boots.
| >>>>>>>>>>> Could you do the following and test it again?
| >>>>>>>>>>> gluster volume set <volname> cluster.self-heal-daemon off
| >>>>>>>>>>>
| >>>>>>>>>>> Pranith
| >>>>>>>>>>>>
| >>>>>>>>>>>> Need help. Tried 3.4.3, 3.4.4. Still missing packages for 3.4.5 for Debian, and for 3.5.2 (3.5.1 always gives a healing error for some reason).
| >>>>>>>>>>>>
| >>>>>>>>>>>> --
| >>>>>>>>>>>> Best regards,
| >>>>>>>>>>>> Roman.
| >>>>>>>>>>>>
| >>>>>>>>>>>> _______________________________________________
| >>>>>>>>>>>> Gluster-users mailing list
| >>>>>>>>>>>> Gluster-users@xxxxxxxxxxx
| >>>>>>>>>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
| >>>>>>>>>>>
| >>>>>>>>>>> --
| >>>>>>>>>>> Best regards,
| >>>>>>>>>>> Roman.
| >>>>>>>>>>
| >>>>>>>>>> --
| >>>>>>>>>> Best regards,
| >>>>>>>>>> Roman.
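For reference, a minimal sketch of the test sequence suggested in this thread, assuming the volume name HA-fast-150G-PVE1 and a hypothetical Proxmox mount point /mnt/pve/glusterfs; the heal info command applies only to 3.5.1 and later, as noted above:

    # let the mount (not the self-heal daemon) do the healing, as suggested above
    gluster volume set HA-fast-150G-PVE1 cluster.self-heal-daemon off

    # take one brick down, write inside the VM, bring the brick back up, then
    # trigger the heal from the client by accessing the image through the mount
    # (the mount path is an assumption about the Proxmox setup)
    stat /mnt/pve/glusterfs/images/127/vm-127-disk-1.qcow2

    # wait for the heal to finish before taking the other brick down; on 3.4 check
    # the trusted.afr.* values on both bricks (all zeros = nothing pending) or watch
    # the mount log for the self-heal completed message
    getfattr -d -m. -e hex /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2

    # on 3.5.2 (once the Debian packages are available) the same check is simply:
    gluster volume heal HA-fast-150G-PVE1 info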
| >
| > --
| > Best regards,
| > Roman.
|
| _______________________________________________
| Gluster-users mailing list
| Gluster-users@xxxxxxxxxxx
| http://supercolony.gluster.org/mailman/listinfo/gluster-users