Re: failing commits


 





On 01/02/17 14:44, Atin Mukherjee wrote:
I think you have hit https://bugzilla.redhat.com/show_bug.cgi?id=1406411, which has been fixed in mainline and will be available in release-3.10, slated for next month.

To confirm you have hit the same problem, can you please answer the following:

1. Which Gluster version are you running?
2. Was any of the existing bricks down?
3. Did you mount the volume? If not, you have two options: (1) bring up the brick and restart glusterd, then run add-brick, or (2) if the existing brick(s) are bad for some reason, restart glusterd, mount the volume, perform a lookup, and then attempt add-brick, which should succeed.
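For reference, option (2) above might be sketched roughly like this (the volume name and brick path are taken from the thread below; the mount point is an assumption, so adapt it to your layout):

```shell
# Assumed example values; substitute your own volume, host, and brick path.
VOLUME=QEMU-VMs
NEW_BRICK=10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-QEMU-VMs

# Restart the management daemon on the affected peer.
systemctl restart glusterd

# Mount the volume and trigger a lookup to refresh the client view.
mkdir -p /mnt/$VOLUME
mount -t glusterfs localhost:/$VOLUME /mnt/$VOLUME
ls /mnt/$VOLUME > /dev/null

# Then retry the add-brick.
gluster volume add-brick $VOLUME replica 3 $NEW_BRICK force
```

This is only a sketch of the sequence described above, not a tested procedure; it assumes a live cluster and root access on the affected peer.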


A chance to properly investigate it has been lost, I think.
It all started with one peer I had missed: it was not migrated from 3.7 to 3.8, and unfortunately it was a system I could not tamper with until late evening, which is now.
This problem, though, occurred after I had already upgraded that peer to 3.8. I even removed that failing node's bricks, detached it, and re-attached it, and still got those errors I described earlier... until now, when I restarted that one last peer. Now all seems OK; at least I don't see those errors any more.

Should I now be looking at something particular more closely?
b.w.
L.


On Wed, Feb 1, 2017 at 7:49 PM, lejeczek <peljasz@xxxxxxxxxxx> wrote:
hi,

I have a four-peer Gluster setup and one peer is failing, well, kind of...
If on a working peer I do:

$ gluster volume add-brick QEMU-VMs replica 3 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-QEMU-VMs force
volume add-brick: failed: Commit failed on whale.priv Please check log file for details.

but:

$ gluster vol info QEMU-VMs
Volume Name: QEMU-VMs
Type: Replicate
Volume ID: 8709782a-daa5-4434-a816-c4e0aef8fef2
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.5.6.100:/__.aLocalStorages/1/0-GLUSTERs/1GLUSTER-QEMU-VMs
Brick2: 10.5.6.17:/__.aLocalStorages/1/0-GLUSTERs/QEMU-VMs
Brick3: 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-QEMU-VMs    # <= so it is here; this command on the failing peer also reports it correctly.

Interestingly,

$ gluster volume remove-brick

completes with no errors, but the change is not propagated to the failing peer. Vol info there still reports the brick as part of the volume.

And the completely failing part: every command on the failing peer reports:

$ gluster volume remove-brick QEMU-VMs replica 2 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-QEMU-VMs force
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit force: failed: Commit failed on 10.5.6.32. Please check log file for details.
Commit failed on rider.priv Please check log file for details.
Commit failed on 10.5.6.17. Please check log file for details.

I've been watching the logs but, honestly, I don't know which one(s) I should paste in here.
b.w.
L.


_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users



--

~ Atin (atinm)

