Re: Empty info file preventing glusterd from starting

On Tue, May 9, 2017 at 6:10 PM, ABHISHEK PALIWAL <abhishpaliwal@xxxxxxxxx> wrote:
Hi Atin,

Thanks for your reply.


It's urgent: although this error is very rarely reproducible, we have seen it 2 or 3 times in our system so far.

We have a delivery coming up in the near future, so we need this as soon as possible. Please try to review it internally.

I don't think your statements justify the urgency, since (a) you have described the issue as *rarely* reproducible and (b) I am still waiting for a real use case where glusterd goes through multiple restarts in a loop.


Regards,
Abhishek

On Tue, May 9, 2017 at 5:58 PM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:


On Tue, May 9, 2017 at 3:37 PM, ABHISHEK PALIWAL <abhishpaliwal@xxxxxxxxx> wrote:
+ Muthu-vingeshwaran

On Tue, May 9, 2017 at 11:30 AM, ABHISHEK PALIWAL <abhishpaliwal@xxxxxxxxx> wrote:
Hi Atin/Team,

We are using gluster-3.7.6 with a two-brick setup, and on system restart I have seen that the glusterd daemon fails to start.


While analyzing the logs in the etc-glusterfs.......log file, I found the entries below:


[2017-05-06 03:33:39.798087] I [MSGID: 100030] [glusterfsd.c:2348:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
[2017-05-06 03:33:39.807859] I [MSGID: 106478] [glusterd.c:1350:init] 0-management: Maximum allowed open file descriptors set to 65536
[2017-05-06 03:33:39.807974] I [MSGID: 106479] [glusterd.c:1399:init] 0-management: Using /system/glusterd as working directory
[2017-05-06 03:33:39.826833] I [MSGID: 106513] [glusterd-store.c:2047:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30706
[2017-05-06 03:33:39.827515] E [MSGID: 106206] [glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management: Failed to get next store iter
[2017-05-06 03:33:39.827563] E [MSGID: 106207] [glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management: Failed to update volinfo for c_glusterfs volume
[2017-05-06 03:33:39.827625] E [MSGID: 106201] [glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management: Unable to restore volume: c_glusterfs
[2017-05-06 03:33:39.827722] E [MSGID: 101019] [xlator.c:428:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2017-05-06 03:33:39.827762] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed
[2017-05-06 03:33:39.827784] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2017-05-06 03:33:39.828396] W [glusterfsd.c:1238:cleanup_and_exit] (-->/usr/sbin/glusterd(glusterfs_volumes_init-0x1b0b8) [0x1000a648] -->/usr/sbin/glusterd(glusterfs_process_volfp-0x1b210) [0x1000a4d8] -->/usr/sbin/glusterd(cleanup_and_exit-0x1beac) [0x100097ac] ) 0-: received signum (0), shutting down
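
For anyone triaging a similar startup failure, the first error-level entry is usually the one worth reading (here, MSGID 106206 from glusterd_store_update_volinfo). Below is a minimal Python sketch that pulls the error entries out of log lines in the format shown above. The regular expression is inferred only from these sample lines, and the names LogEntry, parse_entries and error_entries are made up for illustration; none of this is part of gluster.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple, Optional

# Pattern inferred from the sample lines in this thread (an assumption, not a
# format specification): [timestamp] LEVEL [MSGID: n] [file:line:function] rest
# The MSGID field is optional because some lines above do not carry one.
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*(?P<msgid>\d+)\]\s+)?"
    r"\[(?P<source>[^:\]]+):(?P<line>\d+):(?P<function>[^\]]+)\]\s+"
    r"(?P<rest>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    msgid: Optional[int]
    source: str
    line: int
    function: str
    rest: str


def parse_entries(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield one LogEntry per line that matches the assumed format."""
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match is None:
            continue  # skip anything that does not look like a log entry
        msgid = match.group("msgid")
        yield LogEntry(
            timestamp=match.group("timestamp"),
            level=match.group("level"),
            msgid=int(msgid) if msgid is not None else None,
            source=match.group("source"),
            line=int(match.group("line")),
            function=match.group("function"),
            rest=match.group("rest"),
        )


def error_entries(lines: Iterable[str]) -> List[LogEntry]:
    """Return only the entries logged at level E, in file order."""
    return [entry for entry in parse_entries(lines) if entry.level == "E"]


# Three lines copied from the log excerpt above.
SAMPLE = [
    "[2017-05-06 03:33:39.826833] I [MSGID: 106513] [glusterd-store.c:2047:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30706",
    "[2017-05-06 03:33:39.827515] E [MSGID: 106206] [glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management: Failed to get next store iter",
    "[2017-05-06 03:33:39.827762] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed",
]

if __name__ == "__main__":
    for entry in error_entries(SAMPLE):
        print(entry.timestamp, entry.msgid, entry.function, entry.rest)
```

To run it against a real log, pass an open file object to error_entries instead of SAMPLE.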

Abhishek,

This patch needs to be thoroughly reviewed to ensure that it doesn't cause any regression, given that it touches the core store management functionality of glusterd. AFAICT, we end up with an empty info file only when a volume set operation is executed while, in parallel, the glusterd instance on one of the other nodes is brought down, and this whole sequence of operations happens in a loop. The test case through which you can get into this situation is not something you'd hit in production. Please help me understand the urgency here.

Also, in one of the earlier threads, I mentioned the workaround for this issue to Xin: http://lists.gluster.org/pipermail/gluster-users/2017-January/029600.html
"If you end up in having a 0 byte info file you'd need to copy the same info file from other node and put it there and restart glusterd"
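
Before applying that workaround it helps to confirm which volumes are actually affected. The following is a rough Python sketch, not anything shipped with gluster: it scans a volumes directory for 0-byte info files and notes whether an info.tmp sits next to each one. The function name find_empty_info_files is made up for illustration, and the layout it assumes (one sub-directory per volume, each holding an info file) as well as the example path are assumptions; adjust them to the working directory your glusterd reports at startup.

```python
import os
from typing import List, Tuple


def find_empty_info_files(vols_dir: str) -> List[Tuple[str, bool]]:
    """Return (volume_name, has_info_tmp) for each volume whose info file is 0 bytes.

    Assumes (hypothetically) that vols_dir holds one sub-directory per volume
    and that each of them contains a file named "info".
    """
    affected = []
    for name in sorted(os.listdir(vols_dir)):
        vol_dir = os.path.join(vols_dir, name)
        info_path = os.path.join(vol_dir, "info")
        if not os.path.isfile(info_path):
            continue  # not a volume directory, or no info file at all
        if os.path.getsize(info_path) == 0:
            has_tmp = os.path.exists(os.path.join(vol_dir, "info.tmp"))
            affected.append((name, has_tmp))
    return affected


if __name__ == "__main__":
    # Example path only (an assumption); use your own glusterd working directory.
    example_dir = "/var/lib/glusterd/vols"
    if os.path.isdir(example_dir):
        for volume, has_tmp in find_empty_info_files(example_dir):
            print(f"{volume}: info is empty, info.tmp present: {has_tmp}")
```

The sketch only reports; copying the info file from a healthy node and restarting glusterd is still done by hand, as described in the quoted workaround.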



I have found that there is an existing case for this and that a patch with a solution is available, but the status of that patch is "cannot merge". Also, the "info" file is empty and an "info.tmp" file is present in the "lib/glusterd/vol" directory.

Below is the link to the existing case.

https://review.gluster.org/#/c/16279/5

Please let me know what the community's plan is for providing a solution to this problem, and in which version.

Regards
Abhishek Paliwal




_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
