My gluster volume is more or less useless from an administration point of view. I cannot stop the volume: gluster either claims it is rebalancing or says the command failed. When I try to stop, start, or get the status of the rebalance, nothing is returned. I have stopped and restarted all glusterfsd processes on each host. Nothing seems to bring sanity back to the volume.

This is bad news for gluster's reliability. I am unable to find the source of the problem, and the regular methods for resetting the system to a usable state are not working. I think it is time to call it quits and find another solution. Ceph?

On Fri, Nov 23, 2012 at 1:21 PM, Jonathan Lefman <jonathan.lefman at essess.com> wrote:

> At the same time, when looking at the rebalance log, it appears that the
> rebalance is still going on in the background because I am seeing entries
> related to rebalancing. However, the detail status command shows that the
> distribution for files is still stable on the older nodes.
>
> On Fri, Nov 23, 2012 at 1:10 PM, Jonathan Lefman <jonathan.lefman at essess.com> wrote:
>
>> Volume type:
>>
>> non-replicated, 29 nodes, xfs formats
>>
>> Number of files/directories:
>>
>> There are about 5000-10000 directories
>>
>> Average size of files:
>>
>> There are two distributions of files: a vast majority of files is around
>> 200-300 kilobytes, with about 1000-fold fewer files with a size around 1
>> gigabyte
>>
>> Average number of files per directory:
>>
>> Around 1800 files per directory
>>
>> glusterd log below:
>>
>> When trying
>>
>> sudo gluster volume rebalance essess_data status
>>
>> OR
>>
>> sudo gluster volume status myvol
>> operation failed
>>
>> Log for this time from /var/log/glusterfs/etc-glusterfs-glusterd.vol.log:
>>
>> [2012-11-23 13:05:00.489567] E [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1
>> [2012-11-23 13:07:09.102007] I [glusterd-handler.c:2670:glusterd_handle_status_volume] 0-management: Received status volume req for volume essess_data
>> [2012-11-23 13:07:09.102056] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: Unable to get lock for uuid: ee33fd05-135e-40e7-a157-3c1e0b9be073, lock held by: ee33fd05-135e-40e7-a157-3c1e0b9be073
>> [2012-11-23 13:07:09.102073] E [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1
>>
>> On Fri, Nov 23, 2012 at 12:58 PM, Vijay Bellur <vbellur at redhat.com> wrote:
>>
>>> On 11/23/2012 11:14 PM, Jonathan Lefman wrote:
>>>
>>>> The rebalance command has run for quite a while. Now when I issue the
>>>> rebalance status command,
>>>>
>>>> sudo gluster volume rebalance myvol status
>>>>
>>>> I get nothing back; just a return to the command prompt. Any ideas of
>>>> what is going on?
>>>
>>> A few questions:
>>>
>>> - What is your volume type?
>>> - How many files and directories do you have in your volume?
>>> - What is the average size of files?
>>> - What is the average number of files per directory?
>>> - Can you please share glusterd logs from the time when the command
>>> returns without displaying any output?
>>>
>>> Thanks,
>>> Vijay

--
*Jonathan Lefman, Ph.D.*
*Essess, Inc.*
25 Thomson Place, Suite 460, Boston, MA 02210
o: 415-361-5488 x121 | e: jonathan.lefman at essess.com | *www.essess.com*
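
In the glusterd log excerpt quoted in this thread, the glusterd_lock error names the same UUID as both the requester and the current holder of the lock. That might mean the local glusterd is still holding a lock from an earlier operation, though this is a guess from the log text and not a confirmed diagnosis. A rough sketch for scanning a log for such entries follows. The function name `find_self_held_locks` and the regular expression are my own assumptions, written only against the message wording shown in the excerpt; other GlusterFS versions may phrase the message differently.

```python
import re

# Pattern written against the glusterd_lock error quoted in this thread.
# It is an assumption that other log lines follow the same wording.
LOCK_RE = re.compile(
    r"Unable to get lock for uuid:\s*(?P<requester>[0-9a-f-]{36}),\s*"
    r"lock held by:\s*(?P<holder>[0-9a-f-]{36})"
)


def find_self_held_locks(log_text):
    """Return the set of UUIDs that were refused a lock they already hold."""
    # Mail clients wrap long log lines and prefix them with '>' quote
    # markers, so strip the markers and collapse whitespace before matching.
    unquoted = re.sub(r"^[> ]+", "", log_text, flags=re.M)
    flat = re.sub(r"\s+", " ", unquoted)
    return {
        m.group("requester")
        for m in LOCK_RE.finditer(flat)
        if m.group("requester") == m.group("holder")
    }


# The entry from the excerpt, wrapped the way it arrived by mail.
sample = """\
[2012-11-23 13:07:09.102056] E [glusterd-utils.c:277:glusterd_lock]
0-glusterd: Unable to get lock for uuid:
ee33fd05-135e-40e7-a157-3c1e0b9be073, lock held by:
ee33fd05-135e-40e7-a157-3c1e0b9be073
"""

print(find_self_held_locks(sample))
# prints {'ee33fd05-135e-40e7-a157-3c1e0b9be073'}
```

Running this against /var/log/glusterfs/etc-glusterfs-glusterd.vol.log on each node would show which hosts, if any, report being blocked by their own lock. It only reads the log and does not touch the volume.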