Installed 3.4.3 exactly 2 weeks ago on all our brick servers, and I'm happy to report that we've not had a crash since. Thanks for all the good work.

On Tue, 2014-04-15 at 14:22 +0800, Franco Broi wrote:
> The whole system came to a grinding halt today and no amount of
> restarting daemons would make it work again. What was really odd was
> that gluster vol status said everything was fine, and yet all the client
> mount points had hung.
>
> On the node that was exporting Gluster NFS I had zombie processes, so I
> decided to reboot. It took a while for the ZFS JBODs to sort themselves
> out, but I was relieved when it all came back up - except that the df
> size on the clients was wrong...
>
> gluster vol info and gluster vol status said everything was fine, but it
> was obvious that 2 of my bricks were missing. I restarted everything and
> still had 2 missing bricks. I remounted the fuse clients and still no
> good.
>
> Just out of sheer desperation, and for no good reason, I disabled the
> Gluster NFS export and magically the 2 missing bricks reappeared and the
> filesystem was back to its normal size. I turned NFS exports back on and
> everything stayed working.
>
> I'm not trying to belittle all the good work done by the Gluster
> developers, but this really doesn't look like a viable big-data
> filesystem at the moment. We've currently got 800TB and are about to add
> another 400TB, but quite honestly the prospect terrifies me.
>
> On Tue, 2014-04-15 at 08:35 +0800, Franco Broi wrote:
> > On Mon, 2014-04-14 at 17:29 -0700, Harshavardhana wrote:
> > > >
> > > > Just distributed.
> > > >
> > >
> > > With a pure distributed setup you have to take downtime, since the
> > > data isn't replicated.
> >
> > If I shut down the server processes, won't the clients just wait for
> > them to come back up, i.e. like NFS hard mounts? I don't mind an
> > interruption; I just want to avoid killing all jobs that are currently
> > accessing the filesystem if at all possible. Our users have suffered a
> > lot recently with filesystem outages.
> >
> > By the way, how does one shut down the glusterfs processes without
> > stopping a volume? It would be nice to have a quiesce or freeze option
> > that just stalls all access while maintenance takes place.
> >
> > > >> > 3.4.1 to 3.4.3-3 shouldn't cause problems with existing clients
> > > >> > and other servers, right?
> > > >>
> > > >> You mean 3.4.1 and 3.4.3 coexisting within a cluster?
> > > >
> > > > Yes, at least for the duration of the upgrade.
> > >
> > > Yeah, the 3.4.x releases are backward compatible with each other in
> > > any case.
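
For anyone hitting the same symptoms, the checks and the NFS toggle described above boil down to roughly the commands below. This is only a sketch: the volume name "data" and the client mount point "/data" are placeholders for your own.

    # On any server: what gluster itself thinks of the bricks
    gluster volume info data
    gluster volume status data

    # On a client: what the fuse mount actually reports
    df -h /data

    # Turn the built-in Gluster NFS export off and back on for the volume
    gluster volume set data nfs.disable on
    gluster volume set data nfs.disable off

To take one server down for maintenance without stopping the volume, the usual approach (again, only a sketch; adjust to your init system) is to stop the management daemon and then kill that server's brick and NFS processes. Fuse clients should reconnect to the bricks when they come back, but on a pure distributed volume the files on that server's bricks are simply unavailable in the meantime, which is the downtime referred to above.

    service glusterd stop       # management daemon
    pkill -x glusterfsd         # brick processes on this server
    pkill -x glusterfs          # Gluster NFS server, self-heal daemon, etc.
                                # (also kills any fuse mounts running on this box)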