gluster 3.2.0 - totally broken?

On 05/18/2011 11:09 AM, Burnash, James wrote:
> Based on my experiences so far, I would absolutely agree with you.
> 
> I know new releases are hard to produce at 100% coming out of the
> gate, so the fact that 3.2 is not all that robust is unsurprising to
> me. Hopefully the point releases improve this.
> 
> I would really like to see the known bugs in 3.1.4 fixed in a later
> point release in that branch.
> 
> I do agree slightly with Stephan (which is unusual) that features
> over stability seems to be the current direction for the project. The
> shame of that is that stability has got to be the #1 priority for any
> features to be useful. That said, 3.1.3 does seem pretty solid to me
> now, compared with 3.1.1 and with 3.0.4.
> 
> I disagree with Stephan about everything after 2.0.9 being "bogus".
> Just because the development direction does not correspond with a
> single individual's needs doesn't mean it's worthless or "totally
> broken" for others. The additional features for non-downtime changes
> to the storage nodes are very useful to us here - though more robust
> behavior and better documentation would be welcome.

As the leader for a project based on GlusterFS, I'm also very sensitive
to the stability issue. It is a bit disappointing when every major
release seems to be marked by significant regressions in existing
functionality. It's particularly worrying when even community leaders and
distro packagers report serious problems and those take a long time to
resolve. I'd put you in that category, James, along with JoeJulian and
daMaestro, just with respect to the 3.1.4 and 3.2 releases. Free or paid,
that's not a nice thing to do to your marquee users, and you're the kind
of people whose interest and support we can hardly afford to lose.  Even
I've been forced to take a second look at alternatives, and I'm just
about the biggest booster Gluster has who's not on their payroll.

So how do we deal with these issues *constructively*? Not by
characterizing every release since 2.0.9 as "bogus", that's for sure.
Sorry, Stephan, but that's absurd. The 3.1+ versions are not only more
manageable and add the non-disruptive configuration changes that James
mentions (*very* hard to do BTW), but there have also been significant
fixes to just about every area of the code. I see 284 high-severity bugs
whose resolution went to FIXED in the past year. Granted, some of those
were *introduced* in the past year, but the majority can easily be seen
to have existed in 2.0.9 or its predecessors and there are many more
that wouldn't have shown up on that search due to inconsistent use of
the bug system (I'll get to that in a moment). That's a lot of bugs that
definitely did affect someone, even if it wasn't you, especially since
storage and distributed systems are not the easiest programming domains
to work in. Watch the bug list or the patch stream closely, as I do, and
you'll see a constant stream of fixes to bugs that have clearly been
latent since 2.0.9 or earlier.

The problem I do see, and I do agree with others who've spoken out here,
is primarily one of communication. It's a bit frustrating to see dozens
of geosync/marker/quota patches fly by while a report of a serious bug
isn't even *assigned* (let alone worked on as far as anyone can tell)
for days or even weeks. I can only imagine how it must be for the people
whose filesystems have been totally down for that long, whose bosses are
breathing down their necks and pointedly suggesting that a technology
switch might be in order. We can all help by making sure our bugs are
actually filed on bugs.gluster.com - not just mentioned here or on IRC -
and by doing our part to provide the developers with the information
they need to reproduce or fix problems. We can help by actually testing
pre-release versions, particularly if our configurations/workloads are
likely to represent known gaps in Gluster's own test coverage. The devs
can help by marking bugs' status/assignment, severity/priority, and
found/fixed versions more consistently. The regression patterns in the
last few releases clearly indicate that more tests are needed in certain
areas such as RDMA and upgrades with existing data.

The key here is that if we want things to change we all need to make it
happen. We can't tell Gluster how to run their business, which includes
how they decide on features or how they allocate resources to new
features vs. bug fixes, but as a community we can give them clear and
unambiguous information about what is holding back more widespread
adoption. It used to be manageability; now it's bugs. We need to be as
specific as we possibly can about which bugs or shortcomings matter to
us, not just vague "it doesn't work" or "it's slow" or "it's not POSIX
enough" kinds of stuff, so that a concrete plan can be made to improve
the situation.

