On 05/18/2011 11:09 AM, Burnash, James wrote:
> Based on my experiences so far, I would absolutely agree with you.
>
> I know new releases are hard to produce at 100% coming out of the
> gate, so the fact that 3.2 is not all that robust is unsurprising to
> me. Hopefully the point releases improve this.
>
> I would really like to see the known bugs in 3.1.4 fixed in a later
> point release in that branch.
>
> I do agree slightly with Stephan (which is unusual) that features
> over stability seems to be the current direction for the project. The
> shame of that is that stability has got to be the #1 priority for any
> features to be useful. That said, 3.1.3 does seem pretty solid to me
> now, compared with 3.1.1 and with 3.0.4.
>
> I disagree with Stephan about everything after 2.0.9 being "bogus".
> Just because the development direction does not correspond with a
> single individuals needs doesn't mean it's worthless or "totally
> broken" for others. The additional features for non-downtime changes
> to the storage nodes are very useful to us here - though more robust
> behavior and better documentation would be welcome.

As the leader for a project based on GlusterFS, I'm also very sensitive to the stability issue. It is a bit disappointing when every major release seems to be marked by significant regressions in existing functionality. It's particularly worrying when even community leaders and distro packagers report serious problems and those take a long time to resolve. I'd put you in that category, James, along with JoeJulian and daMaestro just with respect to the 3.1.4 and 3.2 releases. Free or paid, that's not a nice thing to do to your marquee users, and you're the kind of people whose interest and support we can hardly afford to lose. Even I've been forced to take a second look at alternatives, and I'm just about the biggest booster Gluster has who's not on their payroll.

So how do we deal with these issues *constructively*?
Not by characterizing every release since 2.0.9 as "bogus", that's for sure. Sorry, Stephan, but that's absurd. The 3.1+ versions are more manageable and add the non-disruptive configuration changes that James mentions (*very* hard to do BTW), and there have also been significant fixes to just about every area of the code. I see 284 high-severity bugs whose resolution went to FIXED in the past year. Granted, some of those were *introduced* in the past year, but the majority can easily be seen to have existed in 2.0.9 or its predecessors, and there are many more that wouldn't have shown up on that search due to inconsistent use of the bug system (I'll get to that in a moment). That's a lot of bugs that definitely did affect someone, even if it wasn't you, especially since storage and distributed systems are not the easiest programming domains to work in. Watch the bug list or the patch stream closely, as I do, and you'll see a constant stream of fixes to bugs that have clearly been latent since 2.0.9 or earlier.

The problem I do see, and here I agree with others who've spoken out, is primarily one of communication. It's a bit frustrating to see dozens of geosync/marker/quota patches fly by while a report of a serious bug isn't even *assigned* (let alone worked on as far as anyone can tell) for days or even weeks. I can only imagine how it must be for the people whose filesystems have been totally down for that long, whose bosses are breathing down their necks and pointedly suggesting that a technology switch might be in order.

We can all help by making sure our bugs are actually filed on bugs.gluster.com - not just mentioned here or on IRC - and by doing our part to provide the developers with the information they need to reproduce or fix problems. We can help by actually testing pre-release versions, particularly if our configurations/workloads are likely to represent known gaps in Gluster's own test coverage.
The devs can help by marking bugs' status/assignment, severity/priority, and found/fixed versions more consistently. The regression patterns in the last few releases clearly indicate that more tests are needed in certain areas such as RDMA and upgrades with existing data.

The key here is that if we want things to change we all need to make it happen. We can't tell Gluster how to run their business, which includes how they decide on features or how they allocate resources to new features vs. bug fixes, but as a community we can give them clear and unambiguous information about what is holding back more widespread adoption. It used to be manageability; now it's bugs. We need to be as specific as we possibly can about which bugs or shortcomings matter to us, not just vague "it doesn't work" or "it's slow" or "it's not POSIX enough" kinds of stuff, so that a concrete plan can be made to improve the situation.