Re: Performance Translators' Stability and Usefulness

Geoff Kassel <gkassel@xxxxxxxxxxxxxxxxxxxxx> · Sun, 5 Jul 2009 20:19:01 +1000

Hi Shehjar,

> I think the QA folks have done some really good work in stabilizing
> GlusterFS over the last year or so. The result is there to see in the
> 2.0.X releases.

You call allowing two critical data corruption bugs through to the final 
public release in two releases in a row 'good work'?

> Still, I understand your problems are more important to you than the
> problems being faced by other users, I'd so appreciate if you'd give our
> bugzilla-based setup a chance at handling this bug. Or, let me
> know if you've already filed a report.

I take issue with this.

If you've been reading what I've been posting, I reported my issue quite some 
time ago directly to the developers, here in this list and in all the places 
that existed at the time for posting such bug reports.

I've posted logs. I've posted configurations. I've tried to patch it (and a 
number of other issues which would no doubt affect others) myself. I've had 
my patches ignored because your team didn't like the presence of any comments 
in the code.

(Comments that are there so that automated code quality checking tools - which 
you don't use, apparently - don't keep flagging the same issues over and over 
again. For when you might want to check for the presence of any newly 
introduced issues later, as you would in a real QA process.)

Somehow, I don't think a new bugtracker is going to make a difference to the 
general indifference I've been shown with regard to this issue.

> > When will this even show up on the roadmap?
>
> The QA team is already working on just such a testing and
> regression framework.

Your QA process document - written nearly a year ago - claims that you're 
already using these sorts of frameworks. So it was just spin, after all - no 
such framework really exists yet.

So how much of the documentation on the wiki actually reflects reality??

I'm appalled. This is no way to run an open source project - commercial 
entanglements or not.

If you are going to develop an open source product to actually meet your 
stated goals of providing a stable and reliable storage system, you need to 
follow the examples of the other successful projects that have come before 
you, and do at least the following two things.

One is to listen to your community - paying and non-paying - and accept help 
and advice from the community when it's offered. Spend a little effort 
getting offered patches into the shape that's acceptable to you, instead of 
just ignoring them. We've taken the time (at our expense) to help improve the 
product that you're selling commercial support for - it's really the least 
you can do, since you're getting (more or less) expert help for free.

The second is not to lie to, mislead, or conceal information from your 
community. If you post documents stating that you have a QA process that 
involves unit tests, then those tests should exist. Better still - they 
should be publicly available, so the community can use them for their own 
development of new features or bug fixes.

If you can't release them for reasons like (say) the use of proprietary 
software - come out and say so. The worst thing you could do is say you have 
these QA processes, then turn around after nearly a year and a series of 
extremely critical bugs and admit (as you've just admitted) that you don't.

When you're developing a system that people are entrusting with gigabytes, 
terabytes, or even petabytes of their precious data, your users need to feel 
that they can trust the developers designing that software to take all due 
care.

If you get caught out lying about exactly what level of quality control you're 
actually taking (and these things will out with open source projects), people 
will lose trust in you and your project. Which surely can't be good for the 
bottom line of your commercial arm.

I know I've certainly lost a lot of trust in GlusterFS through all this.

Geoff.

On Sun, 5 Jul 2009, Shehjar Tikoo wrote:
> Thanks Geoff.
>
> It is always good to get an external opinion on where we stand.
>
> Geoff Kassel wrote:
> > Hi Shehjar, I feel I should comment on part of your reply to Gordan's
> >  email.
> >
> >>> Finally - which translators are deemed stable (no known issues -
> >>> memory leaks/bloat, crashes, corruption, etc.)?
> >>
> >> We can definitely vouch for a higher degree of stability of the
> >> releases. Otherwise, I don't think there is any performance
> >> translator we can call completely stable/mature because of the
> >> roadmap we have for constantly upgrading algorithms, functionality,
> >>  etc.
> >
> > When will the Gluster team be able to deliver a stable, mature, and
> > reliable version of GlusterFS?
>
> Continuing from what I said earlier, the fact that GlusterFS releases
> work in a stable manner is shown by the deployments among our
> customers.
>
> At the same time, are we satisfied with the experience of non-paying
> users?
> No. I accept there are bottlenecks in our processes. We
> acknowledge that and have been working on fixing them. The most visible
> aspect of that for users is the move to using bugzilla at
> bugs.gluster.com. The earlier setup at Savannah just wasn't scaling.
> Personally, in just a few weeks, I am finding handling bugs through
> this portal much faster and more streamlined than before.
>
> > I have been using GlusterFS since the v1.3.x days, and I have yet to
> >  see a version since then that doesn't crash at least once a day from
> >  just load on even the simplest configurations.
> >
> > Then there's the data corruption bug of the early 2.0.0 releases,
> > which has kept me (and no doubt others) from upgrading to these
> > releases.
> >
> > I have read about the Gluster QA team, but quite frankly, I have yet
> >  to see the fruits of this team's work. Letting through a bug of that
> >  magnitude in a major release blew a lot of trust I had in the
> > Gluster team's QA process.
> >
> > When will regression tests be used? It's been months now since this
> > bug, and still I don't see any sign of the use of this simple,
> > industry-standard technique to minimise the risk of such issues
> > slipping through again.
>
> I think the QA folks have done some really good work in stabilizing
> GlusterFS over the last year or so. The result is there to see in the
> 2.0.X releases.
>
> > Why wasn't this prioritised after such a disastrous bug?
>
> It could've been for any number of reasons ranging from problems with
> reproducing it, limited functionality for managing bug reports in
> Savannah to even the general constraints of being a commercial
> open-source project.
>
> Still, I understand your problems are more important to you than the
> problems being faced by other users, I'd so appreciate if you'd give our
> bugzilla-based setup a chance at handling this bug. Or, let me
> know if you've already filed a report.
>
> > When will this even show up on the roadmap?
>
> The QA team is already working on just such a testing and
> regression framework.
>
> Thanks
> Shehjar
>
> > Geoff.
> >
> > On Sat, 4 Jul 2009, Shehjar Tikoo wrote:
> >> Gordan Bobic wrote:
> >>> Just reading through the wiki on this and a few things are
> >>> unclear, so I'm hoping someone can clarify.
> >>>
> >>> 1) readahead
> >>>
> >>> - Is there any point in using this on systems where the
> >>> interconnect <= 1Gb/s? The wiki implies there is no point in
> >>> this, but doesn't quite state it explicitly.
> >>
> >> I am pretty sure it helps. Whether to use read-ahead depends more
> >> on the workload than on the interconnect; e.g. it will be useful
> >> for sequential reading, without any doubt. Of course, there can be
> >> cases where excessive read-ahead chokes a 100 Mib/s link, but then
> >> read-ahead can be configured to reduce its utilization of the
> >> network by reducing the page-count option.
> >>
> >>> - Is there any point in using this on a server that is also its
> >>> own client when used with replicate/afr? I'm guessing there isn't,
> >>> since the local fs will be doing its own read-ahead, but I'd
> >>> like some confirmation on that.
> >>
> >> No. Generally, read-ahead will be most beneficial only on the
> >> client side, since it helps avoid going to the network when an
> >> application needs data that has already been read ahead. And yes,
> >> on the server side, the on-disk file system's own read-ahead already
> >> does it best.
> >>
> >>
> >> In your setup above, in case the system has more than a few
> >> CPUs/cores, it might be possible to get a little better performance
> >>  while using io-threads on the client. That'll make it possible to
> >>  offload the read-ahead to an io-thread without blocking the main
> >> glusterfs thread. Then, the benefit of read-ahead + io-threads
> >> might show up when the data is actually needed, and could be served
> >> without a kernel entry/exit for a file system call.
> >>
> >>> 2) io-threads
> >>>
> >>> Is this (usefully) applicable on the client side?
> >>
> >> It is. Using io-threads on the client side helps offload the
> >> processing of individual file operations onto a separate thread,
> >> freeing up the main thread to perform other tasks. This is
> >> especially applicable when using io-threads under the write-behind
> >> and/or read-ahead translators, where the write-behind and read-ahead
> >>  requests, i.e. background or asynchronous requests essentially,
> >> can be offloaded to the threads while freeing up the main glusterfs
> >>  thread to handle sync requests, i.e. requests that could make the
> >>  application block on a syscall.
> >>
> >> Also, using io-threads on client side could help in performing
> >> network IO in a separate thread, again freeing up the main thread
> >> for other in-band tasks.
> >>
> >> Then again, if the workload is not concurrent in terms of number of
> >>  processes or number of files/dirs, then io-threads might not help
> >>  much.
> >>
> >>> 3) io-cache
> >>>
> >>> The wiki page has the same paragraph pasted for both io-threads
> >>> and io-cache. Are they the same thing, or is this a documentation
> >>>  bug?
> >>
> >> No, they're not the same. The documentation is still in flux.
> >> Hope this version will help:
> >> http://www.gluster.org/docs/index.php/Translators_options
> >>
> >>> What does io-cache do?
> >>
> >> io-cache is a translator that caches data from files so that future
> >>  references do not lead to network requests. It is generally used
> >> along with read-ahead so that the data that gets read ahead or any
> >>  data that gets read, for that matter, will be available from the
> >> local client cache. We're also working on incorporating support for
> >>  write buffering in io-cache so that write operations can also
> >> benefit from local buffering until a point in time suitable for
> >> actual transmission to the server.
> >>
> >>> Finally - which translators are deemed stable (no known issues -
> >>> memory leaks/bloat, crashes, corruption, etc.)?
> >>
> >> We can definitely vouch for a higher degree of stability of the
> >> releases. Otherwise, I don't think there is any performance
> >> translator we can call completely stable/mature because of the
> >> roadmap we have for constantly upgrading algorithms, functionality,
> >>  etc.
> >>
> >>> Any particular suggestions on which performance translator
> >>> combination would be good to apply for a shared root AFR over a
> >>> WAN? I already have read-subvolume set to the local mirror, but
> >>> any improvement is welcome when latencies soar to 100ms and b/w
> >>> gets hammered down to 1-2.5 Mb/s.
> >>
> >> WANs are generally characterised as having a large bandwidth-delay
> >> product. That basically means, for good throughput, we should be
> >> pipelining as much data as possible over the link, so that the long
> >> latency overhead can be mitigated or amortised by sending a larger
> >> amount of data for the same fixed overhead.
> >>
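As a rough illustration of the bandwidth-delay product described above, here is a minimal Python sketch. It is not from the original exchange: the function names are hypothetical, the 100 ms figure is assumed to be the round-trip time, and the 4 KiB request size is an arbitrary example rather than anything GlusterFS-specific. It only shows how much data would need to be in flight to fill the link Gordan describes, and how little throughput a single small request per round trip can achieve.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep a link of this speed full."""
    return bandwidth_bits_per_s * rtt_s / 8


def throughput_limit_bits_per_s(bytes_in_flight: float, rtt_s: float) -> float:
    """Upper bound on throughput if only bytes_in_flight are outstanding per round trip."""
    return bytes_in_flight * 8 / rtt_s


if __name__ == "__main__":
    rtt = 0.100  # assumption: the quoted 100 ms is the round-trip time

    # The two ends of the 1-2.5 Mb/s range mentioned in the thread.
    for mbps in (1.0, 2.5):
        bdp = bandwidth_delay_product_bytes(mbps * 1e6, rtt)
        print(f"{mbps} Mb/s at {rtt * 1000:.0f} ms RTT: {bdp:.0f} bytes in flight")

    # Hypothetical case: one 4 KiB request outstanding at a time.
    limit = throughput_limit_bits_per_s(4096, rtt)
    print(f"one 4 KiB request per round trip: at most {limit / 1e6:.2f} Mb/s")
```

Under these assumptions, a fully synchronous workload that waits a full round trip per small request stays well below even a 1 Mb/s link, which is why pipelining (for example via read-ahead or write-behind) matters on such a link.
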
> >> That said, what particular workload is it that gives you a
> >> throughput of 1-2.5 Mb/s?
> >>
> >> When you say "latencies soar to 100ms", does that mean these are
> >> just unusual spikes or is that the normal latency observed?
> >>
> >> It'd help to see your volfiles and how the performance translators
> >>  are arranged.
> >>
> >>> Another thing - when a node works standalone in AFR, performance
> >>>  is pretty good, but as soon as a peer node joins, even though
> >>> the original node is the primary, performance degrades on the
> >>> primary node quite significantly, even though the interconnect is
> >>>  direct gigabit, which shouldn't be adding any particular latency
> >>>  (< 0.1ms) or overheads, especially on the primary node. Is there
> >>>  any particular reason for this degradation? It's OK in normal
> >>> usage, but some operations (e.g. building a big bootstrapping
> >>> initrd (50MB compressed, including all the kernel drivers)) take
> >>> nearly 10x longer when the peers join than when the node is
> >>> standalone. I expected some degradation, but only on the order of
> >>>  added network latency, and this is way, way more. I tried with
> >>> and without direct-io=off, and that didn't make a great amount of
> >>>  difference. Which performance translators are likely to help
> >>> with this use case?
> >>
> >> I think Vikas will be able to answer that better.
> >>
> >> -Shehjar
> >>
> >>> Gordan
> >>>
> >>>
> >>> _______________________________________________ Gluster-devel
> >>> mailing list Gluster-devel@xxxxxxxxxx
> >>> http://lists.nongnu.org/mailman/listinfo/gluster-devel