On Mon, 07 Jan 2013 13:19:49 -0800
Joe Julian <joe at julianfamily.org> wrote:

> You have a replicated filesystem, brick1 and brick2.
> Brick2 goes down and you edit a 4k file, appending data to it.
> That change, and the fact that there is a pending change, is stored on
> brick1.
> Brick2 returns to service.
> Your app wants to append to the file again. It calls stat on the file.
> Brick2 answers first, stating that the file is 4k long. Your app seeks
> to 4k and writes. Now the data you wrote before is gone.

Forgive my ignorance, but it is obvious that this implementation of stat
on a replicating fs is shit. Of course a stat should await _all_
returning local stats, choose the stat of the _latest_ file version, and
note that the file needs self-heal.

> This is one of the processes by which stale stat data can cause data
> loss. That's why each lookup() (which precedes the stat) causes a
> self-heal check, and why it's a problem that hasn't been resolved in
> the last two years.

Self-heal is no answer to this question. The only valid answer is to
choose the _latest_ file version, no matter whether a self-heal is
necessary or not.

> I don't know the answer. I know that they want this problem to be
> solved, but right now the best solution is hardware. The lower the
> latency, the less of a problem you'll have.

The only solution is correct programming, no matter what the hardware
underneath looks like. The only thing good or bad hardware changes is
how _fast_ the _correct_ answer reaches the fs client. Your description
is satire, isn't it?

> On 01/07/2013 12:59 PM, Dennis Jacobfeuerborn wrote:
> > On 01/07/2013 06:11 PM, Jeff Darcy wrote:
> >> On 01/07/2013 12:03 PM, Dennis Jacobfeuerborn wrote:
> >>> The "gm convert" processes make almost no progress, even though on
> >>> a regular filesystem each call takes only a fraction of a second.
> >> Can you run gm convert under strace? That will give us a more
> >> accurate idea of what kind of I/O it's generating. I recommend both
> >> -t and -T to get timing information as well. Also, it never hurts
> >> to file a bug so we can track/prioritize/etc. Thanks.
> >>
> >> https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
> > Thanks for the strace hint. As it turned out, the gm convert call
> > was issued on the filename with a "[0]" appended, which apparently
> > led gm to stat() all (!) files in the directory.
> >
> > While this particular problem isn't really a glusterfs problem, is
> > there a way to improve stat() performance in general?
> >
> > Regards,
> > Dennis

-- 
Regards,
Stephan
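
P.S.: To make Joe's failure mode concrete, here is a minimal sketch (my
own illustration in C, not GlusterFS code; the mount path is invented)
of the naive append-after-stat pattern. If the stale replica answers the
stat first, st_size is still 4096, and the write at that offset silently
overwrites the data the earlier append placed beyond it:

    /* append_after_stat.c -- my own illustration, not GlusterFS code.
     * Shows the naive "stat, then write at st_size" append pattern.
     * If stat reports a stale size, the write lands too early and
     * silently overwrites data appended while a replica was down. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "/mnt/gluster/testfile"; /* hypothetical */
        struct stat st;
        int fd = open(path, O_WRONLY);
        if (fd < 0 || fstat(fd, &st) < 0) {
            perror(path);
            return 1;
        }
        /* st.st_size is 4096 if the stale replica answered first,
         * although the up-to-date brick holds a larger file. */
        const char buf[] = "second append\n";
        if (pwrite(fd, buf, strlen(buf), st.st_size) < 0)
            perror("pwrite");
        close(fd);
        return 0;
    }

An app that opens with O_APPEND would sidestep the explicit seek, but
Joe's scenario describes exactly this stat-then-seek pattern, so that is
what the sketch shows.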
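
P.P.S.: What I mean by "choose the latest version" is roughly the
selection below -- a sketch under my own assumptions (the reply struct
and the per-file version counter are invented for illustration; this is
not the actual AFR code):

    /* pick_latest.c -- my own sketch of "wait for all replicas, pick
     * the newest answer"; replica_reply and its version counter are
     * hypothetical, not GlusterFS/AFR data structures. */
    #include <stddef.h>
    #include <stdint.h>
    #include <sys/stat.h>

    struct replica_reply {
        struct stat st;      /* stat data reported by this brick      */
        uint64_t    version; /* per-file change counter, hypothetical */
        int         valid;   /* 0 if the brick did not answer         */
    };

    /* Wait for every brick's answer, return the stat of the newest
     * file version, and flag the file for self-heal on divergence. */
    const struct stat *pick_latest(const struct replica_reply *r,
                                   size_t n, int *needs_heal)
    {
        const struct replica_reply *best = NULL;
        *needs_heal = 0;
        for (size_t i = 0; i < n; i++) {
            if (!r[i].valid)
                continue;
            if (best && r[i].version != best->version)
                *needs_heal = 1;   /* replicas diverged */
            if (!best || r[i].version > best->version)
                best = &r[i];
        }
        return best ? &best->st : NULL;
    }

Waiting for all replies costs one round trip to the slowest brick, but
it returns the correct answer; answering from whichever brick is fastest
is what loses data in Joe's scenario above.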
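
P.P.P.S.: For anyone following Jeff's strace suggestion, an invocation
along these lines (the file names are mine) records wall-clock
timestamps (-t) and per-syscall durations (-T), following child
processes (-f) into a log file:

    strace -t -T -f -o gm.trace gm convert input.jpg output.png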