Hi James,

This definitely looks worthy of investigation. Could you file a bug? We
need to get our guys on this. Thanks for doing your homework. Send us the
BZ #, and we'll start poking around.

-JM

----- Original Message -----
> Hey Joe!
>
> Yeah, we are all XFS all the time round here - none of that nasty ext4
> combo that we know causes raised levels of mercury :-)
>
> As for brick errors, we have not seen any; we have been busy grepping
> and alerting on anything suspect in our logs. Mind you, there are
> hundreds of brick logs to search through, so I won't claim we haven't
> missed one, but after asking the boys in chat just now they are pretty
> convinced that was not the smoking gun. I'm sure they will chip in on
> this thread if there is anything.
>
> j.
>
> --
> dr. james cuff, assistant dean for research computing, harvard
> university | division of science | thirty eight oxford street,
> cambridge. ma. 02138 | +1 617 384 7647 | http://rc.fas.harvard.edu
>
> On Wed, Apr 9, 2014 at 10:36 AM, Joe Julian <joe@xxxxxxxxxxxxxxxx> wrote:
> > What's the backend filesystem?
> > Were there any brick errors, probably around 2014-03-31 22:44:04 (half
> > an hour before the frame timeout)?
> >
> > On April 9, 2014 7:10:58 AM PDT, James Cuff <james_cuff@xxxxxxxxxxx> wrote:
> >> Hi team,
> >>
> >> I hate "me too" emails - sometimes they're not at all constructive -
> >> but I feel I really ought to chip in from real-world systems that we
> >> use in anger and at massive scale here.
> >>
> >> So we also use NFS to "mask" this and other performance issues. The
> >> cluster.readdir-optimize option gave us similar results, unfortunately.
> >>
> >> We reported our other challenge back last summer, but we stalled on this:
> >>
> >> http://www.gluster.org/pipermail/gluster-users/2013-June/036252.html
> >>
> >> We also unfortunately now see a new NFS phenotype, pasted below, which
> >> is again causing real heartburn.
> >>
> >> Small files are always difficult for any FS; it might be worth doing
> >> some regression testing with small-file directory scenarios in test -
> >> it's an easy reproducer on even moderately sized gluster clusters. I
> >> hope some good progress can be made, and I understand performance
> >> hangs and issues like this are tough to track down. I just wanted to
> >> say that we really do see them, and have tried many things to avoid
> >> them.
> >>
> >> Here's the note from my team:
> >>
> >> We were hitting 30-minute timeouts on getxattr/system.posix_acl_access
> >> calls on directories in an NFS v3 mount (w/ acl option) of a 10-node,
> >> 40-brick gluster 3.4.0 volume. Strace shows where the client hangs:
> >>
> >> $ strace -tt -T getfacl d6h_take1
> >> ...
> >> 18:43:57.929225 lstat("d6h_take1", {st_mode=S_IFDIR|0755,
> >>   st_size=7024, ...}) = 0 <0.257107>
> >> 18:43:58.186461 getxattr("d6h_take1", "system.posix_acl_access",
> >>   0x7fffdf2b9f50, 132) = -1 ENODATA (No data available) <1806.296893>
> >> 19:14:04.483556 stat("d6h_take1", {st_mode=S_IFDIR|0755,
> >>   st_size=7024, ...}) = 0 <0.642362>
> >> 19:14:05.126025 getxattr("d6h_take1", "system.posix_acl_default",
> >>   0x7fffdf2b9f50, 132) = -1 ENODATA (No data available) <0.000024>
> >> 19:14:05.126114 stat("d6h_take1", {st_mode=S_IFDIR|0755,
> >>   st_size=7024, ...}) = 0 <0.000010>
> >> ...
> >>
> >> Load on the servers was moderate. While the above was hanging, getfacl
> >> worked nearly instantaneously on that directory on all bricks.
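> >>
> >> (For illustration, a minimal sketch of that brick-side spot check -
> >> the host range and the brick-directory glob below are only guesses
> >> based on the log excerpts further down, and the directory path is a
> >> placeholder:)
> >>
> >> $ dir=path/to/d6h_take1   # placeholder: volume-relative path of the hanging dir
> >> $ for h in holyscratch{01..10}; do
> >>     ssh "$h" "for b in /${h}_*/brick; do echo \$b; time getfacl \$b/$dir >/dev/null; done"
> >>   done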
> >>
> >> When it finally hit the 30-minute timeout, gluster logged it in
> >> nfs.log:
> >>
> >> [2014-03-31 23:14:04.481154] E [rpc-clnt.c:207:call_bail]
> >>   0-holyscratch-client-36: bailing out frame type(GlusterFS 3.3)
> >>   op(GETXATTR(18)) xid = 0x8168809x sent = 2014-03-31 22:43:58.442411.
> >>   timeout = 1800
> >> [2014-03-31 23:14:04.481233] W
> >>   [client-rpc-fops.c:1112:client3_3_getxattr_cbk]
> >>   0-holyscratch-client-36: remote operation failed: Transport endpoint
> >>   is not connected. Path: <gfid:b116fb01-b13d-448a-90d0-a8693a98698b>
> >>   (b116fb01-b13d-448a-90d0-a8693a98698b). Key: (null)
> >>
> >> Other than that, we didn't see anything directly related in the nfs or
> >> brick logs, or anything out of sorts with the gluster services. A
> >> couple of other errors raise eyebrows, but these are for different
> >> directories (neighbors of the example above) and at different times:
> >>
> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:30:47.794454]
> >>   I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
> >>   anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:31:47.794447]
> >>   I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
> >>   anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:33:47.802135]
> >>   I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
> >>   anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:34:47.802182]
> >>   I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
> >>   anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:36:47.764329]
> >>   I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
> >>   anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:37:47.773164]
> >>   I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
> >>   anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:39:47.774285]
> >>   I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
> >>   anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:40:47.780338]
> >>   I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
> >>   anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:42:47.730345]
> >>   I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
> >>   anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
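> >>
> >> (If those layout holes turn out to be real rather than transient, the
> >> usual remedy would be a fix-layout rebalance - a sketch only, with the
> >> volume name "holyscratch" inferred from the 0-holyscratch-dht
> >> translator name in the logs:)
> >>
> >> $ gluster volume rebalance holyscratch fix-layout start
> >> $ gluster volume rebalance holyscratch status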
> >>
> >> holyscratch08: /var/log/glusterfs/bricks/holyscratch08_03-brick.log:
> >>   [2014-03-31 00:57:51.973565] E [posix-helpers.c:696:posix_handle_pair]
> >>   0-holyscratch-posix:
> >>   /holyscratch08_03/brick/ramanathan_lab/dhuh/d9_take2_BGI/cuffdiffRN.txt:
> >>   key:system.posix_acl_access error:Invalid argument
> >> holyscratch08: /var/log/glusterfs/bricks/holyscratch08_03-brick.log:
> >>   [2014-03-31 01:18:12.345818] E [posix-helpers.c:696:posix_handle_pair]
> >>   0-holyscratch-posix:
> >>   /holyscratch08_03/brick/ramanathan_lab/dhuh/d9_take2_BGI/cuffdiffRN.txt:
> >>   key:system.posix_acl_access error:Invalid argument
> >> holyscratch05: /var/log/glusterfs/bricks/holyscratch05_04-brick.log:
> >>   [2014-03-31 21:16:37.057674] E [posix-helpers.c:696:posix_handle_pair]
> >>   0-holyscratch-posix:
> >>   /holyscratch05_04/brick/ramanathan_lab/dhuh/d9_take2_BGI/Diffreg/cuffdiffRN.txt:
> >>   key:system.posix_acl_access error:Invalid argument
> >>
> >> --
> >> dr. james cuff, assistant dean for research computing, harvard
> >> university | division of science | thirty eight oxford street,
> >> cambridge. ma. 02138 | +1 617 384 7647 | http://rc.fas.harvard.edu
> >>
> >> On Wed, Apr 9, 2014 at 9:52 AM, <james.bellinger@xxxxxxxxxxxxxxxx> wrote:
> >>> I am seeing something perhaps similar: 3.4.2-1, two servers, each
> >>> with one brick, replicated. A du of a local (ZFS) directory tree of
> >>> 297834 files and 525GB takes about 17 minutes. A du of the gluster
> >>> copy is still not finished after 22 hours. Network activity had been
> >>> about 5-6KB/sec until (I gather) du hit a directory with 22450 files,
> >>> when activity jumped to 300KB/sec (200 packets/sec) for about 15-20
> >>> minutes. If I assume that the spike came from scanning the two
> >>> largest directories, that works out to about 8K of traffic and about
> >>> 5 packets per file.
> >>>
> >>> A 3.3.2 gluster installation that we are trying to retire is not
> >>> afflicted this way.
> >>>
> >>> James Bellinger
> >>>
> >>>> Am I the only person using Gluster suffering from very slow
> >>>> directory access? It's so seriously bad that it almost makes
> >>>> Gluster unusable.
> >>>>
> >>>> Using NFS instead of the Fuse client masks the problem as long as
> >>>> the directories are cached, but it's still hellishly slow when you
> >>>> first access them.
> >>>>
> >>>> Has there been any progress at all on fixing this bug?
> >>>>
> >>>> https://bugzilla.redhat.com/show_bug.cgi?id=1067256
> >>>>
> >>>> Cheers,
> >
> > --
> > Sent from my Android device with K-9 Mail. Please excuse my brevity.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users