Sure thing, #1086303: https://bugzilla.redhat.com/show_bug.cgi?id=1086303

On Thu, Apr 10, 2014 at 7:04 AM, John Mark Walker <jowalker@xxxxxxxxxx> wrote:
> Hi James,
>
> This definitely looks worthy of investigation. Could you file a bug? We
> need to get our guys on this.
>
> Thanks for doing your homework. Send us the BZ #, and we'll start poking
> around.
>
> -JM
>
>
> ----- Original Message -----
>> Hey Joe!
>>
>> Yeah, we are all XFS all the time round here - none of that nasty ext4
>> combo that we know causes raised levels of mercury :-)
>>
>> As for brick errors, we have not seen any; we have been busy grepping
>> and alerting on anything suspect in our logs. Mind you, there are
>> hundreds of brick logs to search through, so I won't say we couldn't
>> have missed one, but after asking the boys in chat just now they are
>> pretty convinced that was not the smoking gun. I'm sure they will chip
>> in on this thread if there is anything.
>>
>>
>> j.
>>
>> --
>> dr. james cuff, assistant dean for research computing, harvard
>> university | division of science | thirty eight oxford street,
>> cambridge. ma. 02138 | +1 617 384 7647 | http://rc.fas.harvard.edu
>>
>>
>> On Wed, Apr 9, 2014 at 10:36 AM, Joe Julian <joe@xxxxxxxxxxxxxxxx> wrote:
>> > What's the backend filesystem?
>> > Were there any brick errors, probably around 2014-03-31 22:44:04 (half
>> > an hour before the frame timeout)?
>> >
>> >
>> > On April 9, 2014 7:10:58 AM PDT, James Cuff <james_cuff@xxxxxxxxxxx> wrote:
>> >>
>> >> Hi team,
>> >>
>> >> I hate "me too" emails, which are sometimes not at all constructive,
>> >> but I feel I really ought to chip in from real-world systems that we
>> >> use in anger and at massive scale here.
>> >>
>> >> So we also use NFS to "mask" this and other performance issues.
>> >> Setting cluster.readdir-optimize gave us similar results,
>> >> unfortunately.
>> >>
>> >> We reported our other challenge back last summer, but we stalled on
>> >> this one:
>> >>
>> >> http://www.gluster.org/pipermail/gluster-users/2013-June/036252.html
>> >>
>> >> We also now see a new NFS phenotype, pasted below, which again is
>> >> causing real heartburn.
>> >>
>> >> Small files are always difficult for any FS; it might be worth doing
>> >> some regression testing with small-file directory scenarios - it's an
>> >> easy reproducer on even moderately sized gluster clusters. I hope
>> >> some good progress can be made; I understand performance hangs like
>> >> these are tough to track down. I just wanted to say that we really do
>> >> see them, and have tried many things to avoid them.
>> >>
>> >> Here's the note from my team:
>> >>
>> >> We were hitting 30 minute timeouts on getxattr/system.posix_acl_access
>> >> calls on directories in an NFS v3 mount (w/ acl option) of a 10-node,
>> >> 40-brick gluster 3.4.0 volume. Strace shows where the client hangs:
>> >>
>> >> $ strace -tt -T getfacl d6h_take1
>> >> ...
>> >> 18:43:57.929225 lstat("d6h_take1", {st_mode=S_IFDIR|0755,
>> >> st_size=7024, ...}) = 0 <0.257107>
>> >> 18:43:58.186461 getxattr("d6h_take1", "system.posix_acl_access",
>> >> 0x7fffdf2b9f50, 132) = -1 ENODATA (No data available) <1806.296893>
>> >> 19:14:04.483556 stat("d6h_take1", {st_mode=S_IFDIR|0755, st_size=7024,
>> >> ...}) = 0 <0.642362>
>> >> 19:14:05.126025 getxattr("d6h_take1", "system.posix_acl_default",
>> >> 0x7fffdf2b9f50, 132) = -1 ENODATA (No data available) <0.000024>
>> >> 19:14:05.126114 stat("d6h_take1", {st_mode=S_IFDIR|0755, st_size=7024,
>> >> ...}) = 0 <0.000010>
>> >> ...
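>> >>
>> >> If anyone wants to sweep a mount for this, here is a rough sketch of
>> >> the timing check we use - the mount point and threshold below are
>> >> illustrative, not our real values:
>> >>
>> >>   MOUNT=/mnt/glustervol   # example NFS mount point
>> >>   THRESHOLD=5             # seconds; flag anything slower than this
>> >>   find "$MOUNT" -type d | while read -r dir; do
>> >>       start=$(date +%s)
>> >>       # getfacl issues the same getxattr(system.posix_acl_access)
>> >>       # call that hangs in the strace above
>> >>       getfacl --absolute-names "$dir" > /dev/null 2>&1
>> >>       elapsed=$(( $(date +%s) - start ))
>> >>       [ "$elapsed" -gt "$THRESHOLD" ] && echo "SLOW: ${elapsed}s $dir"
>> >>   done
>> >>
>> >> A hang like the one above shows up as "SLOW: 1806s <dir>".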
>> >>
>> >> Load on the servers was moderate. While the above was hanging,
>> >> getfacl worked nearly instantaneously on that directory on all bricks.
>> >> When it finally hit the 30 minute timeout, gluster logged it in
>> >> nfs.log:
>> >>
>> >> [2014-03-31 23:14:04.481154] E [rpc-clnt.c:207:call_bail]
>> >> 0-holyscratch-client-36: bailing out frame type(GlusterFS 3.3)
>> >> op(GETXATTR(18)) xid = 0x8168809x sent = 2014-03-31 22:43:58.442411.
>> >> timeout = 1800
>> >> [2014-03-31 23:14:04.481233] W
>> >> [client-rpc-fops.c:1112:client3_3_getxattr_cbk]
>> >> 0-holyscratch-client-36: remote operation failed: Transport endpoint
>> >> is not connected. Path: <gfid:b116fb01-b13d-448a-90d0-a8693a98698b>
>> >> (b116fb01-b13d-448a-90d0-a8693a98698b). Key: (null)
>> >>
>> >> Other than that, we didn't see anything directly related in the nfs
>> >> or brick logs, or anything out of sorts with the gluster services. A
>> >> couple of other errors raise eyebrows, but these involve different
>> >> directories (neighbors of the example above) at different times:
>> >>
>> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:30:47.794454]
>> >> I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
>> >> anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1
>> >> overlaps=0
>> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:31:47.794447]
>> >> I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
>> >> anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1
>> >> overlaps=0
>> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:33:47.802135]
>> >> I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
>> >> anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1
>> >> overlaps=0
>> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:34:47.802182]
>> >> I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
>> >> anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1
>> >> overlaps=0
>> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:36:47.764329]
>> >> I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
>> >> anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1
>> >> overlaps=0
>> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:37:47.773164]
>> >> I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
>> >> anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1
>> >> overlaps=0
>> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:39:47.774285]
>> >> I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
>> >> anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1
>> >> overlaps=0
>> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:40:47.780338]
>> >> I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
>> >> anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1
>> >> overlaps=0
>> >> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:42:47.730345]
>> >> I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found
>> >> anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1
>> >> overlaps=0
>> >>
>> >> holyscratch08:
>> >> /var/log/glusterfs/bricks/holyscratch08_03-brick.log:[2014-03-31
>> >> 00:57:51.973565] E [posix-helpers.c:696:posix_handle_pair]
>> >> 0-holyscratch-posix:
>> >> /holyscratch08_03/brick/ramanathan_lab/dhuh/d9_take2_BGI/cuffdiffRN.txt:
>> >> key:system.posix_acl_access error:Invalid argument
>> >> holyscratch08:
>> >> /var/log/glusterfs/bricks/holyscratch08_03-brick.log:[2014-03-31
>> >> 01:18:12.345818] E [posix-helpers.c:696:posix_handle_pair]
>> >> 0-holyscratch-posix:
>> >> /holyscratch08_03/brick/ramanathan_lab/dhuh/d9_take2_BGI/cuffdiffRN.txt:
>> >> key:system.posix_acl_access error:Invalid argument
>> >> holyscratch05:
>> >> /var/log/glusterfs/bricks/holyscratch05_04-brick.log:[2014-03-31
>> >> 21:16:37.057674] E [posix-helpers.c:696:posix_handle_pair]
>> >> 0-holyscratch-posix:
>> >> /holyscratch05_04/brick/ramanathan_lab/dhuh/d9_take2_BGI/Diffreg/cuffdiffRN.txt:
>> >> key:system.posix_acl_access error:Invalid argument
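>> >>
>> >> For what it's worth, the "timeout = 1800" in the call_bail line
>> >> matches the default network.frame-timeout of 1800 seconds. We swept
>> >> every node for bailed-out frames with something along these lines -
>> >> the holyscratch01..10 hostnames are assumed here from our 10-node
>> >> layout:
>> >>
>> >>   for h in holyscratch{01..10}; do
>> >>       # check nfs and brick logs on each node, prefixing any hits
>> >>       # with the host they came from
>> >>       ssh "$h" "grep call_bail /var/log/glusterfs/nfs.log \
>> >>           /var/log/glusterfs/bricks/*.log 2>/dev/null" | sed "s/^/$h: /"
>> >>   done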
>> >>
>> >> --
>> >> dr. james cuff, assistant dean for research computing, harvard
>> >> university | division of science | thirty eight oxford street,
>> >> cambridge. ma. 02138 | +1 617 384 7647 | http://rc.fas.harvard.edu
>> >>
>> >>
>> >> On Wed, Apr 9, 2014 at 9:52 AM, <james.bellinger@xxxxxxxxxxxxxxxx> wrote:
>> >>>
>> >>> I am seeing something perhaps similar: 3.4.2-1, two servers, each
>> >>> with one brick, replicated. A du of a local (ZFS) directory tree of
>> >>> 297,834 files and 525 GB takes about 17 minutes. A du of the gluster
>> >>> copy is still not finished after 22 hours. Network activity had been
>> >>> about 5-6 KB/sec until (I gather) du hit a directory with 22,450
>> >>> files, when activity jumped to 300 KB/sec (200 packets/sec) for
>> >>> about 15-20 minutes. If I assume that the spike came from scanning
>> >>> the two largest directories, that looks like about 8 KB of traffic
>> >>> per file, and about 5 packets (300 KB/s over ~18 minutes is roughly
>> >>> 320 MB and 200,000 packets, spread over some 40,000 files).
>> >>>
>> >>> A 3.3.2 gluster installation that we are trying to retire is not
>> >>> afflicted this way.
>> >>>
>> >>> James Bellinger
>> >>>
>> >>>>
>> >>>> Am I the only person using Gluster suffering from very slow
>> >>>> directory access? It's so seriously bad that it almost makes
>> >>>> Gluster unusable.
>> >>>>
>> >>>> Using NFS instead of the Fuse client masks the problem as long as
>> >>>> the directories are cached, but it's still hellishly slow when you
>> >>>> first access them.
>> >>>>
>> >>>> Has there been any progress at all fixing this bug?
>> >>>>
>> >>>> https://bugzilla.redhat.com/show_bug.cgi?id=1067256
>> >>>>
>> >>>> Cheers,
>> >
>> > --
>> > Sent from my Android device with K-9 Mail. Please excuse my brevity.
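P.S. The posix_handle_pair "Invalid argument" errors in our brick logs can
be cross-checked on the bricks directly; a quick sketch, reusing one of
the paths from the logs above (substitute your own brick path):

  # on the brick server, read the ACL xattr straight off the backend FS
  getfattr -n system.posix_acl_access -e hex \
      /holyscratch08_03/brick/ramanathan_lab/dhuh/d9_take2_BGI/cuffdiffRN.txt
  # and compare with what getfacl reports locally on the brick
  getfacl /holyscratch08_03/brick/ramanathan_lab/dhuh/d9_take2_BGI/cuffdiffRN.txt

If the xattr reads cleanly on every brick but hangs through the volume,
the problem is in transit rather than on disk - which is what we saw, and
why this is now BZ #1086303.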
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users