Re: parallel-readdir is not recognized in GlusterFS 3.12.4

Alan Orth <alan.orth@xxxxxxxxx> · Fri, 26 Jan 2018 11:59:54 +0000

Dear Vlad,

I'm sorry, I don't want to test this again on my system just yet! It caused too much instability for my users and I don't have enough resources for a development environment. The only other variable that changed before the crashes was the group metadata-cache[0], which I enabled the same day as the parallel-readdir and readdir-ahead options:

$ gluster volume set homes group metadata-cache
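
For anyone retracing this: as far as I understand it, `group metadata-cache` applies a set of key=value options from a group file on the server, so it changes several settings at once. Below is a rough sketch for listing what the volume currently has for each of those keys. The group file location (I believe /var/lib/glusterd/groups/metadata-cache), the `show_group_options` helper and the GLUSTER override are my assumptions for illustration, not something taken from the release notes:

```shell
#!/bin/sh
GLUSTER="${GLUSTER:-gluster}"   # assumption: override with GLUSTER=echo for a dry run
VOLUME="${VOLUME:-homes}"       # volume name used in this thread

# Hypothetical helper: print the current value of every option named in a
# group file. $1 = path to the group file (one key=value pair per line).
show_group_options() {
    while IFS='=' read -r key _value || [ -n "$key" ]; do
        # Skip blank lines and comments.
        case "$key" in
            ''|'#'*) continue ;;
        esac
        # stdin is redirected so the command cannot swallow the rest of the file.
        $GLUSTER volume get "$VOLUME" "$key" < /dev/null
    done < "$1"
}
```

Usage would be something like `show_group_options /var/lib/glusterd/groups/metadata-cache`, run on one of the servers.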

I'm hoping Atin or Poornima can shed some light and squash this bug.

[0] https://github.com/gluster/glusterfs/blob/release-3.11/doc/release-notes/3.11.0.md

Regards,

On Fri, Jan 26, 2018 at 6:10 AM Vlad Kopylov <vladkopy@xxxxxxxxx> wrote:
Can you please test whether parallel-readdir or readdir-ahead causes the disconnects, so we know which one to disable?

parallel-readdir is doing magic, going by this PDF from last year that I ran into:

https://events.static.linuxfound.org/sites/events/files/slides/Gluster_DirPerf_Vault2017_0.pdf
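
One way to run that test, sketched under some assumptions: the full option names are performance.readdir-ahead and performance.parallel-readdir, the volume is `homes`, and parallel-readdir expects readdir-ahead to be on (that is my understanding, so the order below never leaves parallel-readdir on by itself). `set_stage` and the GLUSTER override are made-up helpers for illustration:

```shell
#!/bin/sh
GLUSTER="${GLUSTER:-gluster}"   # assumption: override with GLUSTER=echo for a dry run
VOLUME="${VOLUME:-homes}"       # volume name used in this thread

# Hypothetical helper: set both options in a safe order.
# $1 = value for readdir-ahead, $2 = value for parallel-readdir (on|off).
set_stage() {
    if [ "$2" = "on" ]; then
        # Enabling parallel-readdir: make sure readdir-ahead is set first.
        $GLUSTER volume set "$VOLUME" performance.readdir-ahead "$1"
        $GLUSTER volume set "$VOLUME" performance.parallel-readdir "$2"
    else
        # Disabling parallel-readdir: turn it off before touching readdir-ahead.
        $GLUSTER volume set "$VOLUME" performance.parallel-readdir "$2"
        $GLUSTER volume set "$VOLUME" performance.readdir-ahead "$1"
    fi
}
```

Run `set_stage on off` and watch the clients for a few days; if they stay connected, `set_stage on on` brings parallel-readdir back into the picture. `set_stage off off` returns to the configuration that has been stable.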

-v

On Thu, Jan 25, 2018 at 8:20 AM, Alan Orth <alan.orth@xxxxxxxxx> wrote:

> By the way, on a slightly related note, I'm pretty sure either
> parallel-readdir or readdir-ahead has a regression in GlusterFS 3.12.x. We
> are running CentOS 7 with kernel-3.10.0-693.11.6.el7.x86_64.
>
> I updated my servers and clients to 3.12.4 and enabled these two options
> after reading about them in the 3.10.0 and 3.11.0 release notes. In the days
> after enabling these two options all of my clients kept getting disconnected
> from the volume. The error upon attempting to list a directory or read a
> file was "Transport endpoint is not connected", after which I would force
> unmount the volume with `umount -fl /home` and remount it, only to have it
> get disconnected again a few hours later.

>

> Every time the volume disconnected I looked in the client mount log and only
> found information such as:
>
> [2018-01-24 05:52:27.695225] I [MSGID: 108026]
> [afr-self-heal-common.c:1656:afr_log_selfheal] 2-homes-replicate-1:
> Completed metadata selfheal on ed3fbafc-734b-41ca-ab30-216399fb9168.
> sources=[0]  sinks=1
> [2018-01-24 05:52:27.700611] I [MSGID: 108026]
> [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do]
> 2-homes-replicate-1: performing metadata selfheal on
> b6a53629-a831-4ee3-a35e-f47c04297aaa
> [2018-01-24 05:52:27.703021] I [MSGID: 108026]
> [afr-self-heal-common.c:1656:afr_log_selfheal] 2-homes-replicate-1:
> Completed metadata selfheal on b6a53629-a831-4ee3-a35e-f47c04297aaa.
> sources=[0]  sinks=1

>

> I enabled debug logging for that volume's client mount with `gluster volume
> set homes diagnostics.client-log-level DEBUG` and then I saw this in the
> client mount log the next time it disconnected:
>
> [2018-01-24 08:55:19.138810] D [MSGID: 0] [io-threads.c:358:iot_schedule]
> 0-homes-io-threads: LOOKUP scheduled as fast fop
> [2018-01-24 08:55:19.138849] D [MSGID: 0] [dht-common.c:2711:dht_lookup]
> 0-homes-dht: Calling fresh lookup for
> /vchebii/revtrans/Hircus-XM_018067032.1.pep.align.fas on
> homes-readdir-ahead-1
> [2018-01-24 08:55:19.138928] D [MSGID: 0] [io-threads.c:358:iot_schedule]
> 0-homes-io-threads: FSTAT scheduled as fast fop
> [2018-01-24 08:55:19.138958] D [MSGID: 0] [afr-read-txn.c:220:afr_read_txn]
> 0-homes-replicate-1: e6ee0427-b17d-4464-a738-e8ea70d77d95: generation now vs
> cached: 2, 2
> [2018-01-24 08:55:19.139187] D [MSGID: 0] [dht-common.c:2294:dht_lookup_cbk]
> 0-homes-dht: fresh_lookup returned for
> /vchebii/revtrans/Hircus-XM_018067032.1.pep.align.fas with op_ret 0
> [2018-01-24 08:55:19.139200] D [MSGID: 0]
> [dht-layout.c:873:dht_layout_preset] 0-homes-dht: file =
> 00000000-0000-0000-0000-000000000000, subvol = homes-readdir-ahead-1
> [2018-01-24 08:55:19.139257] D [MSGID: 0] [io-threads.c:358:iot_schedule]
> 0-homes-io-threads: READDIRP scheduled as fast fop

>

> On a hunch I disabled both parallel-readdir and readdir-ahead, which I had
> only enabled a few days before, and now all of the clients are much more
> stable, with zero disconnections in the days since I disabled those two
> volume options.
>
> Please take a look! Thanks,

>

> On Wed, Jan 24, 2018 at 5:59 AM Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:

>>

>> Adding Poornima to take a look at it and comment.

>>

>> On Tue, Jan 23, 2018 at 10:39 PM, Alan Orth <alan.orth@xxxxxxxxx> wrote:

>>>

>>> Hello,
>>>
>>> I saw that parallel-readdir was an experimental feature in GlusterFS
>>> version 3.10.0[0], became stable in version 3.11.0[1], and is now
>>> recommended for small file workloads in the Red Hat Gluster Storage
>>> Server documentation[2]. I've successfully enabled this on one of my
>>> volumes but I notice the following in the client mount log:
>>>
>>> [2018-01-23 10:24:24.048055] W [MSGID: 101174]
>>> [graph.c:363:_log_if_unknown_option] 0-homes-readdir-ahead-1: option
>>> 'parallel-readdir' is not recognized
>>> [2018-01-23 10:24:24.048072] W [MSGID: 101174]
>>> [graph.c:363:_log_if_unknown_option] 0-homes-readdir-ahead-0: option
>>> 'parallel-readdir' is not recognized
>>>
>>> The GlusterFS version on the client and server is 3.12.4. What is going
>>> on?

>>>

>>> [0] https://github.com/gluster/glusterfs/blob/release-3.10/doc/release-notes/3.10.0.md
>>> [1] https://github.com/gluster/glusterfs/blob/release-3.11/doc/release-notes/3.11.0.md
>>> [2] https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/administration_guide/small_file_performance_enhancements

>>>

>>> Thank you,

>>>

>>>

>>> --

>>>

>>> Alan Orth

>>> alan.orth@xxxxxxxxx

>>> https://picturingjordan.com

>>> https://englishbulgaria.net

>>> https://mjanja.ch

>>>

>>>

>>> _______________________________________________

>>> Gluster-users mailing list

>>> Gluster-users@xxxxxxxxxxx

>>> http://lists.gluster.org/mailman/listinfo/gluster-users

>>

>>


-- 
Alan Orth

alan.orth@xxxxxxxxx

https://picturingjordan.com

https://englishbulgaria.net

https://mjanja.ch
