It does indeed look like a bug in the RDMA protocol implementation. I
tested TCP and it works fine. It's a huge bummer for me because my network
links work pretty solidly at 50Gb/s in RDMA, while IPoIB gives me (in the
best case) less than 30Gb/s :/

Also, I couldn't create a volume with transport mode tcp,rdma. The logs
aren't very helpful: they just say "failed to create volfile" and "could
not generate gfproxy client volfiles". If I create a TCP volume I can later
change it to tcp,rdma, but the change doesn't apply cleanly, and the
resulting volume doesn't work (one might be able to mount it, but writes
always fail).
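
For anyone who wants to reproduce the small-write behaviour quoted further
down, here is a minimal sketch in Python, not a definitive test. It writes
one file per size into a directory on the mounted volume, syncs it, and
checks that the same bytes read back. The function name probe_write_sizes
is made up for illustration, and the only figure taken from this thread is
the 340-byte boundary. Client-side caching could mask a failure, so treat
a "True" as a hint rather than proof.

```python
import os


def probe_write_sizes(directory, sizes):
    """Write one file per size and report whether it reads back intact.

    Returns a dict mapping each size to True (round-trip OK) or False
    (an OSError was raised, or the data read back did not match).
    """
    results = {}
    for size in sizes:
        path = os.path.join(directory, f"probe_{size}.bin")
        payload = b"x" * size
        try:
            # Write, flush and fsync so the data is pushed to the
            # filesystem instead of sitting in a userspace buffer.
            with open(path, "wb") as fh:
                fh.write(payload)
                fh.flush()
                os.fsync(fh.fileno())
            # Re-open the file and compare its content to the payload.
            with open(path, "rb") as fh:
                results[size] = fh.read() == payload
        except OSError:
            # For example ENOTCONN ("Transport endpoint is not connected").
            results[size] = False
        finally:
            # Best-effort cleanup; ignore errors on a broken mount.
            try:
                os.remove(path)
            except OSError:
                pass
    return results
```

Calling probe_write_sizes("/mnt/gfs", [1, 100, 339, 340, 4096]) would then
show which sizes survive the round trip (/mnt/gfs is a hypothetical mount
point; substitute your own). On the rdma volume described below I would
expect the sizes under 340 to come back False, but I have not run this
exact script there.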
Lindolfo Meira, MSc
Diretor Geral, Centro Nacional de Supercomputação
Universidade Federal do Rio Grande do Sul
+55 (51) 3308-3139
On Thu, 24 Jan 2019, Jim Kinney wrote:
> I have rdma capability. Will test and report back. I'm still on v 3.12.
>
> On January 24, 2019 12:54:26 AM EST, Amar Tumballi Suryanarayan <atumball@xxxxxxxxxx> wrote:
> >I suspect this is a bug with the 'Transport: rdma' part. We have called out
> >for de-scoping that feature as we are lacking experts in that domain right
> >now. I recommend you use the IPoIB option with the tcp/socket transport type
> >(which is the default). That should mostly fix all the issues.
> >
> >-Amar
> >
> >On Thu, Jan 24, 2019 at 5:31 AM Jim Kinney <jim.kinney@xxxxxxxxx> wrote:
> >
> >> That really sounds like a bug with the sharding. I'm not using sharding on
> >> my setup and files are writeable (vim) with 2 bytes and no errors occur.
> >> Perhaps the small size is cached until it's large enough to trigger a write.
> >>
> >> On Wed, 2019-01-23 at 21:46 -0200, Lindolfo Meira wrote:
> >>
> >> Also I noticed that any subsequent write (after the first write with 340
> >> bytes or more), regardless of the size, will work as expected.
> >>
> >> On Wed, 23 Jan 2019, Lindolfo Meira wrote:
> >>
> >>
> >> Just checked: when the write is >= 340 bytes, everything works as
> >> expected. If the write is smaller, the error takes place. And when it
> >> does, nothing is logged on the server. The client, however, logs the
> >> following:
> >>
> >>
> >> [2019-01-23 23:28:54.554664] W [MSGID: 103046]
> >> [rdma.c:3502:gf_rdma_decode_header] 0-rpc-transport/rdma: received a msg
> >> of type RDMA_ERROR
> >>
> >> [2019-01-23 23:28:54.554728] W [MSGID: 103046]
> >> [rdma.c:3939:gf_rdma_process_recv] 0-rpc-transport/rdma: peer
> >> (172.24.1.6:49152), couldn't encode or decode the msg properly or write
> >> chunks were not provided for replies that were bigger than
> >> RDMA_INLINE_THRESHOLD (2048)
> >>
> >> [2019-01-23 23:28:54.554765] W [MSGID: 114031]
> >> [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-gfs-client-5: remote
> >> operation failed [Transport endpoint is not connected]
> >>
> >> [2019-01-23 23:28:54.554850] W [fuse-bridge.c:1436:fuse_err_cbk]
> >> 0-glusterfs-fuse: 1723199: FLUSH() ERR => -1 (Transport endpoint is not
> >> connected)
> >>
> >> On Wed, 23 Jan 2019, Lindolfo Meira wrote:
> >>
> >>
> >> Hi Jim. Thanks for taking the time.
> >>
> >>
> >> Sorry I didn't express myself properly. It's not a simple matter of
> >> permissions. Users can write to the volume alright. It's when vim and nano
> >> are used, or when small file writes are performed (by cat or echo), that
> >> it doesn't work. The file is updated with the write on the server, but it
> >> shows up as empty on the client.
> >>
> >> I guess it has something to do with the size of the write, because I ran a
> >> test writing to a file one byte at a time, and it never showed up as
> >> having any content on the client (although on the server it kept growing
> >> accordingly).
> >>
> >> I should point out that I'm using a sharded volume. But when I was testing
> >> a striped volume, it also happened. Output of "gluster volume info"
> >> follows below:
> >>
> >>
> >> Volume Name: gfs
> >> Type: Distribute
> >> Volume ID: b5ef065f-1ba2-481f-8108-e8f6d2d3f036
> >> Status: Started
> >> Snapshot Count: 0
> >> Number of Bricks: 6
> >> Transport-type: rdma
> >> Bricks:
> >> Brick1: pfs01-ib:/mnt/data
> >> Brick2: pfs02-ib:/mnt/data
> >> Brick3: pfs03-ib:/mnt/data
> >> Brick4: pfs04-ib:/mnt/data
> >> Brick5: pfs05-ib:/mnt/data
> >> Brick6: pfs06-ib:/mnt/data
> >> Options Reconfigured:
> >> nfs.disable: on
> >> features.shard: on
> >>
> >> On Wed, 23 Jan 2019, Jim Kinney wrote:
> >>
> >>
> >> Check permissions on the mount. I have multiple dozens of systems
> >> mounting 18 "exports" using fuse and it works for multiple user
> >> read/write based on user access permissions to the mount point space.
> >> /home is mounted for 150+ users plus another dozen+ lab storage spaces.
> >> I do manage user access with freeIPA across all systems to keep things
> >> consistent.
> >>
> >> On Wed, 2019-01-23 at 19:31 -0200, Lindolfo Meira wrote:
> >>
> >> Am I missing something here? A mere write operation, using vim or
> >> nano, cannot be performed on a gluster volume mounted over fuse! What
> >> gives?
> >>
> >> Lindolfo Meira, MSc
> >> Diretor Geral, Centro Nacional de Supercomputação
> >> Universidade Federal do Rio Grande do Sul
> >> +55 (51) 3308-3139
> >>
> >> _______________________________________________
> >> Gluster-users mailing list
> >> Gluster-users@xxxxxxxxxxx
> >> https://lists.gluster.org/mailman/listinfo/gluster-users
> >>
> >>
> >> --
> >> James P. Kinney III
> >>
> >> Every time you stop a school, you will have to build a jail. What you
> >> gain at one end you lose at the other. It's like feeding a dog on his
> >> own tail. It won't fatten the dog.
> >> - Speech 11/23/1900 Mark Twain
> >>
> >> http://heretothereideas.blogspot.com/
> >>
> >
> >
> >
> >--
> >Amar Tumballi (amarts)
>
> --
> Sent from my Android device with K-9 Mail. All tyopes are thumb related and reflect authenticity.