Please help me troubleshoot GlusterFS with the following setup:
A distributed volume without replication, with sharding enabled.
# cat /etc/centos-release
CentOS release 6.9 (Final)
# glusterfs --version
glusterfs 3.12.3
[root@master-5f81bad0054a11e8bf7d0671029ed6b8 uploads]# gluster volume info
Volume Name: gv0
Type: Distribute
Volume ID: 1a7e05f6-4aa8-48d3-b8e3-300637031925
Status: Started
Snapshot Count: 0
Number of Bricks: 27
Transport-type: tcp
Bricks:
Brick1: gluster3.qencode.com:/var/storage/brick/gv0
Brick2: encoder-376cac0405f311e884700671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick3: encoder-ee6761c0091c11e891ba0671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick4: encoder-ee68b8ea091c11e89c2d0671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick5: encoder-ee663700091c11e8b48f0671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick6: encoder-efcf113e091c11e899520671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick7: encoder-efcd5a24091c11e8963a0671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick8: encoder-099f557e091d11e882f70671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick9: encoder-099bdda4091d11e881090671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick10: encoder-099dca56091d11e8b3410671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick11: encoder-09a1ba4e091d11e8a3c20671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick12: encoder-099a826a091d11e895940671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick13: encoder-0998aa8a091d11e8a8160671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick14: encoder-0b582724091d11e8b3b40671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick15: encoder-0dff527c091d11e896f20671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick16: encoder-0e0d5c14091d11e886cf0671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick17: encoder-7f1bf3d4093b11e8a3580671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick18: encoder-7f70378c093b11e885260671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick19: encoder-7f19528c093b11e88f100671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick20: encoder-7f76c048093b11e8a7470671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick21: encoder-7f7fc90e093b11e8a74e0671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick22: encoder-7f6bc382093b11e8b8a30671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick23: encoder-7f7b44d8093b11e8906f0671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick24: encoder-7f72aa30093b11e89a8e0671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick25: encoder-7f7d735c093b11e8b4650671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick26: encoder-7f1a5006093b11e89bcb0671029ed6b8.qencode.com:/var/storage/brick/gv0
Brick27: encoder-95791076093b11e8af170671029ed6b8.qencode.com:/var/storage/brick/gv0
Options Reconfigured:
cluster.min-free-disk: 10%
performance.cache-max-file-size: 1048576
nfs.disable: on
transport.address-family: inet
features.shard: on
performance.client-io-threads: on
Each brick is 15 GB in size.
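As a rough sanity check on capacity, here is a small sketch of the arithmetic implied by the numbers above. It assumes every brick really has the full 15 GB available (counted as 15 * 1024 MB) and that cluster.min-free-disk: 10% effectively sets aside 10% of each brick for new-file placement. The variable names are made up for illustration only.

```shell
#!/bin/sh
# Values taken from the volume info above; variable names are illustrative.
bricks=27          # "Number of Bricks: 27"
brick_gb=15        # assumed usable size of each brick
min_free_pct=10    # "cluster.min-free-disk: 10%"

# Raw capacity of the whole distributed volume, in GB.
total_gb=$((bricks * brick_gb))

# Work in MB so the 10% reserve is a whole number (assumes 1 GB = 1024 MB).
brick_mb=$((brick_gb * 1024))
reserve_mb=$((brick_mb * min_free_pct / 100))
usable_mb=$((bricks * (brick_mb - reserve_mb)))

echo "raw capacity:      ${total_gb} GB"
echo "reserve per brick: ${reserve_mb} MB"
echo "usable (approx.):  ${usable_mb} MB"
```

These figures can be compared with what `df -h /var/storage/brick/gv0` and `gluster volume status gv0 detail` report on each node.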
After using the volume for several hours with intensive read/write operations (~300 GB written and then deleted), an attempt to write to the volume results in an Input/output error:
# wget https://speed.hetzner.de/1GB.bin
--2018-02-04 12:02:34-- https://speed.hetzner.de/1GB.bin
Resolving speed.hetzner.de... 88.198.248.254, 2a01:4f8:0:59ed::2
Connecting to speed.hetzner.de|88.198.248.254|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1048576000 (1000M) [application/octet-stream]
Saving to: `1GB.bin'
38% [=============================================================> ] 403,619,518 27.8M/s in 15s
Cannot write to `1GB.bin' (Input/output error).
I don't see anything written to glusterd.log or to any other log under /var/log/glusterfs/ when this error occurs.
Deleting the partially downloaded file works without error.
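For reference, a minimal sketch of how the FUSE client log can be located and filtered for errors and warnings. It assumes the volume is mounted at the hypothetical path /mnt/gv0 (the real mount point is not shown above) and relies on the usual naming of the client log after the mount path, with slashes replaced by hyphens. The helper name mount_log_path is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: map a mount point to the FUSE client log path,
# e.g. /mnt/gv0 -> /var/log/glusterfs/mnt-gv0.log
mount_log_path() {
    # Drop the leading slash, then turn the remaining slashes into hyphens.
    name=$(printf '%s' "${1#/}" | tr '/' '-')
    printf '/var/log/glusterfs/%s.log\n' "$name"
}

# /mnt/gv0 is an assumed mount point; substitute the real one.
log=$(mount_log_path /mnt/gv0)
echo "client log: $log"

# Show the most recent error (E) and warning (W) lines, if the log is readable.
if [ -r "$log" ]; then
    grep -E '\] (E|W) \[' "$log" | tail -n 20
fi
```

If the log does not exist under that name, the mount point is probably different from the one assumed here.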
Thanks,
Nikita Yeryomin