Hi Dan,
On Mon, Jul 12, 2021 at 2:20 PM Dan Thomson <dthomson@xxxxxxxxx> wrote:
Hi gluster users,
I'm having an issue that I'm hoping to get some help with on a
dispersed volume (EC: 2x(4+2)) that's causing me some headaches. This is
on a cluster running Gluster 6.9 on CentOS 7.
At some point in the last week, writes to one of my bricks started
failing with a "No Space Left on Device" error:
[2021-07-06 16:08:57.261307] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-gluster-01-server: 1853436561: WRITEV -2 (f2d6f2f8-4fd7-4692-bd60-23124897be54), client: CTX_ID:648a7383-46c8-4ed7-a921-acafc90bec1a-GRAPH_ID:4-PID:19471-HOST:rhevh08.mgmt.triumf.ca-PC_NAME:gluster-01-client-5-RECON_NO:-5, error-xlator: gluster-01-posix [No space left on device]
The disk is quite full (listed as 100% on the server), but does have
some writable room left:
/dev/mapper/vg--brick1-brick1 11T 11T 97G 100% /data/glusterfs/gluster-01/brick1
However, I'm not sure whether the amount of disk space used on the physical
drive is the true cause of the "No Space Left on Device" errors anyway.
I can still manually write to this brick outside of Gluster, so it seems
like the operating system isn't preventing the writes from happening.
As Strahil said, you are probably hitting the minimum free space that Gluster reserves on each brick. You can try the options he mentioned. However, I don't recommend keeping bricks above 90% utilization: all filesystems, including XFS, tend to degrade as free space runs out, and if the brick's filesystem performs worse, Gluster performance drops with it.
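If I remember correctly, the option in question is storage.reserve, which defaults to 1% of the brick. On an 11T brick that is roughly 110G, which is more than the 97G you currently have free, and that would explain the ENOSPC even though the filesystem itself still accepts writes. Something along these lines should let you inspect and, if you really must, lower the reserve (the volume name gluster-01 is taken from your log line; please double-check it):

```shell
# Show the current reserve percentage for the volume (default: 1).
gluster volume get gluster-01 storage.reserve

# Lowering the reserve buys a little headroom; 0 disables it entirely.
# Not recommended as anything other than a stopgap on a nearly full brick.
gluster volume set gluster-01 storage.reserve 0
```

Treat this only as a temporary measure; the real fix is to free or add space.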
During my investigation, I noticed that one of the .glusterfs paths on the
problem server is using much more space than it does on the other servers.
I can't quite figure out why that might be, or how it happened, so I'm
wondering if there's any advice on what the cause might have been.
I had recently applied package updates on the affected server, but not on the
other servers. These included a new kernel, but not the Gluster packages, so
either the updates or the reboot to load the new kernel may have caused the
problem. I have scripts on my Gluster machines that cleanly stop all of the
brick processes before rebooting, so I'm not leaning towards an abrupt
shutdown being the cause, but it's a possibility.
I'm also looking for advice on how to safely remove the problem files and
have them rebuilt from the other Gluster peers. I've seen some documentation
on this, but I'm a little nervous about corrupting the volume if I
misunderstand the process. I'm not free to take the volume or cluster down
for maintenance at this point, but that may be something I'll have to
consider if it's my only option.
For reference, here's the comparison of the same path that seems to be
taking up extra space on one of the hosts:
1: 26G /data/gluster-01/brick1/vol/.glusterfs/99/56
2: 26G /data/gluster-01/brick1/vol/.glusterfs/99/56
3: 26G /data/gluster-01/brick1/vol/.glusterfs/99/56
4: 26G /data/gluster-01/brick1/vol/.glusterfs/99/56
5: 26G /data/gluster-01/brick1/vol/.glusterfs/99/56
6: 3.0T /data/gluster-01/brick1/vol/.glusterfs/99/56
This is not normal at all. In a dispersed volume, all bricks should use roughly the same amount of space.
Can you provide the output of the following commands:
# gluster volume info <volname>
# gluster volume status <volname>
Also provide the output of this command from all bricks:
# ls -ls /data/gluster-01/brick1/vol/.glusterfs/99/56
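It may also help to know that the entries under .glusterfs are hard links named by gfid, so any oversized gfid entry can be mapped back to its real path on the brick with `find -inum`, without touching the volume. Here is a small self-contained demo of the idea (it runs in a temp directory, not against your cluster; the file names are made up for illustration):

```shell
# Demo: .glusterfs stores each regular file as a gfid-named hard link,
# so the inode number resolves the gfid entry back to the named path.
tmp=$(mktemp -d)
mkdir -p "$tmp/data/dir" "$tmp/.glusterfs/99/56"
echo payload > "$tmp/data/dir/bigfile"
ln "$tmp/data/dir/bigfile" "$tmp/.glusterfs/99/56/9956feed"  # stand-in gfid link

# Take the inode of the gfid entry, then find its sibling outside .glusterfs.
inum=$(stat -c '%i' "$tmp/.glusterfs/99/56/9956feed")
find "$tmp" -inum "$inum" -not -path '*/.glusterfs/*'
rm -rf "$tmp"
```

On a real brick you would point stat at one of the large files under .glusterfs/99/56 and run the find against the brick root. Note that a gfid entry with a link count of 1 in the `ls -ls` output has no named sibling, which is itself a useful clue.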
Regards,
Xavi
Any and all advice is appreciated.
Thanks!
--
Daniel Thomson
DevOps Engineer
t +1 604 222 7428
dthomson@xxxxxxxxx
TRIUMF Canada's particle accelerator centre
www.triumf.ca @TRIUMFLab
4004 Wesbrook Mall
Vancouver BC V6T 2A3 Canada
________
Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users