Re: blocking process on FUSE mount in directory which is using quota

On Thu, Aug 9, 2018 at 6:47 PM, mabi <mabi@xxxxxxxxxxxxx> wrote:
Hi Nithya,

Thanks for the fast answer. Here the additional info:

1. gluster volume info

Volume Name: myvol-private
Type: Replicate
Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gfs1a:/data/myvol-private/brick
Brick2: gfs1b:/data/myvol-private/brick
Brick3: gfs1c:/srv/glusterfs/myvol-private/brick (arbiter)
Options Reconfigured:
features.default-soft-limit: 95%
transport.address-family: inet
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
nfs.disable: on
performance.readdir-ahead: on
client.event-threads: 4
server.event-threads: 4
auth.allow: 192.168.100.92



2. Sorry, I have no clue how to take a "statedump" of a process on Linux. Which command should I use for that? And which process do you need it for — the blocked process (for example "ls")?

Statedumps are Gluster-specific. Please refer to https://docs.gluster.org/en/v3/Troubleshooting/statedump/ for instructions.
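In short (a sketch assuming a default install, with the volume name taken from the info above): the FUSE client statedump is triggered by sending SIGUSR1 to the glusterfs client process, and the dump is written to the statedump directory, usually /var/run/gluster:

```shell
# Sketch only -- the pgrep pattern assumes the mount uses the volume above.
# 1. Find the PID of the FUSE client process for the mount:
pid=$(pgrep -f 'glusterfs.*myvol-private' | head -n 1)

# 2. Check where statedumps will be written (default: /var/run/gluster):
gluster --print-statedumpdir

# 3. Ask the client process to dump its state:
kill -USR1 "$pid"

# The dump appears as glusterdump.<pid>.dump.<timestamp> in that directory.
```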


Regards,
M.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On August 9, 2018 3:10 PM, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:

Hi,

Please provide the following:
  1. gluster volume info
  2. statedump of the fuse process when it hangs

Thanks,
Nithya

On 9 August 2018 at 18:24, mabi <mabi@xxxxxxxxxxxxx> wrote:
Hello,

I recently upgraded my GlusterFS replica 2+1 (arbiter) setup to version 3.12.12, and now I see weird behaviour on my client (using a FUSE mount): processes (PHP 5.6 FPM) that try to access a specific directory simply block. I can't kill the blocked processes either, not even with kill -9; I need to reboot the machine to get rid of them.

This directory has one particularity compared to the other directories: it has reached its quota soft-limit, as you can see in the output of gluster volume quota list:

                  Path                    Hard-limit   Soft-limit     Used   Available  Soft-limit exceeded?  Hard-limit exceeded?
-------------------------------------------------------------------------------------------------------------------------------
/directory                                 100.0GB    80%(80.0GB)   90.5GB      9.5GB          Yes                    No

That does not mean the quota is at fault, but it might be a hint where to start looking... And by the way, can someone explain to me what the soft-limit does? Or does it not do anything special?
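(For what it's worth, per the Gluster quota documentation the soft-limit is advisory only: crossing it just logs an alert message, rate-limited by the volume's alert-time; writes are refused only once the hard-limit is reached. A sketch of the related commands, using the directory above and an example alert-time value:)

```shell
# Advisory vs. enforced: the soft-limit only triggers log alerts,
# while the hard-limit actually blocks further writes.

# Set a 100GB hard-limit with an 80% soft-limit on the directory:
gluster volume quota myvol-private limit-usage /directory 100GB 80%

# How often (at most) a soft-limit alert is logged -- 1d is an example value:
gluster volume quota myvol-private alert-time 1d

# Inspect current usage against the limits:
gluster volume quota myvol-private list /directory
```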

Here is the Linux kernel stack of a process blocking on that directory; this one came from a simple "ls -la":

[Thu Aug  9 14:21:07 2018] INFO: task ls:2272 blocked for more than 120 seconds.
[Thu Aug  9 14:21:07 2018]       Not tainted 3.16.0-4-amd64 #1
[Thu Aug  9 14:21:07 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Aug  9 14:21:07 2018] ls              D ffff88017ef93200     0  2272   2268 0x00000004
[Thu Aug  9 14:21:07 2018]  ffff88017653f490 0000000000000286 0000000000013200 ffff880174d7bfd8
[Thu Aug  9 14:21:07 2018]  0000000000013200 ffff88017653f490 ffff8800eeb3d5f0 ffff8800fefac800
[Thu Aug  9 14:21:07 2018]  ffff880174d7bbe0 ffff8800eeb3d6d0 ffff8800fefac800 ffff8800ffe1e1c0
[Thu Aug  9 14:21:07 2018] Call Trace:
[Thu Aug  9 14:21:07 2018]  [<ffffffffa00dc50d>] ? __fuse_request_send+0xbd/0x270 [fuse]
[Thu Aug  9 14:21:07 2018]  [<ffffffff810abce0>] ? prepare_to_wait_event+0xf0/0xf0
[Thu Aug  9 14:21:07 2018]  [<ffffffffa00e0791>] ? fuse_dentry_revalidate+0x181/0x300 [fuse]
[Thu Aug  9 14:21:07 2018]  [<ffffffff811b944e>] ? lookup_fast+0x25e/0x2b0
[Thu Aug  9 14:21:07 2018]  [<ffffffff811bacc5>] ? path_lookupat+0x155/0x780
[Thu Aug  9 14:21:07 2018]  [<ffffffff81195715>] ? kmem_cache_alloc+0x75/0x480
[Thu Aug  9 14:21:07 2018]  [<ffffffffa00dfca9>] ? fuse_getxattr+0xe9/0x150 [fuse]
[Thu Aug  9 14:21:07 2018]  [<ffffffff811bb316>] ? filename_lookup+0x26/0xc0
[Thu Aug  9 14:21:07 2018]  [<ffffffff811bf594>] ? user_path_at_empty+0x54/0x90
[Thu Aug  9 14:21:07 2018]  [<ffffffff81193e08>] ? kmem_cache_free+0xd8/0x210
[Thu Aug  9 14:21:07 2018]  [<ffffffff811bf59f>] ? user_path_at_empty+0x5f/0x90
[Thu Aug  9 14:21:07 2018]  [<ffffffff811b3d46>] ? vfs_fstatat+0x46/0x90
[Thu Aug  9 14:21:07 2018]  [<ffffffff811b421d>] ? SYSC_newlstat+0x1d/0x40
[Thu Aug  9 14:21:07 2018]  [<ffffffff811d34b8>] ? SyS_lgetxattr+0x58/0x80
[Thu Aug  9 14:21:07 2018]  [<ffffffff81525d0d>] ? system_call_fast_compare_end+0x10/0x15


My 3 gluster nodes all run Debian 9 and my client runs Debian 8.

Let me know if you need more information.

Best regards,
Mabi
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
