Re: cephfs hangs on writes

On 4/26/22 2:06 AM, Vladimir Brik wrote:
> a), max_mds > 1 ?
No, but I had tried it in the past (i.e. set max_mds to 2, and then reverted back to 1)

> b), inline_data enabled ?
No

Okay, this is a different bug.


> c), how to reproduce it, could you provide the detailed steps ?
Sometimes, but not always, something like this will hang:
dd if=/dev/zero of=zero bs=100M count=1
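
A minimal sketch of how the dd reproduction above could be repeated with a hang check. The names TARGET_DIR, BS, ATTEMPTS and DEADLINE are illustrative and not from the original report; by default the script writes to a temporary directory, so TARGET_DIR would need to point at a directory on the cephfs mount. It polls the dd process rather than killing it, since a writer blocked inside the kernel may not respond to signals.

```shell
#!/usr/bin/env bash
# Hypothetical hang detector around the dd reproduction.
# TARGET_DIR, BS, ATTEMPTS and DEADLINE are illustrative names.

# Write one block of size $2 to file $1. If dd is still running after
# $3 seconds, report a suspected hang and return 1; otherwise return
# dd's own exit status.
try_write() {
    local file=$1 bs=$2 deadline=$3 waited=0 pid
    dd if=/dev/zero of="$file" bs="$bs" count=1 2>/dev/null &
    pid=$!
    # Poll instead of killing: a blocked writer may ignore signals.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$deadline" ]; then
            echo "suspected hang: $file (dd pid $pid still running after ${deadline}s)"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    wait "$pid"
}

TARGET_DIR=${TARGET_DIR:-$(mktemp -d)}   # set to a cephfs directory to test
BS=${BS:-100M}                           # same block size as in the report
ATTEMPTS=${ATTEMPTS:-3}
DEADLINE=${DEADLINE:-60}

for i in $(seq 1 "$ATTEMPTS"); do
    # Stop at the first failure and keep the file for inspection.
    try_write "$TARGET_DIR/zero.$i" "$BS" "$DEADLINE" || break
    rm -f "$TARGET_DIR/zero.$i"
done
```

Since the hang is intermittent, a larger ATTEMPTS value is probably needed before drawing any conclusion from a clean run.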

I am using the upstream code and have created thousands of files with the dd command, but couldn't reproduce it.

Could you try kernel-4.18.0-376.el8, which was recently synced with upstream? Maybe this bug exists only in older versions.

-- Xiubo


We use cephfs as the shared storage for our cluster, and another way to reproduce it is to start many jobs that execute something like
date > <path_to_some_dir>/$RANDOM
In this case there is no hanging, but all files in path_to_some_dir are empty.
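
A rough sketch of that second reproduction, assuming bash and find are available on the client. TARGET_DIR and JOBS are illustrative names, not part of the original report; the default target is a temporary directory, so it has to be pointed at a cephfs directory to be meaningful. The loop index is appended to $RANDOM only to avoid name collisions between jobs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: spawn many concurrent "date > file" writers,
# then count how many of the resulting files are empty.
# TARGET_DIR and JOBS are illustrative names.

# Start $2 background writers in directory $1 and wait for all of them.
spawn_writers() {
    local dir=$1 jobs=$2 i
    for ((i = 0; i < jobs; i++)); do
        # Same write as in the report; the index keeps names unique
        # even when two jobs draw the same $RANDOM value.
        ( date > "$dir/$RANDOM.$i" ) &
    done
    wait
}

# Print the number of zero-length regular files directly under $1.
count_empty() {
    find "$1" -maxdepth 1 -type f -empty | wc -l
}

TARGET_DIR=${TARGET_DIR:-$(mktemp -d)}   # set to a cephfs directory to test
JOBS=${JOBS:-200}

spawn_writers "$TARGET_DIR" "$JOBS"
echo "empty files: $(count_empty "$TARGET_DIR") of $JOBS"
```

On a healthy file system the empty count should be zero; a non-zero count would match the symptom described above.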

> d), could you enable the kernel debug log and set the
> debug_mds to 25 in MDSes and share the logs ?
As of this morning our OSDs began crashing cyclically with "heartbeat_map is_healthy ... had suicide timed out", so the logs will probably contain a lot of unrelated messages until we fix that issue. I'll let you know once that is fixed.


Vlad


On 4/24/22 23:40, Xiubo Li wrote:
Hi Vladimir,

This issue looks like the one I am working on now in [1], which is also a bug where the client gets stuck indefinitely when creating a new file and then writing to it.

The issue in [1] is triggered by setting max_mds > 1, enabling inline_data, and then creating a file and writing to it. It seems to be a deadlock between the MDS and the kernel client.


BTW, what's your setup for:

a), max_mds > 1 ?

b), inline_data enabled ?

c), how to reproduce it, could you provide the detailed steps ?

d), could you enable the kernel debug log and set the debug_mds to 25 in MDSes and share the logs ?
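
For reference, a hedged sketch of commands that could answer (a), (b) and (d). FS_NAME is a placeholder for the file system name, and the run wrapper with its DRY_RUN switch is purely illustrative: by default the script only prints the commands. The kernel client part assumes root access, a mounted debugfs and a kernel built with dynamic debug support.

```shell
#!/usr/bin/env bash
# Sketch of gathering the information asked for in (a), (b) and (d).
# FS_NAME is a placeholder; DRY_RUN=1 (the default) only prints commands.

FS_NAME=${FS_NAME:-cephfs}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

collect_info() {
    # (a), (b): max_mds and inline_data both appear in the file system map.
    run ceph fs get "$FS_NAME"
    # (d), MDS side: raise MDS debug logging to 25.
    run ceph config set mds debug_mds 25
    # (d), client side: enable dynamic debug output for the kernel
    # ceph and libceph modules (messages go to the kernel log).
    run sh -c "echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control"
    run sh -c "echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control"
}

collect_info
```

Both debug settings are very verbose, so they are best reverted once the logs have been captured.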


[1] https://tracker.ceph.com/issues/55377

Thanks

BRs

-- Xiubo



On 4/25/22 5:25 AM, Vladimir Brik wrote:
Hello

We are experiencing an issue where, sometimes, when users write to cephfs, an empty file is created and then the application hangs, seemingly indefinitely. I am sometimes able to reproduce it with dd.

Does anybody know what might be going on?

Some details:
- ceph health complains about 100+ slow metadata IOs
- CPU utilization of ceph-mds is low
- We have almost 200 kernel cephfs clients
- Cephfs metadata is stored on 3 OSDs that use NVMe flash AICs


Vlad
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



