Re: cephfs kernel client blocks when removing large files

On Mon, Oct 22, 2018 at 9:46 AM Dylan McCulloch <dmc@xxxxxxxxxxxxxx> wrote:
>
> On Mon, Oct 8, 2018 at 2:57 PM Dylan McCulloch <dmc@xxxxxxxxxxxxxx> wrote:
> >>
> >> Hi all,
> >>
> >>
> >> We have identified some unexpected blocking behaviour by the ceph-fs kernel client.
> >>
> >>
> >> When performing 'rm' on large files (100+GB), there appears to be a significant delay of 10 seconds or more before a 'stat' operation can be performed on the same directory on the filesystem.
> >>
> >>
> >> Looking at the kernel client's mds inflight-ops, we observe that there are pending UNLINK operations corresponding to the deleted files.
> >>
> >>
> >> We have noted some correlation between files being in the client page cache and the blocking behaviour. For example, if the cache is dropped or the filesystem is remounted, the blocking does not occur.
> >>
> >>
> >> Test scenario below:
> >>
> >>
> >> /mnt/cephfs_mountpoint type ceph (rw,relatime,name=ceph_filesystem,secret=<hidden>,noshare,acl,wsize=16777216,rasize=268439552,caps_wanted_delay_min=1,caps_wanted_delay_max=1)
> >>
> >>
> >> Test1:
> >>
> >> 1) unmount & remount:
> >>
> >>
> >> 2) Add 10 x 100GB files to a directory:
> >>
> >>
> >> for i in {1..10}; do dd if=/dev/zero of=/mnt/cephfs_mountpoint/file$i.txt count=102400 bs=1048576; done
> >>
> >>
> >> 3) Delete all files in directory:
> >>
> >>
> >> for i in {1..10};do rm -f /mnt/cephfs_mountpoint/file$i.txt; done
> >>
> >>
> >> 4) Immediately perform ls on directory:
> >>
> >>
> >> time ls /mnt/cephfs_mountpoint/test1
> >>
> >>
> >> Result: delay ~16 seconds
> >>
> >> real    0m16.818s
> >>
> >> user    0m0.000s
> >>
> >> sys     0m0.002s
> >>
> >>
>
> > Are the cephfs metadata pool and data pool on the same set of OSDs? Is it
> > possible that heavy data IO slowed down metadata IO?
>
> Test results are from a new pre-production cluster that does not have any significant data IO. We've also confirmed the same behaviour on another cluster with similar configuration. Both clusters have separate device-class/crush rule for metadata pool using NVME OSDs, while the data pool uses HDD OSDs.
> Most metadata operations are unaffected. It appears that it is only metadata operations on files that exist in client page cache prior to rm that are delayed.
>

OK. Please enable kernel debugging while running 'ls' and send us the kernel log.

echo module ceph +p > /sys/kernel/debug/dynamic_debug/control;
time ls /mnt/cephfs_mountpoint/test1
echo module ceph -p > /sys/kernel/debug/dynamic_debug/control;
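
The debug steps above could also be wrapped in a small helper so that debug output is switched off again even if the timed command fails. This is only a sketch: the function name with_ceph_debug and the CONTROL variable are made up for illustration, it assumes debugfs is mounted and the shell is running as root, and it reports whole seconds only.

```shell
#!/bin/sh
# Illustrative sketch, not part of the original thread.
# CONTROL is a hypothetical override; by default it points at the
# dynamic debug control file (requires root and a mounted debugfs).
CONTROL="${CONTROL:-/sys/kernel/debug/dynamic_debug/control}"

# with_ceph_debug CMD [ARGS...]
# Enable debug output for the ceph kernel module, run CMD, disable the
# debug output again, report elapsed whole seconds on stderr, and
# return CMD's exit status.
with_ceph_debug() {
    echo 'module ceph +p' > "$CONTROL" || return 1
    start=$(date +%s)
    status=0
    "$@" || status=$?
    end=$(date +%s)
    echo 'module ceph -p' > "$CONTROL"
    echo "elapsed: $((end - start))s (exit status $status)" >&2
    return "$status"
}
```

For example, "with_ceph_debug ls /mnt/cephfs_mountpoint/test1" followed by "dmesg > ceph-debug.log" should leave the relevant kernel messages in ceph-debug.log (the log file name is arbitrary).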

Yan, Zheng

> >>
> >> Test2:
> >>
> >>
> >> 1) unmount & remount
> >>
> >>
> >> 2) Add 10 x 100GB files to a directory
> >>
> >> for i in {1..10}; do dd if=/dev/zero of=/mnt/cephfs_mountpoint/file$i.txt count=102400 bs=1048576; done
> >>
> >>
> >> 3) Either a) unmount & remount; or b) drop caches
> >>
> >>
> >> echo 3 >/proc/sys/vm/drop_caches
> >>
> >>
> >> 4) Delete files in directory:
> >>
> >>
> >> for i in {1..10};do rm -f /mnt/cephfs_mountpoint/file$i.txt; done
> >>
> >>
> >> 5) Immediately perform ls on directory:
> >>
> >>
> >> time ls /mnt/cephfs_mountpoint/test1
> >>
> >>
> >> Result: no delay
> >>
> >> real    0m0.010s
> >>
> >> user    0m0.000s
> >>
> >> sys     0m0.001s
> >>
> >>
> >> Our understanding of ceph-fs’ file deletion mechanism is that there should be no blocking observed on the client: http://docs.ceph.com/docs/mimic/dev/delayed-delete/
> >>
> >> It appears that if files are cached on the client, either by being created or accessed recently, the kernel client will block for reasons we have not identified.
> >>
> >>
> >> Is this a known issue? Are there any ways to mitigate this behaviour?
> >>
> >> Our production system relies on our client’s processes having concurrent access to the file system, and access contention must be avoided.
> >>
> >>
> >> An old mailing list post that discusses changes to the client’s page cache behaviour may be relevant.
> >>
> >> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-October/005692.html
> >>
> >>
> >> Client System:
> >>
> >>
> >> OS: RHEL7
> >>
> >> Kernel: 4.15.15-1
> >>
> >>
> >> Cluster: Ceph: Luminous 12.2.8
> >>
> >>
>
>
>
> >> Thanks,
> >>
> >> Dylan
> >>
> >>
> >> _______________________________________________
> >> ceph-users mailing list
> >> ceph-users@xxxxxxxxxxxxxx
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>