Hi Greg,

> Nowhere in your test procedure do you mention syncing or flushing the files to disk. That is almost certainly the cause of the slowness

We have tested performing sync after file creation and the delay still occurs (see Test3 results below).

To clarify, the delay is observed only when ls is performed on the same directory from which the files were removed, and only when the files have been recently cached, e.g.:

rm -f /mnt/cephfs_mountpoint/file*; ls /mnt/cephfs_mountpoint

> the client which wrote the data is required to flush it out before dropping enough file "capabilities" for the other client to do the rm.

Our tests are performed on the same host. In Test1 the rm and ls are performed by the same client id, and for the other tests, in which an unmount & remount were performed, I would assume the unmount would cause that particular client id to terminate and drop any caps. Do you still believe held caps are contributing to the slowness in these test scenarios?

We've added 3 additional test cases below:

Test 3) Sync write (delay observed when writing files and syncing)
Test 4) Bypass cache (no delay observed when files are not written to cache)
Test 5) Read test (delay observed when removing files that have recently been read into cache)

Test3: Sync write - file creation, with sync after write.
1) unmount & remount
2) Add 5 x 100GB files to a directory:
for i in {1..5}; do dd if=/dev/zero of=/mnt/cephfs_mountpoint/file$i.txt count=102400 bs=1048576; done
3) sync
4) Delete all files in directory:
for i in {1..5}; do rm -f /mnt/cephfs_mountpoint/file$i.txt; done
5) Immediately perform ls on directory:
time ls /mnt/cephfs_mountpoint
real    0m8.765s
user    0m0.001s
sys     0m0.000s

Test4: Bypass cache - file creation, with nocache options for dd.
1) unmount & remount
2) Add 5 x 100GB files to a directory:
for i in {1..5}; do dd if=/dev/zero of=/mnt/cephfs_mountpoint/file$i.txt count=102400 bs=1048576 oflag=nocache,sync iflag=nocache; done
3) sync
4) Delete all files in directory:
for i in {1..5}; do rm -f /mnt/cephfs_mountpoint/file$i.txt; done
5) Immediately perform ls on directory:
time ls /mnt/cephfs_mountpoint
real    0m0.003s
user    0m0.000s
sys     0m0.001s

Test5: Read test - read files into empty page cache before deletion.
1) unmount & remount
2) Add 5 x 100GB files to a directory:
for i in {1..5}; do dd if=/dev/zero of=/mnt/cephfs_mountpoint/file$i.txt count=102400 bs=1048576; done
3) sync
4) unmount & remount # empty cache
5) Read files (to add them back to cache):
for i in {1..5}; do cat /mnt/cephfs_mountpoint/file$i.txt > /dev/null; done
6) Delete all files in directory:
for i in {1..5}; do rm -f /mnt/cephfs_mountpoint/file$i.txt; done
7) Immediately perform ls on directory:
time ls /mnt/cephfs_mountpoint
real    0m8.723s
user    0m0.000s
sys     0m0.001s

Regards,

Dylan

From: Gregory Farnum <gfarnum@xxxxxxxxxx>
Sent: Wednesday, October 10, 2018 4:37:49 AM
To: Dylan McCulloch
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re: cephfs kernel client blocks when removing large files

Nowhere in your test procedure do you mention syncing or flushing the files to disk. That is almost certainly the cause of the slowness — the client which wrote the data is required to flush it out before dropping enough file "capabilities" for the other client to do the rm.
-Greg
On Sun, Oct 7, 2018 at 11:57 PM Dylan McCulloch <dmc@xxxxxxxxxxxxxx> wrote:
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
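[Editor's note: the read-then-remove sequence from Test5 above can be condensed into a single script. This is a sketch only: the MNT path is illustrative (a local temporary directory and 1MB files are used here so it runs anywhere; point MNT at a CephFS kernel-client mountpoint and raise count to the 100GB sizes used in the thread to reproduce the reported ls delay).]

```shell
#!/bin/bash
# Sketch of the Test5 sequence: write files, read them back so they are
# resident in the page cache, then remove them and immediately time ls
# on the same directory. MNT defaults to a throwaway tmpdir; override it
# with a CephFS mountpoint to reproduce the behaviour discussed above.
MNT="${MNT:-$(mktemp -d)}"

# 1) create test files (dd from /dev/zero, as in the thread; sizes reduced)
for i in 1 2 3 4 5; do
    dd if=/dev/zero of="$MNT/file$i.txt" bs=1048576 count=1 2>/dev/null
done
sync

# 2) read the files back so their pages are cached
for i in 1 2 3 4 5; do
    cat "$MNT/file$i.txt" > /dev/null
done

# 3) remove them and immediately time ls on the same directory
rm -f "$MNT"/file*.txt
time ls "$MNT"
```

On a local filesystem no delay is expected; the point of the script is the command sequence, which on the affected CephFS kernel client produced the multi-second ls times shown in Test5.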