CephFS thrashing through the page cache



We have an internal use case where we back the storage of a proprietary
database with a shared file system. While testing the same workload against a
local block-device-backed file system and against CephFS, we noticed something
very odd: the amount of network I/O done by CephFS is almost double the I/O
done in the case of a local file system backed by an attached block device.
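
One quick way to quantify the extra traffic is to sample the client NIC's
byte counters around the run. A minimal sketch, assuming the CephFS traffic
goes over an interface named eth0 (adjust to your environment):

# Sketch only: sample received bytes on the client NIC around a test run.
# "eth0" is an assumption; substitute the interface carrying CephFS traffic.
IFACE=eth0
RX_BEFORE=$(cat /sys/class/net/${IFACE}/statistics/rx_bytes)
# ... run the read workload here ...
RX_AFTER=$(cat /sys/class/net/${IFACE}/statistics/rx_bytes)
echo "received $(( (RX_AFTER - RX_BEFORE) / 1024 / 1024 )) MiB over ${IFACE}"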

We also noticed that CephFS thrashes through the page cache very quickly
relative to the amount of data being read, and we think the two issues might
be related. So, I wrote a simple test.

1. I wrote 10k files of 400 KB each using dd (approx. 4 GB of data).
2. I dropped the page cache completely.
3. I then read these files serially, again using dd. The page cache usage
shot up to 39 GB for reading such a small amount of data.

Following is the code used to repro this in bash:

# Write 10k files of 400 KB each (~4 GB total)
for i in $(seq 1 10000); do
  dd if=/dev/zero of=test_${i} bs=4k count=100
done

# Flush dirty data and drop the page cache
sync; echo 1 > /proc/sys/vm/drop_caches

# Read the files back serially
for i in $(seq 1 10000); do
  dd if=test_${i} of=/dev/null bs=4k count=100
done
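
To put a number on the cache growth, one simple check is to sample the
"Cached" counter from /proc/meminfo before and after the read loop. A minimal
sketch of that kind of measurement:

# Report how much the page cache grows across the serial read pass
cached_kb() { awk '/^Cached:/ {print $2}' /proc/meminfo; }

BEFORE=$(cached_kb)
for i in $(seq 1 10000); do
  dd if=test_${i} of=/dev/null bs=4k count=100
done
AFTER=$(cached_kb)
echo "page cache grew by $(( (AFTER - BEFORE) / 1024 )) MiB"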

The ceph version being used is:
ceph version 15.2.13 (c44bc49e7a57a87d84dfff2a077a2058aa2172e2) octopus

The ceph configs being overridden:
WHO       MASK  LEVEL     OPTION                                 VALUE
  mon           advanced  auth_allow_insecure_global_id_reclaim  false
  mgr           advanced  mgr/balancer/mode                      upmap
  mgr           advanced  mgr/dashboard/server_addr
  mgr           advanced  mgr/dashboard/server_port              8443
  mgr           advanced  mgr/dashboard/ssl                      false
  mgr           advanced  mgr/prometheus/server_addr
  mgr           advanced  mgr/prometheus/server_port             9283
  osd           advanced  bluestore_compression_algorithm        lz4
  osd           advanced  bluestore_compression_mode             aggressive
  osd           advanced  bluestore_throttle_bytes               536870912
  osd           advanced  osd_max_backfills                      3
  osd           advanced  osd_op_num_threads_per_shard_ssd       8
  osd           advanced  osd_scrub_auto_repair                  true
  mds           advanced  client_oc                              false
  mds           advanced  client_readahead_max_bytes             4096
  mds           advanced  client_readahead_max_periods           1
  mds           advanced  client_readahead_min                   0
  mds           basic     mds_cache_memory_limit
  client        advanced  client_oc                              false
  client        advanced  client_readahead_max_bytes             4096
  client        advanced  client_readahead_max_periods           1
  client        advanced  client_readahead_min                   0
  client        advanced  fuse_disable_pagecache                 false
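
For reference, overrides like these are normally applied with "ceph config
set"; for example, the client readahead settings above would correspond to
commands of this form (shown only as a sketch of how we disabled readahead,
the exact invocations may differ):

# Disable the client object cacher and client-side readahead
ceph config set client client_oc false
ceph config set client client_readahead_max_bytes 4096
ceph config set client client_readahead_max_periods 1
ceph config set client client_readahead_min 0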

The cephfs mount options (note that readahead was disabled for this test):
/mnt/cephfs type ceph (rw,relatime,name=cephfs,secret=<hidden>,acl,rasize=0)
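
For anyone trying to reproduce this, a kernel-client mount along these lines
should yield the options shown above; the monitor address and secret file path
here are placeholders:

# Placeholder monitor address and secret path; rasize=0 disables readahead
mount -t ceph 192.168.1.10:6789:/ /mnt/cephfs \
  -o name=cephfs,secretfile=/etc/ceph/cephfs.secret,acl,rasize=0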

Any help or pointers are appreciated; this is a major performance issue for us.

Thanks and Regards,
Ashu Pachauri
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
