Re: PROBLEM: CephFS write performance drops by 90%

Hi Roose,

I think this is similar to https://tracker.ceph.com/issues/57898, although that one was reported against 5.15.

A few days ago, right after Ilya rebased to 6.1, the xfstests runs were much faster, even though nothing in the ceph code had changed.

I just checked the differences between 5.4 and 5.4.45 and couldn't tell what exactly happened. So please share your test case for this.

- Xiubo

On 15/12/2022 23:32, Ilya Dryomov wrote:
On Thu, Dec 15, 2022 at 3:22 PM Roose, Marco <marco.roose@xxxxxxxxxxxxx> wrote:
> Dear Ilya,
> I'm using Ubuntu and a CephFS mount. Write performance dropped by more than 90% after changing the kernel's main version (<10 MB/s vs. 100-140 MB/s). The problem seems to exist in kernel major versions starting at v5.4. Ubuntu mainline version v5.4.25 is fine; v5.4.45 (the next one available) is "bad".

Hi Marco,

What is the workload?

> After a git bisect with the "original" 5.4 kernels I get the following result:

Can you describe how you performed bisection?  Can you share the
reproducer you used for bisection?

> ed24820d1b0cbe8154c04189a44e363230ed647e is the first bad commit
> commit ed24820d1b0cbe8154c04189a44e363230ed647e
> Author: Ilya Dryomov <idryomov@xxxxxxxxx>
> Date:   Mon Mar 9 12:03:14 2020 +0100
>
>     ceph: check POOL_FLAG_FULL/NEARFULL in addition to OSDMAP_FULL/NEARFULL
>
>     commit 7614209736fbc4927584d4387faade4f31444fce upstream.
>
>     CEPH_OSDMAP_FULL/NEARFULL aren't set since mimic, so we need to consult
>     per-pool flags as well.  Unfortunately the backwards compatibility here
>     is lacking:
>
>     - the change that deprecated OSDMAP_FULL/NEARFULL went into mimic, but
>       was guarded by require_osd_release >= RELEASE_LUMINOUS
>     - it was subsequently backported to luminous in v12.2.2, but that makes
>       no difference to clients that only check OSDMAP_FULL/NEARFULL because
>       require_osd_release is not client-facing -- it is for OSDs
>
>     Since all kernels are affected, the best we can do here is just start
>     checking both map flags and pool flags and send that to stable.
>
>     These checks are best effort, so take osdc->lock and look up pool flags
>     just once.  Remove the FIXME, since filesystem quotas are checked above
>     and RADOS quotas are reflected in POOL_FLAG_FULL: when the pool reaches
>     its quota, both POOL_FLAG_FULL and POOL_FLAG_FULL_QUOTA are set.

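
For readers who want the shape of the check described in the commit message, here is a minimal userspace sketch, not the kernel code. All `sk_`/`SK_` names and the bit values are hypothetical and chosen only for illustration, and returning -ENOSPC on "full" is an assumption of this sketch. In the kernel, the two flag words would be read once under osdc->lock (taken for read) and then tested after the lock is dropped; here the snapshot is simply passed in.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; not copied from kernel headers. */
#define SK_OSDMAP_NEARFULL    (1u << 0)
#define SK_OSDMAP_FULL        (1u << 1)
#define SK_POOL_FLAG_FULL     (1ull << 1)
#define SK_POOL_FLAG_NEARFULL (1ull << 11)

/* Snapshot of the flags, as if copied out while holding the lock for read. */
struct sk_flags {
    uint32_t map_flags;   /* cluster-wide OSD map flags */
    uint64_t pool_flags;  /* flags of the pool backing the file */
};

/*
 * Returns -ENOSPC if either the map or the pool reports "full".
 * Otherwise returns 0 and reports through *nearfull whether either
 * source says "nearfull".
 */
static int sk_check_full(struct sk_flags f, bool *nearfull)
{
    if ((f.map_flags & SK_OSDMAP_FULL) || (f.pool_flags & SK_POOL_FLAG_FULL))
        return -ENOSPC;

    *nearfull = (f.map_flags & SK_OSDMAP_NEARFULL) ||
                (f.pool_flags & SK_POOL_FLAG_NEARFULL);
    return 0;
}
```

The point of checking both words is that a cluster which no longer sets the map-level flags is still caught through the pool-level ones.
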
The only suspicious thing I see in this commit is the osdc->lock semaphore,
which is taken for read for a short period of time in ceph_write_iter().
It's possible that this started interfering with other code paths that
take the semaphore for write, and that the read-write lock fairness
algorithm is biting...
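
To make the fairness concern concrete, here is a deliberately simplified, single-threaded model of a reader-writer lock. It is not the kernel's rwsem implementation. Every name in it (`fair_rw_model`, `model_try_read`, and so on) is hypothetical. It only illustrates the general effect: under a fair policy, once a writer is queued behind an active reader, a new reader is refused even though the lock is currently held only for read.

```c
#include <stdbool.h>

/* Toy state of a reader-writer lock; no real blocking or threads involved. */
struct fair_rw_model {
    int  active_readers;
    bool active_writer;
    int  waiting_writers;
};

/*
 * A reader gets in if no writer holds the lock. With fair == true it is
 * also refused while any writer is queued, so writers cannot be starved.
 */
static bool model_try_read(struct fair_rw_model *l, bool fair)
{
    if (l->active_writer)
        return false;
    if (fair && l->waiting_writers > 0)
        return false;
    l->active_readers++;
    return true;
}

/* A writer needs exclusive access; if it cannot get it, it joins the queue. */
static bool model_write_or_queue(struct fair_rw_model *l)
{
    if (l->active_writer || l->active_readers > 0) {
        l->waiting_writers++;
        return false;
    }
    l->active_writer = true;
    return true;
}

/* Drop one read hold. */
static void model_read_unlock(struct fair_rw_model *l)
{
    l->active_readers--;
}
```

In this model, if one reader holds the lock and a writer queues up, a second `model_try_read(&l, true)` fails, while `model_try_read(&l, false)` would succeed. If something similar happens on the write path, even a very short read-side hold could end up waiting behind writers.
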

Can you confirm the result by manually checking out the previous commit
and verifying that it's "good"?

     commit 44960e1c39d807cd0023dc7036ee37f105617ebe
     RDMA/mad: Do not crash if the rdma device does not have a umad interface
         (commit 5bdfa854013ce4193de0d097931fd841382c76a7 upstream)

Thanks,

                 Ilya




