Re: PROBLEM: CephFS write performance drops by 90%

On Thu, Dec 15, 2022 at 3:22 PM Roose, Marco <marco.roose@xxxxxxxxxxxxx> wrote:
>
> Dear Ilya,
> I'm using Ubuntu and a CephFS mount. Write performance dropped by more than 90% after I changed the kernel version (<10 MB/s vs. 100-140 MB/s). The problem seems to exist in kernel versions starting within the v5.4 series: Ubuntu mainline v5.4.25 is fine, v5.4.45 (which is the next one available) is "bad".

Hi Marco,

What is the workload?

>
> After a git bisect with the "original" 5.4 kernels I get the following result:

Can you describe how you performed bisection?  Can you share the
reproducer you used for bisection?
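
If there is no scripted reproducer yet, something along these lines might
serve as a starting point.  It is only a sketch: the file size, block size
and the 50 MB/s good/bad threshold are assumptions picked to sit between
the <10 MB/s and 100-140 MB/s figures quoted above, and the helper names
are made up for illustration.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, total_mb=256, block_kb=1024):
    """Write total_mb of zeros to a temp file in `directory` and return MB/s.

    The file is fsync'ed before the clock stops, so the page cache
    does not hide a slow write path.
    """
    block = b"\0" * (block_kb * 1024)
    n_blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory, prefix="bisect-write-")
    try:
        start = time.monotonic()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
        # Guard against a zero interval on coarse clocks.
        elapsed = max(time.monotonic() - start, 1e-9)
    finally:
        os.unlink(path)
    written_mb = n_blocks * block_kb / 1024
    return written_mb / elapsed


def verdict(mb_s, threshold_mb_s=50.0):
    """Return 0 ("good") if the rate meets the assumed threshold, else 1 ("bad")."""
    return 0 if mb_s >= threshold_mb_s else 1
```

After booting each candidate kernel, one would call
write_throughput_mb_s() on the CephFS mount point (for example a
hypothetical /mnt/cephfs) a few times and mark the commit good or bad
according to verdict().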

>
> ed24820d1b0cbe8154c04189a44e363230ed647e is the first bad commit
> commit ed24820d1b0cbe8154c04189a44e363230ed647e
> Author: Ilya Dryomov <idryomov@xxxxxxxxx>
> Date:   Mon Mar 9 12:03:14 2020 +0100
>
>     ceph: check POOL_FLAG_FULL/NEARFULL in addition to OSDMAP_FULL/NEARFULL
>
>     commit 7614209736fbc4927584d4387faade4f31444fce upstream.
>
>     CEPH_OSDMAP_FULL/NEARFULL aren't set since mimic, so we need to consult
>     per-pool flags as well.  Unfortunately the backwards compatibility here
>     is lacking:
>
>     - the change that deprecated OSDMAP_FULL/NEARFULL went into mimic, but
>       was guarded by require_osd_release >= RELEASE_LUMINOUS
>     - it was subsequently backported to luminous in v12.2.2, but that makes
>       no difference to clients that only check OSDMAP_FULL/NEARFULL because
>       require_osd_release is not client-facing -- it is for OSDs
>
>     Since all kernels are affected, the best we can do here is just start
>     checking both map flags and pool flags and send that to stable.
>
>     These checks are best effort, so take osdc->lock and look up pool flags
>     just once.  Remove the FIXME, since filesystem quotas are checked above
>     and RADOS quotas are reflected in POOL_FLAG_FULL: when the pool reaches
>     its quota, both POOL_FLAG_FULL and POOL_FLAG_FULL_QUOTA are set.

The only suspicious thing I see in this commit is the osdc->lock
semaphore, which is now taken for read for a short period of time in
ceph_write_iter().  It's possible that this started interfering with
other code paths that take the semaphore for write, and that the
read-write lock fairness algorithm is biting...
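
To illustrate the kind of interaction meant here, below is a toy
user-space model.  It is not the kernel's rw_semaphore implementation,
and the class and function names are made up; it only assumes a
writer-preferring policy, where a reader arriving after a queued writer
has to wait behind it even though another reader still holds the lock.

```python
import threading
import time


class WriterPreferringRWLock:
    """Toy read-write lock: new readers wait while any writer is queued."""

    def __init__(self):
        self._cond = threading.Condition()
        self.active_readers = 0
        self.writer_active = False
        self.waiting_writers = 0
        self.waiting_readers = 0

    def acquire_read(self):
        with self._cond:
            self.waiting_readers += 1
            # Fairness rule: a queued writer blocks newly arriving readers.
            while self.writer_active or self.waiting_writers > 0:
                self._cond.wait()
            self.waiting_readers -= 1
            self.active_readers += 1

    def release_read(self):
        with self._cond:
            self.active_readers -= 1
            self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            self.waiting_writers += 1
            while self.writer_active or self.active_readers > 0:
                self._cond.wait()
            self.waiting_writers -= 1
            self.writer_active = True

    def release_write(self):
        with self._cond:
            self.writer_active = False
            self._cond.notify_all()


def demo():
    """Return the order in which the three parties got the lock."""
    lock = WriterPreferringRWLock()
    order = []

    def writer():
        lock.acquire_write()
        order.append("writer")
        lock.release_write()

    def short_reader():
        lock.acquire_read()
        order.append("short reader")
        lock.release_read()

    lock.acquire_read()              # a long-running reader holds the lock
    order.append("long reader")
    w = threading.Thread(target=writer)
    w.start()
    while lock.waiting_writers == 0:  # wait until the writer is queued
        time.sleep(0.001)
    r = threading.Thread(target=short_reader)
    r.start()
    while lock.waiting_readers == 0:  # the short reader is now stuck too
        time.sleep(0.001)
    lock.release_read()
    w.join()
    r.join()
    return order
```

In this model the short reader, which stands in for the brief read-side
acquisition in the write path, cannot proceed until the long reader has
finished and the writer has had its turn.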

Can you confirm the result by manually checking out the previous commit
and verifying that it's "good"?

    commit 44960e1c39d807cd0023dc7036ee37f105617ebe
    RDMA/mad: Do not crash if the rdma device does not have a umad interface
        (commit 5bdfa854013ce4193de0d097931fd841382c76a7 upstream)

Thanks,

                Ilya



