Re: [PATCH] Revert "ceph: enable async dirops by default"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 10/4/23 21:15, Venky Shankar wrote:
Hi Ilya,

After some digging and talking to Jeff, I figured that it's possible
to disable async dirops from the mds side by setting
`mds_client_delegate_inos_pct` config to 0:

         - name: mds_client_delegate_inos_pct
           type: uint
           level: advanced
           desc: percentage of preallocated inos to delegate to client
           default: 50
           services:
           - mds

So, I guess this patch is really not required. We can suggest this
config update to users and document it for now. We lack tests with
this config disabled, so I'll be adding the same before recommending
it out. Will keep you posted.

Venky, Jeff

This option could disable the async create operation but could it also disable the async unlink operation ?

Thanks

- Xiubo


On Wed, Oct 4, 2023 at 10:16 AM Venky Shankar <vshankar@xxxxxxxxxx> wrote:
Hi Ilya,

On Tue, Oct 3, 2023 at 5:00 PM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
On Tue, Oct 3, 2023 at 1:06 PM Venky Shankar <vshankar@xxxxxxxxxx> wrote:
From: Xiubo Li <xiubli@xxxxxxxxxx>

This reverts commit f7a67b463fb83a4b9b11ceaa8ec4950b8fb7f902.

We have identified an issue in the MDS affecting CephFS users using
the kernel driver. The issue was first introduced in the octopus
release that added support for clients to perform asynchronous
directory operations using the `nowsync` mount option. The issue
presents itself as an MDS crash resembling (any of) the following
crashes:

         https://tracker.ceph.com/issues/61009
         https://tracker.ceph.com/issues/58489

There is no apparent data loss or corruption, but since the underlying
cause is related to an (operation) ordering issue, the extent of the
problem could surface in other forms - most likely MDS crashes
involving preallocated inodes.

The fix is being reviewed and is being worked on priority:

         https://github.com/ceph/ceph/pull/53752

As a workaround, we recommend (kernel) clients be remounted with the
`wsync` mount option which disables asynchronous directory operations
(depending on the kernel version being used, the default could be
`nowsync`).

This change reverts the default, so, async dirops is disabled (by default).
Hi Venky,

Given that the fix is now up and being reviewed on priority, does it
still make sense to change the default?

According to Xiubo, https://tracker.ceph.com/issues/58489 which morphed
into https://tracker.ceph.com/issues/61009 isn't the only concern -- he
also brought up https://tracker.ceph.com/issues/62810.  If the move to
revert (change of default) is also prompted by that issue, it should be
described in the patch.
Fair enough -- I'll push out with an updated commit message.

Thanks,

                 Ilya


--
Cheers,
Venky






[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux