On Thu, Jun 9, 2022 at 11:56 AM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
>
>
> On 6/9/22 11:29 AM, Yan, Zheng wrote:
> > On Thu, Jun 9, 2022 at 11:19 AM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
> >>
> >> On 6/9/22 10:15 AM, Yan, Zheng wrote:
> >>> The recent series of patches that add "wait on async xxxx" at various
> >>> places do not seem correct. The correct fix should make mds avoid any
> >>> wait when handling async requests.
> >>>
> >> In this case I am thinking what will happen if the async create request
> >> is deferred, then the cap flush related request should fail to find the
> >> ino.
> >>
> >> Should we wait ? Then how to distinguish from migrating a subtree and a
> >> deferred async create cases ?
> >>
> > async op caps are revoked at freezingtree stage of subtree migration.
> > see Locker::invalidate_lock_caches
> >
> Sorry I may not totally understand this issue.
>
> I think you mean in case of migration and then the MDS will revoke caps
> for the async create files and then the kclient will send a MclientCap
> request to mds, right ?
>
> If my understanding is correct, there is another case that:
>
> 1, async create a fileA
>
> 2, then write a lot of data to it and then release the Fw cap ref, and
> if we should report the size to MDS, it will send a MclientCap request
> to MDS too.
>
> 3, what if the async create of fileA was deferred due to some reason,
> then the MclientCap request will fail to find the ino ?
>

Async op should not be deferred in any case.

>
> >>> On Wed, Jun 8, 2022 at 12:56 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> >>>> Currently, we'll call ceph_check_caps, but if we're still waiting on the
> >>>> reply, we'll end up spinning around on the same inode in
> >>>> flush_dirty_session_caps. Wait for the async create reply before
> >>>> flushing caps.
> >>>>
> >>>> Fixes: fbed7045f552 (ceph: wait for async create reply before sending any cap messages)
> >>>> URL: https://tracker.ceph.com/issues/55823
> >>>> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> >>>> ---
> >>>>  fs/ceph/caps.c | 1 +
> >>>>  1 file changed, 1 insertion(+)
> >>>>
> >>>> I don't know if this will fix the tx queue stalls completely, but I
> >>>> haven't seen one with this patch in place. I think it makes sense on its
> >>>> own, either way.
> >>>>
> >>>> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> >>>> index 0a48bf829671..5ecfff4b37c9 100644
> >>>> --- a/fs/ceph/caps.c
> >>>> +++ b/fs/ceph/caps.c
> >>>> @@ -4389,6 +4389,7 @@ static void flush_dirty_session_caps(struct ceph_mds_session *s)
> >>>>                 ihold(inode);
> >>>>                 dout("flush_dirty_caps %llx.%llx\n", ceph_vinop(inode));
> >>>>                 spin_unlock(&mdsc->cap_dirty_lock);
> >>>> +               ceph_wait_on_async_create(inode);
> >>>>                 ceph_check_caps(ci, CHECK_CAPS_FLUSH, NULL);
> >>>>                 iput(inode);
> >>>>                 spin_lock(&mdsc->cap_dirty_lock);
> >>>> --
> >>>> 2.36.1
> >>>>
>
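
For readers following the thread, the spinning that the patch description
mentions can be illustrated with a small userspace model. This is only a
sketch under stated assumptions, not the kernel code: struct toy_inode, the
tick counter, and every toy_* function are hypothetical stand-ins invented
here, and the only behaviour taken from the thread is that a cap flush cannot
make progress while the async create reply is still outstanding.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-inode state; not struct ceph_inode_info. */
struct toy_inode {
    bool async_create_pending; /* create was sent, reply not yet received */
    bool dirty_caps;           /* caps still waiting to be flushed */
    int ticks_until_reply;     /* simulated time until the reply arrives */
};

/* Advance simulated time by one tick; the reply lands when the count runs out. */
static void toy_tick(struct toy_inode *in)
{
    if (in->async_create_pending && --in->ticks_until_reply <= 0)
        in->async_create_pending = false;
}

/* Loosely models waiting for the async create reply; returns ticks waited. */
static int toy_wait_on_async_create(struct toy_inode *in)
{
    int waited = 0;

    while (in->async_create_pending) {
        toy_tick(in);
        waited++;
    }
    return waited;
}

/* Loosely models a cap check: no cap message may go out before the reply. */
static bool toy_check_caps(struct toy_inode *in)
{
    if (in->async_create_pending)
        return false; /* nothing sent, inode stays dirty */
    in->dirty_caps = false;
    return true;
}

/*
 * Flush loop over a single dirty inode. Returns the number of cap checks
 * that made no progress, i.e. how often the loop spun on the same inode.
 */
static int toy_flush_dirty_caps(struct toy_inode *in, bool wait_first)
{
    int wasted = 0;

    assert(in);
    while (in->dirty_caps) {
        if (wait_first)
            toy_wait_on_async_create(in);
        if (!toy_check_caps(in)) {
            wasted++;
            toy_tick(in); /* time passes while the loop spins */
        }
    }
    return wasted;
}
```

Calling toy_flush_dirty_caps() with wait_first set to false on an inode whose
reply is still a few ticks away counts one wasted cap check per tick, while
wait_first set to true reaches the cap check only after the reply has arrived.
The model says nothing about the objection raised in the thread, namely
whether the MDS should ever defer an async request in the first place.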