On Sat, 2022-01-29 at 11:14 +0800, Yan, Zheng wrote: > On Sat, Jan 29, 2022 at 2:32 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > > > I've noticed an intermittent hang waiting for caps in some testing. What > > I see is that the client will try to get caps for an operation (e.g. a > > read), and ends up waiting on the waitqueue forever. The caps debugfs > > file however shows that the caps it's waiting on have already been > > granted. > > > > The current grant handling code will wake the waitqueue when it sees > > that there are newly-granted caps in the issued set. On an import > > however, we'll end up adding a new cap first, which fools the logic into > > thinking that nothing has changed. A later hack in the code works around > > this, but only for auth caps. > > not right. handle_cap_import() saves old issued to extra_info->issued. > It does save the old issued value to extra_info->issued, but handle_cap grant consults cap->issued for most of its logic. It does check against extra_info->issued for auth caps to determine whether to wake waiters, but doesn't do that for non-auth caps. This patch corrects that. If you think this patch isn't right, can you elaborate on why and what would make it correct? Thanks, Jeff > > > > Ensure we wake the waiters whenever we get an IMPORT that grants new > > caps for the inode. > > > > URL: https://tracker.ceph.com/issues/54044 > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx> > > --- > > fs/ceph/caps.c | 23 ++++++++++++----------- > > 1 file changed, 12 insertions(+), 11 deletions(-) > > > > I'm still testing this patch, but I think this may be the cause of some > > mysterious hangs I've hit in testing. > > > > diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c > > index e668cdb9c99e..06b65a68e920 100644 > > --- a/fs/ceph/caps.c > > +++ b/fs/ceph/caps.c > > @@ -3541,21 +3541,22 @@ static void handle_cap_grant(struct inode *inode, > > fill_inline = true; > > } > > > > - if (ci->i_auth_cap == cap && > > - le32_to_cpu(grant->op) == CEPH_CAP_OP_IMPORT) { > > + if (le32_to_cpu(grant->op) == CEPH_CAP_OP_IMPORT) { > > if (newcaps & ~extra_info->issued) > > wake = true; > > > > - if (ci->i_requested_max_size > max_size || > > - !(le32_to_cpu(grant->wanted) & CEPH_CAP_ANY_FILE_WR)) { > > - /* re-request max_size if necessary */ > > - ci->i_requested_max_size = 0; > > - wake = true; > > - } > > + if (ci->i_auth_cap == cap) { > > + if (ci->i_requested_max_size > max_size || > > + !(le32_to_cpu(grant->wanted) & CEPH_CAP_ANY_FILE_WR)) { > > + /* re-request max_size if necessary */ > > + ci->i_requested_max_size = 0; > > + wake = true; > > + } > > > > - ceph_kick_flushing_inode_caps(session, ci); > > - spin_unlock(&ci->i_ceph_lock); > > - up_read(&session->s_mdsc->snap_rwsem); > > + ceph_kick_flushing_inode_caps(session, ci); > > + spin_unlock(&ci->i_ceph_lock); > > + up_read(&session->s_mdsc->snap_rwsem); > > + } > > } else { > > spin_unlock(&ci->i_ceph_lock); > > } > > -- > > 2.34.1 > > -- Jeff Layton <jlayton@xxxxxxxxxx>