OK, who wrote this code? He did not stop to think for even a second. And surely it was never tested, since it is 100% repeatable. Smack yourself on the head! Simple: layout is in use, a recall comes in. _DELAY is returned meanwhile, a flag is set and no new IOs are allowed. The last IO dereference the layout, there are no more layouts. The server persists, sends a second recall, boom below code crash. * While at it fix an if();else {} style violation * The added BUG_ON is important because the address taken from a member of a null pointer is not null, and the crash is not at all clean. It no longer crashes my trunc_test, Now I need to verify the recall is actually honoured. Signed-off-by: Boaz Harrosh <Boaz Harrosh bharrosh@xxxxxxxxxxx> --- fs/nfs/callback_proc.c | 9 +++++++-- 1 files changed, 7 insertions(+), 2 deletions(-) diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c index 12ab7b3..34aee16 100644 --- a/fs/nfs/callback_proc.c +++ b/fs/nfs/callback_proc.c @@ -252,9 +252,9 @@ static int initiate_layout_draining(struct pnfs_cb_lrecall_info *cb_info) if (nfs_compare_fh(&args->cbl_fh, &NFS_I(lo->plh_inode)->fh)) continue; - if (test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags)) + if (test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags)) { rv = NFS4ERR_DELAY; - else { + } else { /* FIXME I need to better understand igrab and * does having a layout ref keep ino around? * It should. @@ -270,6 +270,11 @@ static int initiate_layout_draining(struct pnfs_cb_lrecall_info *cb_info) } spin_unlock(&clp->cl_lock); + if (rv == NFS4ERR_NOMATCHING_LAYOUT) + /* Nothing found */ + return rv; + + BUG_ON(!lo->plh_inode); spin_lock(&lo->plh_inode->i_lock); if (rv == NFS4_OK) { lo->plh_block_lgets++; -- 1.7.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html