@@ -1376,7 +1376,8 @@ static void pnfs_list_write_lseg(struct inode *inode, struct list_head *listp)
list_for_each_entry(lseg,&NFS_I(inode)->layout->plh_segs, pls_list) {
if (lseg->pls_range.iomode == IOMODE_RW&&
- test_bit(NFS_LSEG_LAYOUTCOMMIT,&lseg->pls_flags))
+ test_bit(NFS_LSEG_LAYOUTCOMMIT,&lseg->pls_flags)&&
+ list_empty(&lseg->pls_lc_list))
list_add(&lseg->pls_lc_list, listp);
}
}
If the lseg is already part of one layoutcommit, but we're sending a
second one for the same range (presumably because we wrote more data in
the same region), then the above causes the lseg to be excluded.
Yes, lseg is excluded, This patch does exactly only exclusion of lseg.
lseg is used here only to get/put reference on this lseg, so skipping is
correct.
However, checking on list_empty can occur (on other CPU) in the middle:
list_del_init(&lseg->pls_lc_list);
Here >>>>>>
if (test_and_clear_bit(NFS_LSEG_LAYOUTCOMMIT,
&lseg->pls_flags))
put_lseg(lseg);
So list_del_init must be executed under the same lock as
pnfs_list_write_lseg, i.e. inode->i_lock.
I agree that the current code causes list corruption, but before
applying something like the above patch, I'd like to understand why it
is correct.
Trond
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html