On Wed, Jul 01, 2015 at 11:26:32AM +1000, Dave Chinner wrote: > On Thu, Jun 25, 2015 at 04:39:30PM -0700, Darrick J. Wong wrote: > > When we're unmapping blocks from a file, we need to decrease refcounts > > in the btree and only free blocks if they refcount is 1. > > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > --- > > fs/xfs/libxfs/xfs_bmap.c | 5 + > > fs/xfs/libxfs/xfs_reflink_btree.c | 140 +++++++++++++++++++++++++++++++++++++ > > fs/xfs/libxfs/xfs_reflink_btree.h | 4 + > > 3 files changed, 147 insertions(+), 2 deletions(-) > > > > > > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c > > index 057fa9a..3f5e8da 100644 > > --- a/fs/xfs/libxfs/xfs_bmap.c > > +++ b/fs/xfs/libxfs/xfs_bmap.c > > @@ -45,6 +45,7 @@ > > #include "xfs_symlink.h" > > #include "xfs_attr_leaf.h" > > #include "xfs_filestream.h" > > +#include "xfs_reflink_btree.h" > > > > > > kmem_zone_t *xfs_bmap_free_item_zone; > > @@ -4984,8 +4985,8 @@ xfs_bmap_del_extent( > > * If we need to, add to list of extents to delete. > > */ > > if (do_fx) > > - xfs_bmap_add_free(mp, flist, del->br_startblock, > > - del->br_blockcount, ip->i_ino); > > + xfs_reflink_bmap_add_free(mp, flist, del->br_startblock, > > + del->br_blockcount, ip->i_ino, tp); > > I think this is the wrong abstraction. I think the code should look > like this: > > if (do_fx) { > if (xfs_sb_version_hasreflink(&mp->m_sb)) { > error = xfs_reflink_del_extent(mp, tp, flist, > del->br_startblock, > del->br_blockcount, ip->i_ino); > if (error) > goto done; > } else > xfs_bmap_add_free() > } > > Because what we are doing is deleting an extent from the reflink > btree, not adding a freed extent to the "to-be-freed" list. <nod> Not a great choice of name, I agree... > > > > diff --git a/fs/xfs/libxfs/xfs_reflink_btree.c b/fs/xfs/libxfs/xfs_reflink_btree.c > > index 380ed72..f40ba1f 100644 > > --- a/fs/xfs/libxfs/xfs_reflink_btree.c > > +++ b/fs/xfs/libxfs/xfs_reflink_btree.c > > Again, xfs_reflink.c > > > @@ -935,3 +935,143 @@ error0: > > xfs_btree_del_cursor(cur, XFS_BTREE_ERROR); > > return error; > > } > > + > > +/** > > + * xfs_reflink_bmap_add_free() - release a range of blocks > > + * > > + * @mp: XFS mount object > > + * @flist: List of blocks to be freed at the end of the transaction > > + * @fsbno: First fs block of the range to release > > + * @len: Length of range > > + * @owner: owner of the extent > > + * @tp: transaction that goes with the free operation > > + */ > > +int > > +xfs_reflink_bmap_add_free( > > + struct xfs_mount *mp, /* mount point structure */ > > + xfs_bmap_free_t *flist, /* list of extents */ > > + xfs_fsblock_t fsbno, /* fs block number of extent */ > > + xfs_filblks_t fslen, /* length of extent */ > > + uint64_t owner, /* extent owner */ > > + struct xfs_trans *tp) /* transaction */ > > +{ > > + struct xfs_btree_cur *cur; > > + int error; > > + struct xfs_buf *agbp; > > + xfs_agnumber_t agno; /* allocation group number */ > > + xfs_agblock_t agbno; /* ag start of range to free */ > > + xfs_agblock_t agbend; /* ag end of range to free */ > > + xfs_extlen_t aglen; /* ag length of range to free */ > > + int i, have; > > + xfs_agblock_t lbno; /* rlextent start */ > > + xfs_extlen_t llen; /* rlextent length */ > > + xfs_nlink_t lnr; /* rlextent refcount */ > > + xfs_agblock_t bno; /* rlext block # in loop */ > > + xfs_extlen_t len; /* rlext length in loop */ > > + unsigned long long blocks_freed; > > + xfs_fsblock_t range_fsb; > > + > > + if (!xfs_sb_version_hasreflink(&mp->m_sb)) { > > + xfs_bmap_add_free(mp, flist, fsbno, fslen, owner); > > + return 0; > > + } > > That canbe dropped. > > + > > + agno = XFS_FSB_TO_AGNO(mp, fsbno); > > + agbno = XFS_FSB_TO_AGBNO(mp, fsbno); > > + CHECK_AG_NUMBER(mp, agno); > > + ASSERT(fslen < mp->m_sb.sb_agblocks); > > + CHECK_AG_EXTENT(mp, agbno, fslen); > > These extent lengths have already been checked. If they are invalid, > then the extent deletion would have errored out with corruption > long before we get here. Ok. > > + aglen = fslen; > > + > > + /* > > + * Drop reference counts in the reflink tree. > > + */ > > + error = xfs_alloc_read_agf(mp, tp, agno, 0, &agbp); > > + if (error) > > + return error; > > + > > + /* > > + * Grab a rl btree cursor. > > + */ > > + cur = xfs_reflinkbt_init_cursor(mp, tp, agbp, agno); > > + bno = agbno; > > + len = aglen; > > + agbend = agbno + aglen - 1; > > + blocks_freed = 0; > > + > > + /* > > + * Account for a left extent that partially covers our range. > > + */ > > + error = xfs_reflink_lookup_le(cur, bno, &have); > > + if (error) > > + goto error0; > > + if (have) { > > + error = xfs_reflink_get_rec(cur, &lbno, &llen, &lnr, &i); > > + if (error) > > + goto error0; > > + XFS_WANT_CORRUPTED_RLEXT_GOTO(mp, i, lbno, llen, lnr, error0); > > + if (lbno + llen > bno) { > > + blocks_freed += min(len, lbno + llen - bno); > > + bno += blocks_freed; > > + len -= blocks_freed; > > + } > > + } > > So we unconditionally look up the reflink btree on extent free to > see if we need to free it, even if the inode has not been reflinked? > Doesn't this add a lot of overhead to the extent freeing? > > Indeed, why not just mark inodes that have been reflinked (i.e. have > shared extents) with an on-disk flag so that we know if we need to > do reflink btree work or not? That way the code fragment above could > just check an inode flag rather than always calling into this > function for reflink enabled filesystems.... Yep, the inode flag comes later, though I'm melding it into an earlier part of the patch... > > > + while (len > 0) { > > + /* > > + * Go find the next rlext. > > + */ > > + range_fsb = XFS_AGB_TO_FSB(mp, agno, bno); > > + error = xfs_btree_increment(cur, 0, &have); > > + if (error) > > + goto error0; > > + if (!have) { > > + /* > > + * There's no right rlextent, so free bno to the end. > > + */ > > + lbno = bno + len; > > + llen = 0; > > + } else { > > + /* > > + * Find the next rlextent. > > + */ > > + error = xfs_reflink_get_rec(cur, &lbno, &llen, > > + &lnr, &i); > > + if (error) > > + goto error0; > > + XFS_WANT_CORRUPTED_RLEXT_GOTO(mp, i, lbno, llen, lnr, > > + error0); > > + if (lbno >= bno + len) { > > + lbno = bno + len; > > + llen = 0; > > + } > > + } > > + > > + /* > > + * Free everything up to the start of the rlextent and > > + * account for still-mapped blocks. > > + */ > > + if (lbno - bno > 0) { > > + xfs_bmap_add_free(mp, flist, range_fsb, lbno - bno, > > + owner); > > + len -= lbno - bno; > > + bno += lbno - bno; > > + } > > + llen = min(llen, agbend + 1 - lbno); > > + blocks_freed += llen; > > + len -= llen; > > + bno += llen; > > + } > > + > > + xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR); > > + > > + error = xfs_reflinkbt_adjust_refcount(mp, tp, agbp, agno, agbno, aglen, > > + -1); > > Hmmm - we just walked the btree to determine what extents to > free, and now we are going to walk the btree again to drop the > reference counts on shared extents? So every extent that gets freed > does two walks of the reflink btree regardless of the whether it has > shared blocks or not? Yeah, it would be more efficient to bundle the xfs_bmap_add_free loop into adjust_refcount() so that we only have to make one pass. --D > > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx > > _______________________________________________ > xfs mailing list > xfs@xxxxxxxxxxx > http://oss.sgi.com/mailman/listinfo/xfs _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs