From: Dave Chinner <dchinner@xxxxxxxxxx> When running a "create millions inodes in a directory" test recently, I noticed we were spending a huge amount of time converting freespace block headers from disk format to in-memory format: 31.47% [kernel] [k] xfs_dir2_node_addname 17.86% [kernel] [k] xfs_dir3_free_hdr_from_disk 3.55% [kernel] [k] xfs_dir3_free_bests_p We shouldn't be hitting the best free block scanning code so hard when doing sequential directory creates, and it turns out there's a highly suboptimal loop searching the the best free array in the freespace block - it decodes the block header before checking each entry inside a loop, instead of decoding the header once before running the entry search loop. This makes a massive difference to create rates. Profile now looks like this: 13.15% [kernel] [k] xfs_dir2_node_addname 3.52% [kernel] [k] xfs_dir3_leaf_check_int 3.11% [kernel] [k] xfs_log_commit_cil And the wall time/average file create rate differences are just as stark: create time(sec) / rate (files/s) File count vanilla patched 10k 0.54 / 18.5k 0.53 / 18.9k 20k 1.10 / 18.1k 1.05 / 19.0k 100k 4.21 / 23.8k 3.91 / 25.6k 200k 9.66 / 20,7k 7.37 / 27.1k 1M 86.61 / 11.5k 48.26 / 20.7k 2M 206.13 / 9.7k 129.71 / 15.4k The larger the directory, the bigger the performance improvement. Signed-Off-By: Dave Chinner <dchinner@xxxxxxxxxx> --- fs/xfs/libxfs/xfs_dir2_node.c | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c index 682e2bf370c7..bcf0d43cd6a8 100644 --- a/fs/xfs/libxfs/xfs_dir2_node.c +++ b/fs/xfs/libxfs/xfs_dir2_node.c @@ -1829,24 +1829,24 @@ xfs_dir2_node_addname_int( */ bests = dp->d_ops->free_bests_p(free); dp->d_ops->free_hdr_from_disk(&freehdr, free); - if (be16_to_cpu(bests[findex]) != NULLDATAOFF && - be16_to_cpu(bests[findex]) >= length) - dbno = freehdr.firstdb + findex; - else { - /* - * Are we done with the freeblock? - */ - if (++findex == freehdr.nvalid) { - /* - * Drop the block. - */ - xfs_trans_brelse(tp, fbp); - fbp = NULL; - if (fblk && fblk->bp) - fblk->bp = NULL; + do { + + if (be16_to_cpu(bests[findex]) != NULLDATAOFF && + be16_to_cpu(bests[findex]) >= length) { + dbno = freehdr.firstdb + findex; + break; } + } while (++findex < freehdr.nvalid); + + /* Drop the block if we done with the freeblock */ + if (findex == freehdr.nvalid) { + xfs_trans_brelse(tp, fbp); + fbp = NULL; + if (fblk) + fblk->bp = NULL; } } + /* * If we don't have a data block, we need to allocate one and make * the freespace entries refer to it. -- 2.15.0 -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html