This is a follow-up to the patch series

  [PATCH v5 0/2] ext4: Improve parallel I/O performance on NVDIMM
  https://lkml.org/lkml/2016/4/29/583

It is rebased onto the latest 4.7-rc1 release. It has an additional patch
to take advantage of the fact that the inode i_mutex is now an i_rwsem.

Patch 1 changes the locking in dax_do_io() to take a shared lock instead
of an exclusive lock for reading. That allows parallel reads to happen.

Patch 2 converts some ext4 statistics counts into percpu counts to reduce
cacheline contention for parallel reads.

Patch 3 passes the DIO_SKIP_DIO_COUNT flag to dax_do_io(), as DAX I/Os
are synchronous and there is no need to update the DIO count as long as
either the lock is taken or the count has been updated in the caller.

Waiman Long (3):
  dax: Take shared lock in dax_do_io()
  ext4: Make cache hits/misses per-cpu counts
  ext4: Pass DIO_SKIP_DIO_COUNT to dax_do_io

 fs/dax.c                 |  9 +++++----
 fs/ext4/extents_status.c | 38 +++++++++++++++++++++++++++++---------
 fs/ext4/extents_status.h |  4 ++--
 fs/ext4/inode.c          | 24 ++++++++++++++++++------
 4 files changed, 54 insertions(+), 21 deletions(-)
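
For readers less familiar with the idea behind patch 2, the following is a
rough userspace sketch of why splitting a hot statistics counter helps
parallel readers. It is only an analogy and is not the code in the patch:
the names sharded_counter, sharded_inc(), sharded_sum(), NSLOTS and
CACHELINE are hypothetical, a 64-byte cache line is assumed, and C11
atomics stand in for the kernel's percpu facilities. Each updater touches
its own cache-line-aligned slot, so concurrent updates do not bounce a
single cache line between CPUs; the total is only computed when the
statistic is actually read.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace analogy; NOT the kernel percpu counter API. */
#define NSLOTS    8    /* assumed number of slots (think: one per CPU) */
#define CACHELINE 64   /* assumed cache line size in bytes */

struct slot {
	/* Alignment pads each slot out to its own cache line. */
	_Alignas(CACHELINE) atomic_ulong v;
};

struct sharded_counter {
	struct slot slots[NSLOTS];
};

/* Set every slot to zero. */
static void sharded_init(struct sharded_counter *c)
{
	for (size_t i = 0; i < NSLOTS; i++)
		atomic_init(&c->slots[i].v, 0);
}

/*
 * Fast path: bump only the caller's slot. slot_id would typically be
 * derived from the CPU or thread doing the update; it is reduced
 * modulo NSLOTS here so any value is safe.
 */
static void sharded_inc(struct sharded_counter *c, unsigned int slot_id)
{
	atomic_fetch_add_explicit(&c->slots[slot_id % NSLOTS].v, 1,
				  memory_order_relaxed);
}

/*
 * Slow path: add up all slots. With concurrent updaters the result is
 * an approximate snapshot, which is normally fine for statistics.
 */
static unsigned long sharded_sum(struct sharded_counter *c)
{
	unsigned long total = 0;

	for (size_t i = 0; i < NSLOTS; i++)
		total += atomic_load_explicit(&c->slots[i].v,
					      memory_order_relaxed);
	return total;
}
```

The trade-off is the usual one: updates become cheap and contention-free,
while reading the statistic costs a walk over all slots, which is
acceptable for counters that are updated often and read rarely.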