On Mon, 18 July 2011 20:57:49 +1000, Dave Chinner wrote:
> On Mon, Jul 18, 2011 at 09:53:39AM +0200, Jörn Engel wrote:
> > On Mon, 18 July 2011 09:32:52 +1000, Dave Chinner wrote:
> > > On Sun, Jul 17, 2011 at 06:05:01PM +0200, Jörn Engel wrote:
> > > > xfs:
> > > > ====
> > > > seqrd   1       2       4       8       16      32      64      128
> > > > 16384   4698    4424    4397    4402    4394    4398    4642    4679
> > > > 8192    6234    5827    5797    5801    5795    6114    5793    5812
> > > > 4096    9100    8835    8882    8896    8874    8890    8910    8906
> > > > 2048    14922   14391   14259   14248   14264   14264   14269   14273
> > > > 1024    23853   22690   22329   22362   22338   22277   22240   22301
> > > > 512     37353   33990   33292   33332   33306   33296   33224   33271

seqrd   1       2       4       8       16      32      64      128
16384   4542    8311    15738   28955   38273   36644   38530   38527
8192    6000    10413   19208   33878   65927   76906   77083   77102
4096    8931    14971   24794   44223   83512   144867  147581  150702
2048    14375   23489   34364   56887   103053  192662  307167  309222
1024    21647   36022   49649   77163   132886  243296  421389  497581
512     31832   61257   79545   108782  176341  303836  517814  584741

Quite a nice improvement for such a small patch.  As they say, "every
small factor of 17 helps". ;)

What bothers me a bit is that the single-threaded numbers took such a
noticeable hit...

> Ok, the patch below takes the numbers on my test setup on a 16k IO
> size:
>
> seqrd     1       2       4       8       16
> vanilla   3603    2798    2563    not tested...
> patches   3707    5746    10304   12875   11016

...in particular when your numbers improve even for a single thread.
Wonder what's going on here.

Anyway, feel free to add a Tested-By: or something from me.  And maybe
fix the two typos below.

> xfs: don't serialise direct IO reads on page cache checks
>
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> There is no need to grab the i_mutex of the IO lock in exclusive
> mode if we don't need to invalidate the page cache. Taking hese
                                                             ^
> locks on every direct IO effective serialisaes them as taking the IO
                                     ^
> lock in exclusive mode has to wait for all shared holders to drop
> the lock. That only happens when IO is complete, so effective it
> prevents dispatch of concurrent direct IO reads to the same inode.
>
> Fix this by taking the IO lock shared to check the page cache state,
> and only then drop it and take the IO lock exclusively if there is
> work to be done. Hence for the normal direct IO case, no exclusive
> locking will occur.
>
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---
>  fs/xfs/linux-2.6/xfs_file.c |   17 ++++++++++++++---
>  1 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c
> index 1e641e6..16a4bf0 100644
> --- a/fs/xfs/linux-2.6/xfs_file.c
> +++ b/fs/xfs/linux-2.6/xfs_file.c
> @@ -321,7 +321,19 @@ xfs_file_aio_read(
>  	if (XFS_FORCED_SHUTDOWN(mp))
>  		return -EIO;
>
> -	if (unlikely(ioflags & IO_ISDIRECT)) {
> +	/*
> +	 * Locking is a bit tricky here. If we take an exclusive lock
> +	 * for direct IO, we effectively serialise all new concurrent
> +	 * read IO to this file and block it behind IO that is currently in
> +	 * progress because IO in progress holds the IO lock shared. We only
> +	 * need to hold the lock exclusive to blow away the page cache, so
> +	 * only take lock exclusively if the page cache needs invalidation.
> +	 * This allows the normal direct IO case of no page cache pages to
> +	 * proceeed concurrently without serialisation.
> +	 */
> +	xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
> +	if ((ioflags & IO_ISDIRECT) && inode->i_mapping->nrpages) {
> +		xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED);
>  		xfs_rw_ilock(ip, XFS_IOLOCK_EXCL);
>
>  		if (inode->i_mapping->nrpages) {
> @@ -334,8 +346,7 @@ xfs_file_aio_read(
>  			}
>  		}
>  		xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL);
> -	} else
> -		xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
> +	}
>
>  	trace_xfs_file_read(ip, size, iocb->ki_pos, ioflags);
>
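For anyone following the thread who doesn't live in fs/xfs: the trick
in the patch is the classic optimistic double-check.  Take the lock
shared, test whether the slow path (page cache invalidation) is needed
at all, and only upgrade to exclusive if it is, re-checking the
condition after the upgrade because another thread may have done the
work while we held no lock.  Below is a minimal userspace sketch of
the same idea with POSIX rwlocks, not XFS code; the names cache_dirty,
invalidate_cache() and do_direct_read() are made up for illustration,
and since pthreads cannot atomically demote a write lock to a read
lock, the sketch drops and re-takes it where the patch uses
xfs_rw_ilock_demote():

#include <pthread.h>
#include <stdbool.h>

static pthread_rwlock_t io_lock = PTHREAD_RWLOCK_INITIALIZER;
static bool cache_dirty = true;	/* stands in for mapping->nrpages != 0 */

static void invalidate_cache(void) { /* flush and toss the page cache */ }
static void do_direct_read(void)   { /* issue the direct IO           */ }

static void direct_read(void)
{
	pthread_rwlock_rdlock(&io_lock);
	if (cache_dirty) {
		/*
		 * Slow path: drop the shared lock and take it exclusive.
		 * We must re-check cache_dirty after re-acquiring, since
		 * another thread may have invalidated the cache in the
		 * window where we held no lock at all.
		 */
		pthread_rwlock_unlock(&io_lock);
		pthread_rwlock_wrlock(&io_lock);
		if (cache_dirty) {
			invalidate_cache();
			cache_dirty = false;
		}
		/* pthreads has no atomic demote; drop and re-take shared */
		pthread_rwlock_unlock(&io_lock);
		pthread_rwlock_rdlock(&io_lock);
	}
	do_direct_read();	/* many readers can be in here at once */
	pthread_rwlock_unlock(&io_lock);
}

static void *reader(void *arg)
{
	(void)arg;
	direct_read();
	return NULL;
}

int main(void)
{
	pthread_t t[4];
	int i;

	for (i = 0; i < 4; i++)
		pthread_create(&t[i], NULL, reader, NULL);
	for (i = 0; i < 4; i++)
		pthread_join(t[i], NULL);
	return 0;
}

The payoff is the same as in the patch: in the common case the cache
is clean, nobody ever takes the lock exclusive, and concurrent direct
reads only ever hold it shared.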
Jörn

-- 
Everything should be made as simple as possible, but not simpler.
-- Albert Einstein