Hello,

On Fri, Feb 13, 2015 at 10:20:44AM +0800, yy wrote:
> Dave,
>
> Thank you very much for your explanation.
>
> I hit this issue when running MySQL on XFS. Direct IO is very important
> for MySQL on XFS, but I can't find any document explaining this
> problem. Maybe this will cause great confusion for other MySQL users
> too, so maybe this problem should be explained in the XFS documentation.

I don't think this belongs in the XFS documentation specifically, but
rather in filesystem documentation in general. Since XFS follows the
POSIX requirements here, its behavior is not "out of standard"; quite
the opposite. So I agree that this should be documented, just not in
XFS itself.

> Best regards,
> yy
> Original Message
> From: Dave Chinner<david@xxxxxxxxxxxxx>
> To: yy<yy@xxxxxxxxxxx>
> Cc: xfs<xfs@xxxxxxxxxxx>; Eric Sandeen<sandeen@xxxxxxxxxxx>;
> bfoster<bfoster@xxxxxxxxxx>
> Sent: Friday, February 13, 2015, 05:04
> Subject: Re: XFS buffer IO performance is very poor
> On Thu, Feb 12, 2015 at 02:59:52PM +0800, yy wrote:
> > In function xfs_file_aio_read, the XFS_IOLOCK_SHARED lock is
> > requested for both direct IO and buffered IO:
> >
> > so write will prevent read in XFS.
> >
> > However, function generic_file_aio_read for ext3 does not
> > lock inode->i_mutex, so write will not prevent read in ext3.
> >
> > I think this may be the reason for the poor performance of XFS. I do
> > not know if this is a bug or a design flaw of XFS.
>
> This is a bug and design flaw in ext3, and most other Linux
> filesystems. POSIX states that write() must execute atomically and
> so no concurrent operation that reads or modifies data should
> see a partial write. The Linux page cache doesn't enforce this - a
> read to the same range as a write can return partially written data
> on page granularity, as read/write only serialise on page locks in
> the page cache.
>
> XFS is the only Linux filesystem that actually follows POSIX
> requirements here - the shared/exclusive locking guarantees that a
> buffer write completes wholly before a read is allowed to access the
> data. There is a down side - you can't run concurrent buffered reads
> and writes to the same file - if you need to do that then that's
> what direct IO is for, and coherency between overlapping reads and
> writes is then the application's problem, not the filesystem...
>
> Maybe at some point in the future we might address this with ranged
> IO locks, but there really aren't many multithreaded programs that
> hit this issue...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs

--
Carlos
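
For readers following the thread, the shared/exclusive locking described
above can be modelled in userspace. What follows is only a toy sketch in
Python, not XFS code: SharedExclusiveLock, count_torn_reads, PAGE and
NPAGES are made-up names for illustration, and the bytearray merely
stands in for a file's cached pages. The writer fills the "file" one
page at a time while holding the lock exclusively; readers copy the
whole range while holding it shared, so they should never observe a
partially written buffer.

```python
import threading

PAGE = 4096   # assumed page size, for illustration only
NPAGES = 8    # the toy "file" spans eight pages


class SharedExclusiveLock:
    """Toy lock: any number of shared holders, or one exclusive holder."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_shared(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


def count_torn_reads(lock, writes=100, reads=100, readers=2):
    """Run one writer against several readers; count mixed snapshots."""
    data = bytearray(PAGE * NPAGES)
    torn = []

    def writer():
        for i in range(writes):
            lock.acquire_exclusive()
            try:
                fill = bytes([i % 256]) * PAGE
                # Page-by-page copy: without the lock, a reader could
                # run between two of these assignments.
                for p in range(NPAGES):
                    data[p * PAGE:(p + 1) * PAGE] = fill
            finally:
                lock.release_exclusive()

    def reader():
        for _ in range(reads):
            lock.acquire_shared()
            try:
                snapshot = bytes(data)
            finally:
                lock.release_shared()
            # More than one distinct byte value means a partial write
            # was visible to this read.
            if len(set(snapshot)) > 1:
                torn.append(snapshot)

    threads = [threading.Thread(target=writer)]
    threads += [threading.Thread(target=reader) for _ in range(readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(torn)


if __name__ == "__main__":
    print("torn reads:", count_torn_reads(SharedExclusiveLock()))
```

The cost is the one mentioned in the thread: while the writer holds the
lock exclusively, every reader of that file waits, even one that is
interested in a different range.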