On Thu, Feb 08, 2007 at 09:33:05AM +0530, Suparna Bhattacharya wrote:
> On Wed, Feb 07, 2007 at 01:05:44PM -0500, Chris Mason wrote:
> > On Wed, Feb 07, 2007 at 10:38:45PM +0530, Suparna Bhattacharya wrote:
> > > > + * The flags parameter is a bitmask of:
> > > > + *
> > > > + * DIO_PLACEHOLDERS (use placeholder pages for locking)
> > > > + * DIO_CREATE (pass create=1 to get_block for filling holes or extending)
> > >
> > > A little more explanation about why these options are needed, and examples
> > > of when one would specify each of these options would be good.
> >
> > I'll extend the comments in the patch, but for discussion here:
> >
> > DIO_PLACEHOLDERS: placeholders are inserted into the page cache to
> > synchronize the DIO with buffered writes.  From a locking point of view,
> > this is similar to inserting and locking pages in the address space
> > corresponding to the DIO.
> >
> > placeholders guard against concurrent allocations and truncates during
> > the DIO.  You don't need placeholders if truncates and allocations are
> > impossible (for example, on a block device).
>
> Likewise placeholders may not be needed if the underlying filesystem
> already takes care of locking to synchronize DIO vs buffered.

True, although I don't think any FS covers 100% of the cases right now.

> >
> > DIO_CREATE: placeholders make it possible for filesystems to safely fill
> > holes and extend the file via get_block during the DIO.  If DIO_CREATE
> > is turned on, get_block will be called with create=1, allowing the FS to
> > allocate blocks during the DIO.
>
> When would one NOT specify DIO_CREATE, and what are the implications?
> The purpose of having an option of NOT allowing the FS to allocate blocks
> during DIO is not very intuitive from the standpoint of the caller.
> (the block device case could be an example, but then create=1 could not do
> any harm or add extra overhead, so why bother?)

DIO has fallen back to buffered IO for so long that I wanted filesystems
to explicitly choose create=1 for now.  A good example is my patch for
ext3, where the ext3 get_block routine needed to be changed to start a
transaction instead of finding the current transaction in
current->journal_info.  The reiserfs DIO get_block needed to be told not
to expect i_mutex to be held, etc.

> Is there still a valid case where we fallback to buffered IO to fill holes
> - to me that seems to be the only situation where create=0 must be enforced.

Right, when create=0 we fall back, otherwise we don't.

> >
> > DIO_DROP_I_MUTEX: If the write is inside of i_size, i_mutex is dropped
> > during the DIO and taken again before returning.
>
> Again an example of when one would not specify this (block device and
> XFS ?) would be useful.

You would leave it off if the FS can't fill a hole or extend the file
without i_mutex, or if the caller has already dropped i_mutex themselves.
I think this is only XFS right now; the long term goal is to make
placeholders fast enough for XFS to use.

-chris
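
For discussion, here is a rough userspace sketch of how the three flags
described above could combine for a single write.  It is not taken from
the patch: the flag names are the ones in this thread, but the numeric
values, struct dio_plan and plan_dio_write() are made up for
illustration, and the fallback rule is only my reading of "when create=0
we fall back, otherwise we don't".

```c
#include <stdbool.h>

/* Flag names are from the thread; the numeric values are assumed. */
#define DIO_PLACEHOLDERS  0x01  /* use placeholder pages for locking */
#define DIO_CREATE        0x02  /* pass create=1 to get_block */
#define DIO_DROP_I_MUTEX  0x04  /* drop i_mutex for writes inside i_size */

/* Hypothetical summary of the decisions made for one DIO write. */
struct dio_plan {
    bool use_placeholders;      /* insert placeholders in the page cache */
    int  create;                /* value handed to get_block */
    bool drop_i_mutex;          /* drop i_mutex during the DIO */
    bool fallback_to_buffered;  /* allocation needed but not allowed */
};

/*
 * hits_hole says whether the range covers an unallocated block.  A write
 * that ends past i_size is treated as extending the file.
 */
static struct dio_plan plan_dio_write(unsigned int flags, long long offset,
                                      long long len, long long i_size,
                                      bool hits_hole)
{
    struct dio_plan p;
    bool extends = offset + len > i_size;

    p.use_placeholders = (flags & DIO_PLACEHOLDERS) != 0;
    p.create = (flags & DIO_CREATE) ? 1 : 0;

    /* i_mutex is only dropped when the write stays inside i_size. */
    p.drop_i_mutex = (flags & DIO_DROP_I_MUTEX) != 0 && !extends;

    /* With create=0 the FS may not allocate, so fall back to buffered IO. */
    p.fallback_to_buffered = p.create == 0 && (hits_hole || extends);

    return p;
}
```

With all three flags set, a write over a hole inside i_size gets
create=1, drops i_mutex and does not fall back.  The same write with only
DIO_PLACEHOLDERS set keeps i_mutex and falls back to buffered IO.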