Re: [PATCH] ext4: fix ext4_evict_inode() racing against workqueue processing code

On Wed, Mar 20, 2013 at 09:14:42AM -0500, Eric Sandeen wrote:
> 
> As an aside, is there any reason to have "dioread_nolock" as an option
> at this point?  If it works now, would you ever *not* want it?
> 
> (granted it doesn't work with some journaling options etc, but that
> behavior could be automatic, w/o the need for special mount options).

The primary restriction is that dioread_nolock doesn't work when the fs
block size != page size.  If your proposal is that we automatically
enable dioread_nolock when we can use it safely, that's definitely
something to consider for the next merge window.
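
To make the block size restriction concrete, here is a rough userspace
sketch of the check in Python.  The function names are made up for
illustration, and it only covers the block size condition, not the
journaling restrictions mentioned above.  It assumes that f_bsize from
statvfs reports the filesystem block size, which may not hold for every
filesystem.

```python
import os


def block_size_matches_page_size(fs_block_size: int, page_size: int) -> bool:
    """True when the block size restriction described above is satisfied."""
    return fs_block_size == page_size


def check_mount(path: str) -> bool:
    # Assumption: f_bsize is the filesystem block size for this mount.
    fs_block_size = os.statvfs(path).f_bsize
    # The page size of the running system, in bytes.
    page_size = os.sysconf("SC_PAGE_SIZE")
    return block_size_matches_page_size(fs_block_size, page_size)
```

Calling check_mount() on a mount point gives a quick yes/no for that
one condition; it says nothing about whether the option is otherwise
safe to enable.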

My long-range plan/hope is that we will eventually be able to use the
extent status tree so that when we do allocating writes, we first (a)
allocate the blocks, and mark them as in use as far as the mballoc
data structures are concerned, but we do _not_ mark them as in use in
the on-disk allocation bitmaps, then (b) we write the data blocks, and
then, triggered by the block I/O completion, (c) in a single journal
transaction, we update the allocation bitmaps, update the inode's
extent tree, and update the inode's i_size field.
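
The ordering of steps (a) through (c) can be sketched as a toy model in
Python.  This is only an illustration of the sequencing: the class and
every field name are hypothetical stand-ins, not ext4 symbols, and the
"journal" here is just a list.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFs:
    """Toy model only; none of these names are real ext4 symbols."""
    in_memory_reserved: set = field(default_factory=set)  # stand-in for mballoc state
    on_disk_bitmap: set = field(default_factory=set)      # stand-in for on-disk bitmaps
    extent_tree: dict = field(default_factory=dict)       # logical block -> physical block
    i_size: int = 0
    data: dict = field(default_factory=dict)              # physical block -> bytes
    journal: list = field(default_factory=list)           # committed transactions

    def allocate(self, physical_blocks):
        # (a) reserve the blocks in memory only; no on-disk state changes
        self.in_memory_reserved.update(physical_blocks)

    def write_data(self, physical_block, payload):
        # (b) write the data; the block is not yet reachable from the inode
        assert physical_block in self.in_memory_reserved
        self.data[physical_block] = payload

    def io_complete(self, mapping, new_size):
        # (c) one transaction covering bitmaps, extent tree, and i_size
        txn = {
            "bitmap": set(mapping.values()),
            "extents": dict(mapping),
            "i_size": new_size,
        }
        self.on_disk_bitmap |= txn["bitmap"]
        self.extent_tree.update(txn["extents"])
        self.i_size = max(self.i_size, new_size)
        self.journal.append(txn)
```

In this model, after allocate() and write_data() the extent tree, the
on-disk bitmap, and the journal are all still empty; everything becomes
visible together when io_complete() runs.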

This is different from the dioread_nolock approach in that we're not
initially inserting the blocks in the extent tree as uninitialized,
and then converting the extent tree entries from uninit to init after
the I/O completion.
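
For comparison, the uninit-then-convert sequence used by the
dioread_nolock approach looks roughly like this as a toy Python model.
Again, the names are invented for illustration and the transaction
counter is a simplification: it only records that the extent tree is
changed once before the data write and once after it.

```python
class UninitExtentModel:
    """Toy model of insert-as-uninit, then convert on I/O completion."""

    def __init__(self):
        self.extents = {}        # logical block -> (physical block, initialized?)
        self.data = {}           # physical block -> bytes
        self.transactions = 0    # simplified count of extent tree updates

    def start_write(self, logical, physical):
        # The extent goes into the tree before the data is written,
        # flagged as uninitialized.
        self.extents[logical] = (physical, False)
        self.transactions += 1

    def write_data(self, physical, payload):
        self.data[physical] = payload

    def io_complete(self, logical):
        # After the I/O completes, flip the entry from uninit to init.
        physical, _ = self.extents[logical]
        self.extents[logical] = (physical, True)
        self.transactions += 1

    def read(self, logical, length):
        entry = self.extents.get(logical)
        if entry is None or not entry[1]:
            # Unmapped or uninitialized extents read back as zeros.
            return bytes(length)
        physical, _ = entry
        return self.data.get(physical, bytes(length))[:length]
```

A read that lands between write_data() and io_complete() sees zeros
rather than stale or partial data, at the cost of touching the extent
tree twice per allocating write in this model.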

If we get to this long-term nirvana, then (1) we can eliminate the
data=writeback vs data=ordered distinction, since we'll have the safety
benefits of data=ordered while still having the performance
characteristics of data=writeback, and (2) we can eliminate
dioread_nolock, since this approach should also obviate needing to take
the read lock on the direct I/O read path.  I also think this approach
in the long term will be simpler and faster, since we don't have to
modify the extent tree, and start a journal transaction, before we
write the data blocks.

					- Ted