On Tue, 19 Aug 2008 16:03:45 +0900 Hisashi Hifumi <hifumi.hisashi@xxxxxxxxxxxxx> wrote:

>
> At 21:59 08/08/13, Chris Mason wrote:
> >On Wed, 2008-08-13 at 12:16 +0200, Jan Kara wrote:
> >
> >> > With that said, I don't have strong feelings against falling back to
> >> > buffered IO when the invalidate fails. Maybe Zach remembers something I
> >> > don't?
> >> I don't have a strong opinion either. Falling back to buffered writes is
> >> simpler at least for ext3/ext4 because properly synchronizing against
> >> writepage() call does not seem to have a nice solution either in
> >> do_launder_page() or in releasepage(). OTOH it hides the fact the invalidate
> >> is failing and so if we screw up something in future and it fails often, it
> >> might be hard to notice / track down the performance penalty.
> >
> >In general, these races don't happen often, and when they do it is
> >because someone is mixing page cache and O_DIRECT io to the same file.
> >That is explicitly outside the main use case of O_DIRECT.
> >
> >So, I'd rather see us slow down O_DIRECT in the mixed use case than have
> >big impacts in complexity or speed to other parts of the kernel. If
> >falling back avoids problems in some filesystems or avoids clearing the
> >uptodate bit unexpectedly, I'd much rather take the fallback patch.
> >
> >-chris
>
> Hi Andrew.
> I think we don't have strong feelings against falling back to buffered writes to
> fix the direct-io -EIO problem.
>
> Please review my patch.
>

umm, what problem does it solve?

>
> diff -Nrup linux-2.6.27-rc3.org/mm/filemap.c linux-2.6.27-rc3/mm/filemap.c
> --- linux-2.6.27-rc3.org/mm/filemap.c 2008-08-13 13:48:47.000000000 +0900
> +++ linux-2.6.27-rc3/mm/filemap.c 2008-08-19 15:45:31.000000000 +0900
> @@ -2129,13 +2129,20 @@ generic_file_direct_write(struct kiocb *
>  	 * After a write we want buffered reads to be sure to go to disk to get
>  	 * the new data. We invalidate clean cached page from the region we're
>  	 * about to write. We do this *before* the write so that we can return
> -	 * -EIO without clobbering -EIOCBQUEUED from ->direct_IO().
> +	 * without clobbering -EIOCBQUEUED from ->direct_IO().
>  	 */
>  	if (mapping->nrpages) {
>  		written = invalidate_inode_pages2_range(mapping,
>  					pos >> PAGE_CACHE_SHIFT, end);
> -		if (written)
> +		/*
> +		 * If a page can not be invalidated, return 0 to fall back
> +		 * to buffered write.
> +		 */
> +		if (written) {
> +			if (written == -EBUSY)
> +				return 0;
>  			goto out;
> +		}
>  	}
>
>  	written = mapping->a_ops->direct_IO(WRITE, iocb, iov, pos, *nr_segs);
> diff -Nrup linux-2.6.27-rc3.org/mm/truncate.c linux-2.6.27-rc3/mm/truncate.c
> --- linux-2.6.27-rc3.org/mm/truncate.c 2008-08-13 13:48:48.000000000 +0900
> +++ linux-2.6.27-rc3/mm/truncate.c 2008-08-19 12:10:46.000000000 +0900
> @@ -380,7 +380,7 @@ static int do_launder_page(struct addres
>   * Any pages which are found to be mapped into pagetables are unmapped prior to
>   * invalidation.
>   *
> - * Returns -EIO if any pages could not be invalidated.
> + * Returns -EBUSY if any pages could not be invalidated.
>   */
>  int invalidate_inode_pages2_range(struct address_space *mapping,
>  				  pgoff_t start, pgoff_t end)
> @@ -440,7 +440,7 @@ int invalidate_inode_pages2_range(struct
>  			ret2 = do_launder_page(mapping, page);
>  			if (ret2 == 0) {
>  				if (!invalidate_complete_page2(mapping, page))
> -					ret2 = -EIO;
> +					ret2 = -EBUSY;
>  			}
>  			if (ret2 < 0)
>  				ret = ret2;

If I recall correctly, we had a problem with pages which are pinned by
an ext3 transaction, and those pages weren't releasable for direct-io,
and this caused some problem?

I think falling back to buffered writes is always a safe course, but
it'd be nice to have a full description of the change, please.
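
For readers following the control flow in the patch, here is a small user-space sketch of the idea, not kernel code. The names sketch_page, sketch_invalidate_range() and sketch_direct_write() are hypothetical stand-ins invented for illustration, and the "pinned" flag is only an assumed model of a page that cannot be invalidated (for example, the ext3 transaction case mentioned above). It assumes, as the patch comment states, that a return value of 0 makes the caller fall back to a buffered write.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical stand-in for one cached page in the write range. */
struct sketch_page {
	bool pinned;	/* assumed: page cannot be dropped right now */
};

/*
 * Models the patched invalidate_inode_pages2_range(): report -EBUSY
 * (rather than -EIO) if any page in the range could not be invalidated.
 */
int sketch_invalidate_range(const struct sketch_page *pages, size_t n)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (pages[i].pinned)
			ret = -EBUSY;
	return ret;
}

/*
 * Models the patched start of generic_file_direct_write().
 * Returns the number of bytes "written" directly, 0 to ask the caller
 * to fall back to a buffered write, or a negative errno.
 */
ssize_t sketch_direct_write(const struct sketch_page *pages, size_t n,
			    size_t len)
{
	int err;

	assert(pages != NULL || n == 0);

	err = sketch_invalidate_range(pages, n);
	if (err == -EBUSY)
		return 0;	/* nothing written directly; fall back */
	if (err)
		return err;	/* any other failure is still an error */

	return (ssize_t)len;	/* pretend ->direct_IO() wrote everything */
}
```

With two clean pages, sketch_direct_write() reports the full length; if either page is marked pinned it reports 0, which in this model is the signal to retry through the buffered path instead of failing the write with -EIO.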