Re: [PATCH] jbd jbd2: fix dio write returning EIO when try_to_release_page fails

On Wed 06-08-08 09:25:13, Chris Mason wrote:
> On Tue, 2008-08-05 at 14:17 -0700, Mingming Cao wrote:
> > On Tue, 2008-08-05 at 12:17 -0400, Chris Mason wrote:
> > > On Tue, 2008-08-05 at 13:51 +0900, Hisashi Hifumi wrote:
> > > > >> > 
> > > > >> > diff -Nrup linux-2.6.27-rc1.org/fs/jbd/transaction.c linux-2.6.27-rc1/fs/jbd/transaction.c
> > > > >> > --- linux-2.6.27-rc1.org/fs/jbd/transaction.c	2008-07-29 19:28:47.000000000 +0900
> > > > >> > +++ linux-2.6.27-rc1/fs/jbd/transaction.c	2008-07-29 20:40:12.000000000 +0900
> > > > >> > @@ -1764,6 +1764,12 @@ int journal_try_to_free_buffers(journal_
> > > > >> >  	*/
> > > > >> >  	if (ret == 0 && (gfp_mask & __GFP_WAIT) && (gfp_mask & __GFP_FS)) {
> > > > >> >  		journal_wait_for_transaction_sync_data(journal);
> > > > >> > +
> > > > >> > +		bh = head;
> > > > >> > +		do {
> > > > >> > +			while (atomic_read(&bh->b_count))
> > > > >> > +				schedule();
> > > > >> > +		} while ((bh = bh->b_this_page) != head);
> > > > >> >  		ret = try_to_free_buffers(page);
> > > > >> >  	}
> > > > >> 
> > > > >> The loop is problematic.  If the scheduler decides to keep running this
> > > > >> task then we have a busy loop.  If this task has realtime policy then
> > > > >> it might even lock up the kernel.
> > > > >> 
> > > > >
> > > > >ocfs2 calls journal_try_to_free_buffers too, looping on b_count might
> > > > >not be the best idea there either.
> > > > >
> > > > >This code gets called from releasepage, which is used in places other
> > > > >than the O_DIRECT invalidation paths, so I'd be worried about
> > > > >performance problems here.
> > > > >
> > > > 
> > > > try_to_release_page has a gfp_mask parameter. So when try_to_release_page
> > > > is called from a performance-sensitive path, these gfp_mask flags should not be set.
> > > > The b_count check loop is inside the (gfp_mask & __GFP_WAIT) && (gfp_mask & __GFP_FS) check.
> > > 
> > > Looks like try_to_free_pages will go into releasepage with wait & fs
> > > both set.  This kind of change would make me very nervous.
> > > 
> > 
> > Hi Chris,
> > 
> > The gfp_mask that try_to_free_pages() takes from its caller is passed
> > down to try_to_release_page().  Based on the meaning of __GFP_WAIT and
> > __GFP_FS, if the upper-level caller sets these two flags, I assume the
> > upper-level caller expects a delay and a wait for the fs to finish?
> > 
> > 
> > But I agree that using a loop in journal_try_to_free_buffers() to wait
> > for the busy bh to release the counter is expensive...
> 
> I rediscovered your old thread about trying to do this in a launder_page
> call ;)
  Yes, we thought about using launder_page() before :).

> Does it make more sense to fix do_launder_page to call into the FS on
> every page, and let the FS check for PageDirty on its own?  That way
> invalidate_inode_pages2_range basically gets its own private call into
> the FS that says wait around until this page is really free.
  That would certainly work as well. But IMHO waiting for a ->writepage()
call to finish isn't really a big deal even in try_to_release_page() if
__GFP_FS (and __GFP_WAIT) is set. The only problem is that there is no
effective way to do so, which is why Hisashi used that "wait for b_count to
drop" loop, which looks really scary, and I don't like it either.
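
For readers following the thread, here is a minimal userspace sketch of the
two pieces under discussion: the gfp_mask gate and a single non-blocking pass
over the page's buffer ring that reports a busy buffer instead of spinning on
it. This is not kernel code and not a proposed patch. struct mock_bh, the
MOCK_GFP_* values, may_wait_for_fs() and page_has_busy_buffers() are all
hypothetical stand-ins made up for illustration; only the b_count and
b_this_page field names mirror the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for illustration; NOT the kernel's definitions. */
#define MOCK_GFP_WAIT 0x1u
#define MOCK_GFP_FS   0x2u

/* Userspace stand-in for a buffer head; only the two fields used here. */
struct mock_bh {
	int b_count;                 /* reference count, like bh->b_count */
	struct mock_bh *b_this_page; /* circular list of buffers on one page */
};

/* The gate from the quoted patch: both flags must be set before waiting. */
static bool may_wait_for_fs(unsigned int gfp_mask)
{
	return (gfp_mask & MOCK_GFP_WAIT) && (gfp_mask & MOCK_GFP_FS);
}

/*
 * Walk the ring exactly once and report whether any buffer is still
 * referenced. Unlike the loop in the quoted patch, this never spins:
 * the caller decides what to do when a buffer is busy.
 */
static bool page_has_busy_buffers(struct mock_bh *head)
{
	struct mock_bh *bh = head;

	assert(head != NULL);
	do {
		if (bh->b_count > 0)
			return true;
	} while ((bh = bh->b_this_page) != head);

	return false;
}
```

A caller in this sketch would check may_wait_for_fs(gfp_mask) first and then
treat a true result from page_has_busy_buffers() as "give up for now" rather
than rescheduling until the count drops. How to actually wait for the
outstanding reference is the open question in this thread, and the sketch
does not answer it.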

									Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR