Re: [PATCH] Properly init address_space->writeback_index

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2008-08-14 at 12:05 -0700, Andrew Morton wrote:
> On Thu, 14 Aug 2008 14:13:40 -0400
> Chris Mason <chris.mason@xxxxxxxxxx> wrote:
> 
> > write_cache_pages uses i_mapping->writeback_index to pick up where it
> > left off the last time a given inode was found by pdflush or
> > balance_dirty_pages (or anyone else who sets wbc->range_cyclic)
> > 
> > alloc_inode should set it to a sane value so that writeback doesn't start
> > in the middle of a file.  It is somewhat difficult to notice the bug since
> > write_cache_pages will loop around to the start of the file and the
> > elevator helps hide the resulting seeks.
> > 
> > For whatever reason, Btrfs hits this often.  Unpatched, untarring 30 copies of
> > the linux kernel in series runs at 47MB/s on a single sata drive.  With
> > this fix, it jumps to 62MB/s.

[ ... ]

> inode_cachep's ctor memsets the whole inode* to zero.
> 
> Does btrfs have an ->alloc_inode handler?  If so, perhaps it is btrfs
> which needs to zap that field.
> 

Yes, but I wasn't touching ->i_mapping at all.  I do have a patch in
btrfs now to zero the writeback_index field for now though.  

> Formally, the way slab works (which is arguably silly from a CPU cache
> efficiency POV) is that the object should be returned to slab
> (kmem_cache_free()) in a "constructed" manner.  Which means that this
> zeroing should happen in destroy_inode().
>
> otoh, the fs/inode.c inode creation/destruction code isn't designed
> that way - it chose to initialise everything synchronously in
> alloc_inode(), which is arguably wrong.
> 

At least for filesystems, it seems much less error prone to force us to
init on alloc.

> One wonders if that big memset in inode_init_once() actually does
> anything useful.
>
>
> Ho hum, so what to do?  I'm suspecting that we should just zap the
> kmem_cachep constructor altogether, use a synchronous call to
> inode_init_once() to initialise all inode fields and then remove all
> those open-coded zeroings.
> 
> 
> Something like this?
> 

[ looks good ]

> 
> Note that this assumes that a filesystem which implements
> ->alloc_inode() will call inode_init_once() within its ->alloc_inode().
> 
> Which means, I think, that we can just remove inode_init_once()
> altogether and move its initialisations into alloc_inode() along with
> all the existing ones.
> 
> What do you think?
> 

There's the silent breakage of out of tree FS, but, your patch seems
much cleaner to me.

> 
> I suppose that's all a fun project but doesn't preclude merging your
> patchlet.  I'll do that for now ;)
> 

;) thanks.

-chris


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux