On Wed, 13 Jun 2012 14:37:51 +0300
Artem Bityutskiy <dedekind1@xxxxxxxxx> wrote:

> From: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
>
> This patch makes hfsplus stop using the VFS '->write_super()' method along with
> the 's_dirt' superblock flag, because they are on their way out.
>
> The whole "superblock write-out" VFS infrastructure is served by the
> 'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
> writes out all dirty superblocks using the '->write_super()' call-back. But the
> problem with this thread is that it wastes power by waking up the system every
> 5 seconds, even if there are no diry superblocks, or there are no client
> file-systems which would need this (e.g., btrfs does not use
> '->write_super()'). So we want to kill it completely and thus, we need to make
> file-systems to stop using the '->write_super()' VFS service, and then remove
> it together with the kernel thread.
>
> Tested using fsstress from the LTP project.
>
>
> ...
>
> --- a/fs/hfsplus/hfsplus_fs.h
> +++ b/fs/hfsplus/hfsplus_fs.h
> @@ -153,8 +153,11 @@ struct hfsplus_sb_info {
>  	gid_t gid;
>
>  	int part, session;
> -
>  	unsigned long flags;
> +
> +	int work_queued; /* non-zero delayed work is queued */

This would be a little nicer if it had the bool type.

> +	struct delayed_work sync_work; /* FS sync delayed work */
> +	spinlock_t work_lock; /* protects sync_work and work_queued */

I'm not sure that this lock really needs to exist.

> -static void hfsplus_write_super(struct super_block *sb)
> +static void delayed_sync_fs(struct work_struct *work)
>  {
> -	if (!(sb->s_flags & MS_RDONLY))
> -		hfsplus_sync_fs(sb, 1);
> -	else
> -		sb->s_dirt = 0;
> +	struct hfsplus_sb_info *sbi;
> +
> +	sbi = container_of(work, struct hfsplus_sb_info, sync_work.work);
> +
> +	spin_lock(&sbi->work_lock);
> +	sbi->work_queued = 0;
> +	spin_unlock(&sbi->work_lock);

Here it is "protecting" a single write.

> +	hfsplus_sync_fs(sbi->alloc_file->i_sb, 1);
> +}
> +
> +void hfsplus_mark_mdb_dirty(struct super_block *sb)
> +{
> +	struct hfsplus_sb_info *sbi = HFSPLUS_SB(sb);
> +	unsigned long delay;
> +
> +	if (sb->s_flags & MS_RDONLY)
> +		return;
> +
> +	spin_lock(&sbi->work_lock);
> +	if (!sbi->work_queued) {
> +		delay = msecs_to_jiffies(dirty_writeback_interval * 10);
> +		queue_delayed_work(system_long_wq, &sbi->sync_work, delay);
> +		sbi->work_queued = 1;
> +	}
> +	spin_unlock(&sbi->work_lock);
>  }

And I think it could be made to go away here, perhaps by switching to
test_and_set_bit or similar.

And I wonder about the queue_delayed_work().  iirc this does nothing to
align timer expiries, so someone who has a lot of filesystems could end
up with *more* timer wakeups.  Shouldn't we do something here to make
the system do larger amounts of work per timer expiry?  Such as the
timer-slack infrastructure?

It strikes me that this whole approach improves the small system with
little write activity, but makes things worse for the larger system
with a lot of filesystems?
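
For illustration, the test_and_set_bit idea can be sketched in userspace.
This is not kernel code and not the patch author's implementation: C11
atomic_flag stands in for a kernel atomic bit operation, and
struct sync_state, sync_state_init(), mark_dirty() and run_sync() are
hypothetical names.  The queue_calls counter only marks the place where
queue_delayed_work() would be called.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the fields the patch adds to
 * struct hfsplus_sb_info.  No spinlock is needed in this sketch. */
struct sync_state {
	atomic_flag work_queued;	/* set while a delayed sync is pending */
	atomic_int queue_calls;		/* times the work would have been queued */
	atomic_int sync_calls;		/* times the sync handler has run */
};

static void sync_state_init(struct sync_state *s)
{
	atomic_flag_clear(&s->work_queued);
	atomic_init(&s->queue_calls, 0);
	atomic_init(&s->sync_calls, 0);
}

/* Analogue of hfsplus_mark_mdb_dirty().  Returns true if this call is
 * the one that queued the work. */
static bool mark_dirty(struct sync_state *s)
{
	/* atomic_flag_test_and_set() returns the previous value, so only
	 * the caller that flips the flag from clear to set gets false. */
	if (atomic_flag_test_and_set(&s->work_queued))
		return false;	/* already pending, nothing to do */

	/* The real code would call queue_delayed_work() here. */
	atomic_fetch_add(&s->queue_calls, 1);
	return true;
}

/* Analogue of delayed_sync_fs().  The flag is cleared before syncing,
 * so anything dirtied while the sync runs queues a fresh work item. */
static void run_sync(struct sync_state *s)
{
	atomic_flag_clear(&s->work_queued);

	/* The real code would call hfsplus_sync_fs() here. */
	atomic_fetch_add(&s->sync_calls, 1);
}
```

Calling mark_dirty() twice in a row queues only once; after run_sync()
clears the flag, the next mark_dirty() queues again.  One difference
from the locked version is that the flag is set before the work is
queued rather than after.  That should be harmless, because the handler
cannot run, and so cannot clear the flag, until the work has been
queued.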