Re: [PATCH 06/13] writeback: bdi write bandwidth estimation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2010-11-24 at 21:46 +0800, Wu Fengguang wrote:
> On Wed, Nov 24, 2010 at 09:42:09PM +0800, Peter Zijlstra wrote:
> > On Wed, 2010-11-24 at 21:20 +0800, Wu Fengguang wrote:
> > > >         (jiffies - bdi->write_bandwidth_update_time < elapsed)
> > > 
> > > this will be true if someone else has _done_ overlapped estimation,
> > > otherwise it will equal:
> > > 
> > >         jiffies - bdi->write_bandwidth_update_time == elapsed
> > > 
> > > Sorry the comment needs updating. 
> > 
> > Right, but its racy as hell..
> 
> Yeah, for N concurrent dirtiers, plus the background flusher, only
> one is able to update write_bandwidth[_update_time]..

Wrong, nr_cpus are, they could all observe the old value before seeing
the update of the variable.

Why not something like the below, which keeps the stamps per bdi and
serializes on a lock (trylock, you only need a single updater at any one
time anyway):

probably got the math wrong, but the idea should be clear, you can even
add an explicit bdi_update_bandwidth_stamps() function which resets the
stamps to the current situation in order to skip periods of low
throughput (that would need to do spin_lock).

---
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 4ce34fa..de690c3 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -40,6 +40,7 @@ typedef int (congested_fn)(void *, int);
 enum bdi_stat_item {
 	BDI_RECLAIMABLE,
 	BDI_WRITEBACK,
+	BDI_WRITTEN,
 	NR_BDI_STAT_ITEMS
 };
 
@@ -88,6 +89,11 @@ struct backing_dev_info {
 
 	struct timer_list laptop_mode_wb_timer;
 
+	spinlock_t bw_lock;
+	unsigned long bw_time_stamp;
+	unsigned long bw_write_stamp;
+	int write_bandwidth;
+
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *debug_dir;
 	struct dentry *debug_stats;
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 027100d..a934fe9 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -661,6 +661,11 @@ int bdi_init(struct backing_dev_info *bdi)
 	bdi->dirty_exceeded = 0;
 	err = prop_local_init_percpu(&bdi->completions);
 
+	spin_lock_init(&bdi->bw_lock);
+	bdi->bw_time_stamp = jiffies;
+	bdi->bw_write_stamp = 0;
+	bdi->write_bandwidth = 100 << (20 - PAGE_SHIFT); /* 100 MB/s */
+
 	if (err) {
 err:
 		while (i--)
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index b840afa..f3f5c24 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -219,6 +219,7 @@ int dirty_bytes_handler(struct ctl_table *table, int write,
  */
 static inline void __bdi_writeout_inc(struct backing_dev_info *bdi)
 {
+	__inc_bdi_state(bdi, BDI_WRITTEN);
 	__prop_inc_percpu_max(&vm_completions, &bdi->completions,
 			      bdi->max_prop_frac);
 }
@@ -238,6 +239,35 @@ void task_dirty_inc(struct task_struct *tsk)
 	prop_inc_single(&vm_dirties, &tsk->dirties);
 }
 
+void bdi_update_write_bandwidth(struct backing_dev_info *bdi)
+{
+	unsigned long time_now, write_now;
+	long time_delta, write_delta;
+	long bw;
+
+	if (!spin_try_lock(&bdi->bw_lock))
+		return;
+
+	write_now = bdi_stat(bdi, BDI_WRITTEN);
+	time_now = jiffies;
+
+	write_delta = write_now - bdi->bw_write_stamp;
+	time_delta = time_now - bdi->bw_time_stamp;
+
+	/* rate-limit, only update once every 100ms */
+	if (time_delta < HZ/10 || !write_delta)
+		goto unlock;
+
+	bdi->bw_write_stamp = write_now;
+	bdi->bw_time_stamp = time_now;
+
+	bw = write_delta * HZ / time_delta;
+	bdi->write_bandwidth = (bdi->write_bandwidth + bw + 3) / 4;
+
+unlock:
+	spin_unlock(&bdi->bw_lock);
+}
+
 /*
  * Obtain an accurate fraction of the BDI's portion.
  */

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux