On Thu, May 07, 2009 at 11:04:50AM +0200, Andrea Righi wrote:
> On Wed, May 06, 2009 at 05:52:35PM -0400, Vivek Goyal wrote:
> > > > Without io-throttle patches
> > > > ---------------------------
> > > > - Two readers, first BE prio 7, second BE prio 0
> > > >
> > > > 234179072 bytes (234 MB) copied, 4.12074 s, 56.8 MB/s
> > > > High prio reader finished
> > > > 234179072 bytes (234 MB) copied, 5.36023 s, 43.7 MB/s
> > > >
> > > > Note: There is no service differentiation between prio 0 and prio 7 tasks
> > > > with the io-throttle patches.
> > > >
> > > > Test 3
> > > > ======
> > > > - Run one RT reader and one BE reader in the root cgroup without any
> > > >   limitations. I guess this should mean unlimited BW, and behavior should
> > > >   be the same as with CFQ without the io-throttling patches.
> > > >
> > > > With io-throttle patches
> > > > ========================
> > > > Ran the test 4 times because I was getting different results in different
> > > > runs.
> > > >
> > > > - Two readers, one RT prio 0, the other BE prio 7
> > > >
> > > > 234179072 bytes (234 MB) copied, 2.74604 s, 85.3 MB/s
> > > > 234179072 bytes (234 MB) copied, 5.20995 s, 44.9 MB/s
> > > > RT task finished
> > > >
> > > > 234179072 bytes (234 MB) copied, 4.54417 s, 51.5 MB/s
> > > > RT task finished
> > > > 234179072 bytes (234 MB) copied, 5.23396 s, 44.7 MB/s
> > > >
> > > > 234179072 bytes (234 MB) copied, 5.17727 s, 45.2 MB/s
> > > > RT task finished
> > > > 234179072 bytes (234 MB) copied, 5.25894 s, 44.5 MB/s
> > > >
> > > > 234179072 bytes (234 MB) copied, 2.74141 s, 85.4 MB/s
> > > > 234179072 bytes (234 MB) copied, 5.20536 s, 45.0 MB/s
> > > > RT task finished
> > > >
> > > > Note: Out of the 4 runs, it looks like twice there was complete priority
> > > > inversion and the RT task finished after the BE task. The other two times,
> > > > the difference between the BW of the RT and BE tasks was much smaller than
> > > > without the patches. In fact, once it was almost the same.
> > >
> > > This is strange. If you don't set any limit there shouldn't be any
> > > difference with respect to the other case (without the io-throttle patches).
> > >
> > > At worst there is a small overhead from task_to_iothrottle(), under
> > > rcu_read_lock(). I'll repeat this test ASAP and see if I can reproduce
> > > this strange behaviour.
> >
> > Yes, I also found this strange. At least in the root group there should not
> > be any behavior change (at most one might expect a little drop in throughput
> > because of the extra code).
>
> Hi Vivek,
>
> I'm not able to reproduce the strange behaviour above.
>
> Which commands are you running exactly? Is the system isolated (stupid
> question), with no cron jobs or background tasks doing IO during the tests?
>
> Following the script I've used:
>
> $ cat test.sh
> #!/bin/sh
> echo 3 > /proc/sys/vm/drop_caches
> ionice -c 1 -n 0 dd if=bigfile1 of=/dev/null bs=1M 2>&1 | sed "s/\(.*\)/RT: \1/" &
> cat /proc/$!/cgroup | sed "s/\(.*\)/RT: \1/"
> ionice -c 2 -n 7 dd if=bigfile2 of=/dev/null bs=1M 2>&1 | sed "s/\(.*\)/BE: \1/" &
> cat /proc/$!/cgroup | sed "s/\(.*\)/BE: \1/"
> for i in 1 2; do
>     wait
> done
>
> And the results on my PC:
>
> 2.6.30-rc4
> ~~~~~~~~~~
> $ sudo sh test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 21.3406 s, 11.5 MB/s
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 11.989 s, 20.5 MB/s
> $ sudo sh test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 23.4436 s, 10.5 MB/s
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 11.9555 s, 20.5 MB/s
> $ sudo sh test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 21.622 s, 11.3 MB/s
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 11.9856 s, 20.5 MB/s
> $ sudo sh test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 21.5664 s, 11.4 MB/s
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 11.8522 s, 20.7 MB/s
>
> 2.6.30-rc4 + io-throttle, no BW limit, both tasks in the root cgroup
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> $ sudo sh ./test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 23.6739 s, 10.4 MB/s
> BE: cgroup 4:blockio:/
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 12.2853 s, 20.0 MB/s
> RT: 4:blockio:/
> $ sudo sh ./test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 23.7483 s, 10.3 MB/s
> BE: cgroup 4:blockio:/
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 12.3597 s, 19.9 MB/s
> RT: 4:blockio:/
> $ sudo sh ./test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 23.6843 s, 10.4 MB/s
> BE: cgroup 4:blockio:/
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 12.4886 s, 19.6 MB/s
> RT: 4:blockio:/
> $ sudo sh ./test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 23.8621 s, 10.3 MB/s
> BE: cgroup 4:blockio:/
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 12.6737 s, 19.4 MB/s
> RT: 4:blockio:/
>
> The difference seems to be just the expected overhead.

BTW, it is possible to reduce the io-throttle overhead even more for
non-io-throttle users (also when CONFIG_CGROUP_IO_THROTTLE is enabled)
using the trick below.
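In short, the trick is a global counter of the configured io-throttle rules:
while the counter is zero, the per-IO hook returns immediately, so systems
that never set a limit pay only one atomic read per IO. As a rough, purely
illustrative userspace sketch of the pattern (the names active_rules and
throttle_hook() are made up here, they are not the io-throttle interface;
the real change is in the patch further below):

/*
 * Illustrative userspace sketch only (not the io-throttle code): keep a
 * global count of configured rules and let the per-IO hook bail out while
 * the count is zero, so "compiled in but unused" stays almost free.
 */
#include <stdatomic.h>
#include <stdio.h>

static atomic_int active_rules;            /* hypothetical rule counter */

/* Hypothetical helpers called when a limiting rule is added or removed. */
static void rule_added(void)   { atomic_fetch_add(&active_rules, 1); }
static void rule_removed(void) { atomic_fetch_sub(&active_rules, 1); }

/*
 * Hypothetical per-IO hook, mirroring the early return added to
 * cgroup_io_throttle() by the patch below.
 */
static int throttle_hook(size_t bytes)
{
        if (atomic_load(&active_rules) == 0)
                return 0;       /* fast path: no rules, nothing to account */
        /* slow path: look up the cgroup rules, account IO, maybe sleep */
        printf("accounting %zu bytes against the configured rules\n", bytes);
        return 0;
}

int main(void)
{
        throttle_hook(4096);    /* no rules configured: immediate return */
        rule_added();
        throttle_hook(4096);    /* now takes the accounting path */
        rule_removed();
        return 0;
}

A simple atomic counter is enough for this because rules are added and
removed rarely, while the hook runs on every IO.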
2.6.30-rc4 + io-throttle + following patch, no BW limit, tasks in root cgroup
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 17.462 s, 14.1 MB/s
BE: 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.7865 s, 20.8 MB/s
RT: 4:blockio:/
$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 18.8375 s, 13.0 MB/s
BE: 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.9148 s, 20.6 MB/s
RT: 4:blockio:/
$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 19.6826 s, 12.5 MB/s
BE: 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.8715 s, 20.7 MB/s
RT: 4:blockio:/
$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 18.9152 s, 13.0 MB/s
BE: 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.8925 s, 20.6 MB/s
RT: 4:blockio:/

[ To be applied on top of io-throttle v16 ]

Signed-off-by: Andrea Righi <righi.andrea@xxxxxxxxx>
---
 block/blk-io-throttle.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/block/blk-io-throttle.c b/block/blk-io-throttle.c
index e2dfd24..8b45c71 100644
--- a/block/blk-io-throttle.c
+++ b/block/blk-io-throttle.c
@@ -131,6 +131,14 @@ struct iothrottle_node {
 	struct iothrottle_stat stat;
 };
 
+/*
+ * This is a trick to reduce the unneeded overhead when io-throttle is not used
+ * at all. We use a counter of the io-throttle rules; if the counter is zero,
+ * we immediately return from the io-throttle hooks, without accounting IO and
+ * without checking if we need to apply some limiting rules.
+ */
+static atomic_t iothrottle_node_count __read_mostly;
+
 /**
  * struct iothrottle - throttling rules for a cgroup
  * @css: pointer to the cgroup state
@@ -193,6 +201,7 @@ static void iothrottle_insert_node(struct iothrottle *iot,
 {
 	WARN_ON_ONCE(!cgroup_is_locked());
 	list_add_rcu(&n->node, &iot->list);
+	atomic_inc(&iothrottle_node_count);
 }
 
 /*
@@ -214,6 +223,7 @@ iothrottle_delete_node(struct iothrottle *iot, struct iothrottle_node *n)
 {
 	WARN_ON_ONCE(!cgroup_is_locked());
 	list_del_rcu(&n->node);
+	atomic_dec(&iothrottle_node_count);
 }
 
 /*
@@ -250,8 +260,10 @@ static void iothrottle_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
 	 * reference to the list.
 	 */
 	if (!list_empty(&iot->list))
-		list_for_each_entry_safe(n, p, &iot->list, node)
+		list_for_each_entry_safe(n, p, &iot->list, node) {
 			kfree(n);
+			atomic_dec(&iothrottle_node_count);
+		}
 	kfree(iot);
 }
 
@@ -836,7 +848,7 @@ cgroup_io_throttle(struct bio *bio, struct block_device *bdev, ssize_t bytes)
 	unsigned long long sleep;
 	int type, can_sleep = 1;
 
-	if (iothrottle_disabled())
+	if (iothrottle_disabled() || !atomic_read(&iothrottle_node_count))
 		return 0;
 	if (unlikely(!bdev))
 		return 0;

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel