On Sat, Apr 7, 2012 at 3:00 AM, Jan Kara <jack@xxxxxxx> wrote:
>  Hi Vivek,
>
> On Wed 04-04-12 10:51:34, Vivek Goyal wrote:
>> On Tue, Apr 03, 2012 at 11:36:55AM -0700, Tejun Heo wrote:
>> [..]
>> > IIUC, without cgroup, the current writeback code works more or less
>> > like this.  Throwing in cgroup doesn't really change the fundamental
>> > design.  Instead of a single pipe going down, we just have multiple
>> > pipes to the same device, each of which should be treated separately.
>> > Of course, a spinning disk can't be divided that easily and their
>> > performance characteristics will be inter-dependent, but the place to
>> > solve that problem is where the problem is, the block layer.
>>
>> How do you take care of throttling IO to NFS case in this model? Current
>> throttling logic is tied to block device and in case of NFS, there is no
>> block device.
>  Yeah, for throttling NFS or other network filesystems we'd have to come
> up with some throttling mechanism at some other level. The problem with
> throttling at higher levels is that you have to somehow extract information
> from lower levels about amount of work so I'm not completely certain now,
> where would be the right place. Possibly it also depends on the intended
> usecase - so far I don't know about any real user for this functionality...

Remember to distinguish between the two ends of a network file system;
the problems are slightly different. The client has to be able to
expose the number of requests it can issue (and the size of writes, or
equivalently the number of pages it can write at one time) so that
writeback is not done too aggressively. File servers have to be able
to discover the I/O limits of the underlying volume dynamically (not
of the block device, but potentially of a pool of devices) so that
they can tell the client how much I/O it can send.
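
A rough sketch of the two ends described above, in Python for brevity.
Every name here (VolumeLimits, grant_for_client, ClientWindow,
max_pages_per_write) is hypothetical, and the even split across clients
is only an assumption for illustration; it is not how Samba or any NFS
server actually computes its limits.

```python
from dataclasses import dataclass


@dataclass
class VolumeLimits:
    """Hypothetical snapshot of what a server discovered about a volume."""
    max_inflight_requests: int  # requests the whole volume can absorb at once
    max_write_bytes: int        # largest single write the volume handles well


def grant_for_client(limits: VolumeLimits, active_clients: int) -> int:
    """Server side: split the volume's request budget evenly across clients.

    Assumption: an even split. Each client always gets at least one
    request so that no client is starved outright.
    """
    if active_clients < 1:
        raise ValueError("need at least one active client")
    return max(1, limits.max_inflight_requests // active_clients)


def max_pages_per_write(limits: VolumeLimits, page_size: int = 4096) -> int:
    """Express the write size limit as a page count for the writeback code."""
    return max(1, limits.max_write_bytes // page_size)


class ClientWindow:
    """Client side: track how many requests may be in flight to one export."""

    def __init__(self, granted: int) -> None:
        self.granted = granted
        self.inflight = 0

    def can_send(self) -> bool:
        # Writeback should consult this before queueing more dirty pages.
        return self.inflight < self.granted

    def send(self) -> None:
        if not self.can_send():
            raise RuntimeError("window full; writeback should back off")
        self.inflight += 1

    def complete(self, new_grant: int) -> None:
        """A response arrived; it carries the server's updated grant."""
        if self.inflight <= 0:
            raise RuntimeError("no request in flight")
        self.inflight -= 1
        self.granted = new_grant


limits = VolumeLimits(max_inflight_requests=64, max_write_bytes=1 << 20)
window = ClientWindow(grant_for_client(limits, active_clients=4))
```

The point of the sketch is only that the server's grant is derived from
the volume rather than from a single block device, and that the client's
writeback is gated on whatever the latest response allowed.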

For an SMB2 server (Samba), and eventually for NFS, knowing how many
simultaneous requests can be supported will allow the server to sanely
set the number of "credits" on each response, i.e. tell the client how
many requests are allowed in flight to a particular export.

As for block device throttling: other than the file system using such
APIs internally, who would use block-device-specific throttling? Only
the file system knows where it wants to put hot data, and in the case
of btrfs, doesn't the file system manage the storage pool? The block
device should be transparent to the user in the long run, with only
the volume visible.

--
Thanks,

Steve