Re: [Lsf-pc] [LSF/MM ATTEND] Filesystems -- Btrfs, cgroups, Storage topics from Facebook

Hello, Chris, Jan.

On Thu, Jan 02, 2014 at 03:21:15PM +0000, Chris Mason wrote:
> On Thu, 2014-01-02 at 07:46 +0100, Jan Kara wrote:
> > In an ideal world you could compute writeback throughput for each memcg
> > (and writeback from a memcg would be accounted in a proper blkcg - we would
> > need unified memcg & blkcg hierarchy for that), take into account number of
> > dirty pages in each memcg, and compute dirty rate according to these two
> > numbers. But whether this can work in practice heavily depends on the memcg
> > size and how smooth / fair can the writeback from different memcgs be so
> > that we don't have excessive stalls and throughput estimation errors...
> 
> [ Adding Tejun, Vivek and Li from another thread ]
> 
> I do agree that a basket of knobs is confusing and it doesn't really
> help the admin.
> 
> My first idea was a complex system where the controller in the block
> layer and the BDI flushers all communicated about current usage and
> cooperated on a single set of reader/writer rates.  I think it could
> work, but it'll be fragile.

One thing I do agree on is that the bdi would have to play some role.

> But there are a limited number of non-pagecache methods to do IO.  Why
> not just push the accounting and throttling for O_DIRECT into a new BDI
> controller idea?  Tejun was just telling me how he'd rather fix the
> existing controllers than add a new one, but I think we can have a much
> better admin experience by having a single entry point based on
> BDIs.

But if we have to make bdis blkcg-aware, I think the better way to do
it is to split them per cgroup.  That's what's being done in the lower
layer anyway.  We split request queues into multiple queues according
to cgroup configuration.  Things which can affect request issue and
completion, such as request allocation, are also split, and each such
split queue is used for resource provisioning.

What we're missing is a way to make that split visible in the upper
layers for writeback.  It seems rather clear to me that that's the
right way to approach the problem, rather than implementing separate
control for writeback and somehow coordinating it with the rest.
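
As a rough illustration only, here is a toy Python model of what a
per-cgroup split of bdi writeback state might look like.  It is not
kernel code: the class names, the per-cgroup weight knob, and the
weight-proportional bandwidth split are all assumptions made for the
sketch, and none of them is an existing interface.

```python
from dataclasses import dataclass, field


@dataclass
class CgroupWriteback:
    """Hypothetical per-cgroup slice of a bdi's writeback state."""
    weight: int            # assumed per-cgroup share, not a real knob
    dirty_pages: int = 0   # pages dirtied by this cgroup, not yet written


@dataclass
class SplitBdi:
    """Toy bdi whose writeback state is split per cgroup, mirroring the
    per-cgroup request queues described for the lower layer."""
    bandwidth: int                              # pages/sec the device can write
    groups: dict = field(default_factory=dict)  # cgroup name -> CgroupWriteback

    def add_group(self, name, weight):
        self.groups[name] = CgroupWriteback(weight)

    def dirty(self, name, pages):
        self.groups[name].dirty_pages += pages

    def writeback_budget(self):
        """Divide device bandwidth by weight among cgroups that currently
        have dirty pages (an assumed policy, chosen for simplicity)."""
        active = {n: g for n, g in self.groups.items() if g.dirty_pages > 0}
        total = sum(g.weight for g in active.values())
        return {n: self.bandwidth * g.weight // total
                for n, g in active.items()}
```

In this model the upper layer sees one writeback context per cgroup,
so a cgroup with no dirty pages takes no share and the remaining ones
divide the device among themselves by weight.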

Thanks.

-- 
tejun