Re: automatic testing of cgroup writeback limiting

On 12/01/2015 05:38 PM, Tejun Heo wrote:
> As opposed to pages.  cgroup ownership is tracked per inode, not per
> page, so if multiple cgroups write to the same inode at the same time,
> some IOs will be incorrectly attributed.

I can't think of use cases where this could become a problem.
If more than one user/container/VM is allowed to write to the
same file at any one time, isolation is probably absent anyway ;-)
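
To make the per-inode attribution concrete, here is a toy model. It is
not kernel code: the class and method names are made up for illustration,
and it deliberately ignores any later change of inode ownership. The
first cgroup to dirty an inode becomes its owner, and every write is
charged to that owner regardless of who issued it.

```python
from collections import Counter


class ToyInode:
    """Hypothetical stand-in for an inode with a single owning cgroup."""

    def __init__(self):
        self.owner = None  # cgroup that first dirtied this inode

    def dirty(self, cgroup, pages, charges):
        # Ownership is per inode, not per page: the first writer claims it.
        if self.owner is None:
            self.owner = cgroup
        # All dirtied pages are charged to the owner, not to the writer.
        charges[self.owner] += pages


charges = Counter()
shared = ToyInode()
shared.dirty("cgroup_a", 2, charges)
shared.dirty("cgroup_b", 1, charges)  # attributed to cgroup_a in this model
```

In this sketch cgroup_b's page ends up on cgroup_a's bill, which is the
misattribution described above.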

> cgroup ownership is per-inode.  IO throttling is per-device, so as
> long as multiple filesystems map to the same device, they fall under
> the same limit.

Good, that's why I assumed it would be useful to include a case with
more than one filesystem on the same device in the test scenario, just
to find out whether there are unexpected issues when more than one
filesystem uses the same underlying device.
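
As a sketch of what "per-device" means for such a test: assuming the
limit is set through the cgroup v2 io.max file, where each line is keyed
by the device's MAJ:MIN number, a single line covers every filesystem
living on that device. The helper name and the device number 8:16 below
are only examples.

```python
def io_max_line(major, minor, rbps=None, wbps=None):
    """Format one line for a cgroup v2 io.max file (assumed interface)."""
    fields = [f"{major}:{minor}"]  # the limit is keyed by device, not by filesystem
    if rbps is not None:
        fields.append(f"rbps={rbps}")  # read limit in bytes per second
    if wbps is not None:
        fields.append(f"wbps={wbps}")  # write limit in bytes per second
    return " ".join(fields)


# Example: cap writes to a (hypothetical) device 8:16 at 10 MiB/s.
line = io_max_line(8, 16, wbps=10 * 1024 * 1024)
```

The resulting string would be written to the io.max file of the cgroup
under test; the sketch only builds the line and does not touch cgroupfs.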

>>> Metadata IO not throttled - it is owned by the filesystem and hence
>>> root cgroup.
>>
>> Ouch. That kind of defeats the purpose of limiting evil processes'
>> ability to DOS other processes.
>
> cgroup isn't a security mechanism and has to make active tradeoffs
> between isolation and overhead.  It doesn't provide protection against
> malicious users and in general it's a pretty bad idea to depend on
> cgroup for protection against hostile entities.

I wrote of "evil" processes for simplicity, but 99 out of 100 times
it's not intentional "evilness" that makes a process exhaust I/O
bandwidth of some device shared with other users/containers/VMs, it's
usually just bugs, inconsiderate programming or inappropriate use
that makes one process write like crazy, making other
users/containers/VMs suffer.

Wherever strict service level guarantees are relevant, and
applications require writing to storage, you currently cannot
consolidate two or more applications onto the same physical host,
even if they run under separate users/containers/VMs.

I understand there is no short or medium term solution that
would allow isolating processes writing to the same filesystem
(because of the metadata writes), but is it correct to say
that at least VMs, which do not allow the virtual guest to
cause extensive metadata writes on the physical host and only
write into pre-allocated image files, can be safely isolated
by the new "buffered write accounting"?

If so, we'd have to stay away from user or container based isolation
of independently SLA'd applications, but could at least resort to VMs
using image files on a shared filesystem.
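
For clarity, by "pre-allocated" I mean something along these lines: the
image's space is reserved up front, so later guest writes should not
need to allocate new blocks on the host. This is only a sketch; the
function name and the image file name are made up, and it says nothing
about other metadata updates (such as timestamps) the host filesystem
may still perform.

```python
import os
import tempfile


def preallocate_image(path, size_bytes):
    """Create a file and reserve size_bytes of space for it up front."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    try:
        # Ask the filesystem to allocate the whole range now.
        os.posix_fallocate(fd, 0, size_bytes)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "guest.img")  # hypothetical image name
    preallocate_image(image, 1024 * 1024)
    image_size = os.path.getsize(image)
```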

Regards,

Lutz Vieweg


