On Wed, Mar 16 2016, Theodore Ts'o wrote:

> On Wed, Mar 16, 2016 at 09:33:13AM +1100, Dave Chinner wrote:
>>
>> Stale data escaping containment is a security issue. Enabling
>> generic kernel mechanisms to *enable containment escape* is
>> fundamentally wrong, and relying on userspace to Do The Right Thing
>> is even more of a gamble, IMO.
>
> We already have generic kernel mechanisms such as "the block device".

This is a bit of an 'off-the-wall' suggestion, but I agree that these
things that might be of value to user-space file servers do seem a lot
like block devices. So why not make them look exactly like block
devices?

i.e. a new open flag O_BLOCKDEV which, when combined with O_CREAT,
creates a thing that is managed in the filesystem much like a file, but
that appears to user-space like a block device. The major/minor
numbers would be essentially meaningless - the filesystem wouldn't call
init_special_inode() like it does on normal block devices, it would
retain control itself.

That would make the content invisible to backups and rsync and all the
things that Dave has raised as potential concerns. And it would be no
surprise if the contents included stale data because that is exactly
what you get when you create a new logical volume with LVM2.

The block device would initially be of size zero, but could be resized
using fallocate (which soon will work on block devices), which can
request zeros, leave holes, or with Ted's new FALLOC flag (that would
only be permitted on block devices) could allocate uninitialized space.

Rules for using O_BLOCKDEV would still need to be clarified - mount
option, access to underlying block device, CAP_MKNOD .. whatever.

I think that being able to use a filesystem as a logical volume manager
is an extremely interesting idea.... we might even end up with a
filesystem interface on device-mapper :-)

NeilBrown
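
For concreteness, the create-then-resize flow described above might look roughly like this from user space. This is only a sketch under loud assumptions: O_BLOCKDEV does not exist in any kernel, so it is defined here as 0 and the code simply creates a regular file; the helper name create_volume and the path vol0 are made up for illustration; and os.posix_fallocate() stands in for the fallocate step, so it only covers the "request zeros" case, not holes or uninitialized space.

```python
import os
import stat
import tempfile

# Hypothetical flag from the proposal; no kernel defines it. It is 0 here
# so the sketch runs today and simply creates an ordinary regular file.
O_BLOCKDEV = 0


def create_volume(path, size):
    """Create an empty 'volume' at path, grow it to size bytes, return its stat."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL | O_BLOCKDEV, 0o600)
    try:
        # The object starts out with size zero, as in the proposal.
        if os.fstat(fd).st_size != 0:
            raise RuntimeError("new volume is not empty")
        # Grow it by allocating space; the new range reads back as zeros.
        os.posix_fallocate(fd, 0, size)
        return os.fstat(fd)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as d:
    st = create_volume(os.path.join(d, "vol0"), 1 << 20)
    # With a real O_BLOCKDEV the second value would presumably be True.
    print(st.st_size, stat.S_ISBLK(st.st_mode))
```

Tools such as backup programs and rsync generally decide what to copy from the file type in st_mode, which is why presenting the object as a block device rather than a regular file would keep its contents out of their way.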