Re: [LSF/MM TOPIC][ATTEND] protection information and userspace

On 02/07/2013 11:01 AM, Darrick J. Wong wrote:
On Thu, Feb 07, 2013 at 01:40:14AM -0800, Joel Becker wrote:
On Wed, Feb 06, 2013 at 03:34:49PM -0500, Chuck Lever wrote:

On Feb 6, 2013, at 3:24 PM, "Darrick J. Wong" <darrick.wong@xxxxxxxxxx> wrote:

On Wed, Feb 06, 2013 at 01:51:22PM -0600, Ben Myers wrote:
Hi,

I'm interested in discussing how to pass protection information to and from
userspace.  Maybe Martin could be enlisted for the discussion.

I read that some work has already been done in this area but have not been able
to locate it.  It looks like the bio-integrity code already makes it possible
to generate the T10 DIF CRC in the filesystem.  It would be good to be able to
get the guard and application tags back out to backup applications such as
xfsdump.  Enabling other applications to generate their own tags in userspace
is also interesting.

This one's been on my list for a couple of years (and companies) too.  A few
years ago Joel Becker had support for it in his sys_dio proposal (that hasn't
gone anywhere), and more recently I've theorized that we could add a magic
fcntl/ioctl to make the kernel recognize, say, the first iovec of an O_DIRECT
*{read,write}v call as the PI buffer, which I think is similar to how DIX gets
PI data to a disk.  But it's not like I have any code to show for it.
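
To make the first-iovec idea concrete, here is a minimal userspace sketch of the
check a submit path might perform: treat iov[0] as the PI buffer and confirm it
is sized for the data in the remaining iovecs. The helper name, the 512-byte
sector size and the 8 bytes of PI per sector are assumptions of this sketch, not
anything the kernel defines for this purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Assumptions of this sketch: 512-byte sectors, 8 bytes of PI per sector. */
#define SKETCH_SECTOR_SIZE   512u
#define SKETCH_PI_TUPLE_SIZE 8u

/*
 * Hypothetical helper: interpret iov[0] as the PI buffer and iov[1..] as data.
 * Returns 0 and fills the out-pointers when the PI buffer is exactly large
 * enough for the data, -1 otherwise.
 */
static int split_pi_iovec(const struct iovec *iov, int iovcnt,
			  const struct iovec **pi,
			  const struct iovec **data, int *data_cnt)
{
	size_t total = 0;

	assert(pi != NULL && data != NULL && data_cnt != NULL);

	if (iov == NULL || iovcnt < 2)
		return -1;

	/* Sum the data payload carried by every iovec after the first. */
	for (int i = 1; i < iovcnt; i++)
		total += iov[i].iov_len;

	if (total == 0 || total % SKETCH_SECTOR_SIZE != 0)
		return -1;

	/* One PI tuple per sector of data, no more and no less. */
	if (iov[0].iov_len != total / SKETCH_SECTOR_SIZE * SKETCH_PI_TUPLE_SIZE)
		return -1;

	*pi = &iov[0];
	*data = &iov[1];
	*data_cnt = iovcnt - 1;
	return 0;
}
```

With two 512-byte data iovecs the PI iovec would have to be exactly 16 bytes;
anything else is rejected before any I/O is attempted.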

I /think/ it's fairly straightforward to change the directio submit code to
find the userspace PI buffer and amend the block integrity code to attach our
own PI buffer.  You'd still have to let the block layer set the sector # field,
but afaik that won't affect the crc or the app tag.
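
For reference, a rough sketch of the 8-byte T10 DIF tuple shows why the sector
number can be filled in later without disturbing the other fields: the guard tag
is a CRC over the sector data only, and the reference tag lives in its own bytes.
The struct and function names below are made up for illustration; the CRC
parameters (polynomial 0x8BB7, initial value 0, no reflection) are those of
CRC-16/T10-DIF, and using the low 32 bits of the LBA as the reference tag is the
common Type 1 convention, assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One tuple per protected sector; all fields are stored big-endian. */
struct dif_tuple {
	uint8_t guard[2];	/* CRC16 of the sector data */
	uint8_t app[2];		/* application tag, opaque to the device */
	uint8_t ref[4];		/* reference tag (assumed: low 32 bits of LBA) */
};

/* Bitwise CRC-16/T10-DIF: poly 0x8BB7, init 0, MSB first, no final XOR. */
static uint16_t crc_t10dif(const uint8_t *data, size_t len)
{
	uint16_t crc = 0;

	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)((uint16_t)data[i] << 8);
		for (int bit = 0; bit < 8; bit++)
			crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x8BB7)
					     : (uint16_t)(crc << 1);
	}
	return crc;
}

/* Userspace side: compute guard and app tag, leave the ref tag zeroed. */
static void dif_fill(struct dif_tuple *t, const uint8_t *sector, size_t len,
		     uint16_t app_tag)
{
	assert(t != NULL && sector != NULL);

	uint16_t guard = crc_t10dif(sector, len);

	t->guard[0] = (uint8_t)(guard >> 8);
	t->guard[1] = (uint8_t)(guard & 0xff);
	t->app[0] = (uint8_t)(app_tag >> 8);
	t->app[1] = (uint8_t)(app_tag & 0xff);
	memset(t->ref, 0, sizeof(t->ref));
}

/* Block-layer side (in this sketch): only the ref tag bytes are touched. */
static void dif_set_ref(struct dif_tuple *t, uint32_t lba)
{
	assert(t != NULL);

	t->ref[0] = (uint8_t)(lba >> 24);
	t->ref[1] = (uint8_t)(lba >> 16);
	t->ref[2] = (uint8_t)(lba >> 8);
	t->ref[3] = (uint8_t)(lba & 0xff);
}
```

Calling dif_set_ref() after dif_fill() leaves the guard and app bytes exactly as
userspace computed them.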

I hear that the NFS guys want to propose some sort of protocol for transmitting
PI data (across NFS), but I haven't seen anything concrete yet.

I'm writing a requirements document for the NFS protocol which I can discuss at LSF.  The use cases for NFS for now would be virtual disk devices (hypervisors) or direct NFS access to storage from user space.

Like everyone else we are waiting for a magical VFS and user space API to appear that can pass PI to and from storage.

I'm happy to chat about it.  Unfortunately, like Darrick says, sys_dio()
coding hasn't happened.  I do think we're better off with some kind of
explicit API than some magic state on the file.  I mean, even something
like:

	ssize_t write_with_pi(int fd, const void *buf, size_t count,
			      const void *pi, size_t pi_count);

It's not as nice as a non-historical API (eg sys_dio), but it also
probably plays nicer with buffered I/O.
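
A minimal userspace stand-in for such a call might look like the sketch below.
No such syscall exists, so the function only validates that pi_count matches
count and then falls back to plain write(), dropping the PI. The _sketch suffix,
the 512-byte sector size and the 8-byte tuple size are assumptions made for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumptions of this sketch: 512-byte sectors, 8 bytes of PI per sector. */
#define SKETCH_SECTOR_SIZE   512u
#define SKETCH_PI_TUPLE_SIZE 8u

/* Compute how many PI bytes accompany `count` bytes of data; -1 if unaligned. */
static int pi_bytes_for(size_t count, size_t *pi_bytes)
{
	assert(pi_bytes != NULL);

	if (count == 0 || count % SKETCH_SECTOR_SIZE != 0)
		return -1;
	*pi_bytes = count / SKETCH_SECTOR_SIZE * SKETCH_PI_TUPLE_SIZE;
	return 0;
}

/*
 * Hypothetical stand-in with the same parameters as the prototype above.
 * It rejects mismatched sizes with EINVAL, then writes the data only,
 * because there is no kernel interface to hand the PI to.
 */
static ssize_t write_with_pi_sketch(int fd, const void *buf, size_t count,
				    const void *pi, size_t pi_count)
{
	size_t expected;

	if (pi == NULL || pi_bytes_for(count, &expected) != 0 ||
	    pi_count != expected) {
		errno = EINVAL;
		return -1;
	}
	return write(fd, buf, count);
}
```

A caller writing 4096 bytes would pass a 64-byte PI buffer under these
assumptions; any other pi_count fails before the write is attempted.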

I also pondered simply adding a new io_prep_* function + IO_CMD_ code to libaio
and all the other plumbing necessary to make that happen...

void io_prep_preadv_pi(struct iocb *iocb, int fd, const struct iovec *iov,
		       int iovcnt, long long offset, const void *pi,
		       size_t pi_count);

This is also what I've envisioned.
Updating io_prep / async I/O is reasonably easy, as it's been using a separate structure for passing in the I/O details.

Normal read/write calls don't really map, as you simply don't have enough parameters to feed PI information into the kernel.
So for that you'd need to invent a new interface / syscall.

For aio we just need to add additional fields to an existing structure.
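
As a rough sketch of what "additional fields" could mean, the structure below
mimics a control block with two extra members for the PI buffer, plus a prep
helper that fills it in. It deliberately does not use libaio's real struct iocb;
every name and the opcode value are hypothetical. The PI pointer is non-const
here because a read would fill that buffer, which is a choice of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical opcode; not a real IO_CMD_ value. */
enum { SKETCH_CMD_PREADV_PI = 100 };

/* Hypothetical control block: the usual I/O details plus two PI fields. */
struct sketch_iocb {
	int fd;
	int opcode;
	const struct iovec *iov;
	int iovcnt;
	long long offset;
	void *pi;		/* new: buffer receiving protection information */
	size_t pi_count;	/* new: size of that buffer in bytes */
};

/* Fill in a control block for a vectored read that also returns PI. */
static void sketch_prep_preadv_pi(struct sketch_iocb *iocb, int fd,
				  const struct iovec *iov, int iovcnt,
				  long long offset, void *pi, size_t pi_count)
{
	assert(iocb != NULL);

	memset(iocb, 0, sizeof(*iocb));
	iocb->fd = fd;
	iocb->opcode = SKETCH_CMD_PREADV_PI;
	iocb->iov = iov;
	iocb->iovcnt = iovcnt;
	iocb->offset = offset;
	iocb->pi = pi;
	iocb->pi_count = pi_count;
}
```

Existing callers that never set the PI fields would be unaffected, which is the
appeal of extending a structure rather than a fixed argument list.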

So yeah, I'd be interested in that discussion as well.

Cheers,

Hannes
--
Dr. Hannes Reinecke		      zSeries & Storage
hare@xxxxxxx			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

