On 12/29/2011 02:22 PM, Isaku Yamahata wrote:
> > A simpler approach is the open("/dev/umem") returns an mmap()able fd.
> > You need to call an ioctl() to set the size, etc. but then you only
> > operate on that fd.
>
> So you are suggesting that /dev/umem and /dev/umemctl should be introduced
> and the functionality split between them?

No; perhaps I'm missing some functionality, but I'm suggesting

  fd = open("/dev/umem");
  ftruncate(fd, size);
  struct umem_config config = { ... };
  ioctl(fd, UMEM_CONFIG, &config);
  mmap(..., fd, size);

> > IMO, it would be better to avoid names, and use opaque __u64 identifiers
> > assigned by userspace, or perhaps file descriptors generated by the
> > kernel.  With names come the complications of namespaces, etc.  One user
> > can DoS another by grabbing a name that it knows the other user wants to
> > use.
>
> So how about the kernel assigning identifiers which are system-global?

Depends on what you do with the identifiers. Something like reattach needs
security; you can't just reattach to any random umem segment. It's really
best to stick with file descriptors, which already have a security model.

> > > +
> > > +struct umem_create {
> > > +	__u64 size;		/* in bytes */
> > > +	__s32 umem_fd;
> > > +	__s32 shmem_fd;
> > > +	__u32 async_req_max;
> > > +	__u32 sync_req_max;
> > > +	struct umem_name name;
> > > +};
> > > +
> > > +struct umem_page_request {
> > > +	__u64 __user *pgoffs;
> >
> > Pointers change their size in 32-bit and 64-bit userspace, best to avoid
> > them.
>
> Ah yes, right. How about the following?
>
>     struct {
>         __u32 nr;
>         __u32 padding;
>         __u64 pgoffs[0];
>     }

Sure. If we use a pipe to transport requests, you can just send them as a
sequence of __u64 addresses.
> > > +
> > > +/* ioctl for umem fd */
> > > +#define UMEM_GET_PAGE_REQUEST	_IOWR(UMEMIO, 0x10, struct umem_page_request)
> > > +#define UMEM_MARK_PAGE_CACHED	_IOW (UMEMIO, 0x11, struct umem_page_cached)
> >
> > You could make the GET_PAGE_REQUEST / MARK_PAGE_CACHED protocol run over
> > file descriptors, instead of an ioctl.  It allows you to implement the
> > other side in either the kernel or userspace.  This is similar to how
> > kvm uses an eventfd for communication with vhost-net in the kernel, or
> > an implementation in userspace.
>
> Do you mean that read/write on file descriptors is better than ioctl?
> Okay, it would be easy to convert the ioctls into read/write.

Yes, and they already provide synchronization. If you want to implement a
umem provider over RDMA in the kernel, it's easy to add: it's not trivial
for the kernel to issue ioctls, but reads and writes are easy. It's also
easy to pass file descriptors among processes.

How do FUSE/CUSE pass requests?

-- 
error compiling committee.c: too many arguments to function