Hi all, The specific problem I need to solve was, userspace needs to pass a socket fd to my virtual device, so that it may connect (or reconnect) to a cluster. I've played with a few different ways of doing this, including using the proposed libdevmapper DM_TARGET_MSG interface and hacking my own custom socket connection interface. The proposed DM_TARGET_MSG interface will in fact do what I want, but it has some irritations: - Why bother ioctling the dm control device when we can ioctl the virtual device directly? - What is the advantage of passing the sector address instead a direct index into the target table? - Not passing the length of the message string is evil - I don't see the purpose of the __dev_status call in the target_message function - That poor abused ioctl struct doesn't need more cruft grafted onto it. I talked this over with Alasdair and, finding him receptive to at least some of these points, thought I might as well take a run at rerolling my custom socket interface as a more general message interface. The result of that was only a few lines longer than the single-purpose original, and pleases me at least. Let's see what you think of it. Some notes: - The message is copied via the kernel stack, so it has to be fairly small. The current concensus is seems to be that 64 bytes isn't too abusive. Otherwise we have to go kmalloc things, which can get scary in PF_MEMALLOC situations, or fuss around with preallocation. In my opinion, if we have more than 64 bytes to pass an ioctl is the wrong way to do it, but you can always pass a user space pointerin the message if you absolutely must. I for one am completely satisfied with a 64 byte message limit, and actually only need 4 or 8 bytes depending on platform. - Constructing a proper ioctl number is problematic since the length of the passed structure is dynamic. I used the length of the fixed part of this ioctl, which is 3 ints. The ioctl number construction kit stuff is stupid anyway, but at least I've made a valiant attempt to fit in. - I index the desired target directly instead of passing a sector address as in the DM_TARGET_MSG interface, because that way the application doesn't have to do arithmetic on the device table to figure out how to message the right target. This is _much_ saner from the application's point of view. In implementation, it's the same, one line either way. - I ioctl the virtual device directly, not the device mapper control device. This is much easier for the application, more efficient and less racy. - This is 2.4 patch. If I wasn't on 2.4 at the time I probably would have used the udm MSG interface, and just put up with the warts, but I wasn't, so I decided to try an alternate approach. (Figuring out libdevmapper was, um, fun.) - For 2.6 I'd have to restore the virtual device ioctl hook that got dropped somewhere along the way. That's ok, it's easy. Here is the prototype for the .message target method: int (*dm_message_fn)(struct dm_target *target, int tag, void *msg, int len); Here is an example call from userspace. ioctl(fd, DM_DEVMESSAGE, (int[4]){ 0, 0, sizeof(int), sock}) It's just one line, there's no need for a library. The hardest part is finding the header file that defines DM_DEVMESSAGE. The second 0 is the message tag. Actually, I must pass two different kinds of sockets, so there's no getting away from some kind of tag. The only question is, should it be in the message, or explicitly handled as a parameter to the target method. If it were done the second way, the call would look like: ioctl(fd, DM_DEVMESSAGE, (int[4]){ 0, 2*sizeof(int), 0, sock}) So it is marginally nicer with the tag handled explicitly, and only a miniscule amount of extra code. A message generally needs a tag, so... After I get my socket connected there's no further need for ioctl traffic to my device since further commands to go over the socket, except that if the socket breaks, it needs to be reconnected via the message ioctl. Regards, Daniel diff -up --recursive 2.4.26.csnap.clean/drivers/md/dm.c 2.4.26.csnap/drivers/md/dm.c --- 2.4.26.csnap.clean/drivers/md/dm.c 2004-07-05 01:08:46.000000000 +0000 +++ 2.4.26.csnap/drivers/md/dm.c 2004-07-09 00:13:39.000000000 +0000 @@ -16,6 +16,7 @@ #include <linux/major.h> #include <linux/kdev_t.h> #include <linux/lvm.h> +#include <linux/dm-ioctl.h> #include <asm/uaccess.h> @@ -451,6 +452,30 @@ static inline sector_t volume_size(kdev_ return blk_size[major(dev)][minor(dev)] << 1; } +static int dm_message(struct inode *inode, int ith, int tag, void *msg, int len) +{ + struct mapped_device *md; + struct dm_table *table; + struct dm_target *target; + int err = -ENXIO; + + if (!(md = get_kdev(inode->i_rdev))) + goto out; + if (!(table = dm_get_table(md))) + goto out1; + if (ith >= dm_table_get_num_targets(table)) + goto out2; + target = dm_table_get_target(table, ith); + if (target->type->message) + err = target->type->message(target, tag, msg, len); +out2: + dm_table_put(table); +out1: + dm_put(md); +out: + return err; +} + /* FIXME: check this */ static int dm_blk_ioctl(struct inode *inode, struct file *file, unsigned int command, unsigned long a) @@ -492,6 +517,17 @@ static int dm_blk_ioctl(struct inode *in case LV_BMAP: return dm_user_bmap(inode, (struct lv_bmap *) a); + case DM_MESSAGE: { + int head[3]; + if (!copy_from_user(head, (void *)a, sizeof(head))) { + int ith = head[0], tag = head[1], len = head[2]; + char msg[64]; + if (len <= 64 && !copy_from_user(msg, (int *)a + 3, len)) + return dm_message(inode, ith, tag, msg, len); + } + return -EFAULT; + } + default: DMWARN("unknown block ioctl 0x%x", command); return -ENOTTY; diff -up --recursive 2.4.26.csnap.clean/include/linux/device-mapper.h 2.4.26.csnap/include/linux/device-mapper.h --- 2.4.26.csnap.clean/include/linux/device-mapper.h 2004-07-05 01:08:46.000000000 +0000 +++ 2.4.26.csnap/include/linux/device-mapper.h 2004-07-08 19:41:44.000000000 +0000 @@ -56,6 +56,7 @@ typedef void (*dm_suspend_fn) (struct dm typedef void (*dm_resume_fn) (struct dm_target *ti); typedef int (*dm_status_fn) (struct dm_target * ti, status_type_t status_type, char *result, unsigned int maxlen); +typedef int (*dm_message_fn)(struct dm_target *target, int tag, void *msg, int len); void dm_error(const char *message); @@ -82,6 +83,7 @@ struct target_type { dm_suspend_fn suspend; dm_resume_fn resume; dm_status_fn status; + dm_message_fn message; }; struct dm_target { diff -up --recursive 2.4.26.csnap.clean/include/linux/dm-ioctl.h 2.4.26.csnap/include/linux/dm-ioctl.h --- 2.4.26.csnap.clean/include/linux/dm-ioctl.h 2004-07-05 01:08:46.000000000 +0000 +++ 2.4.26.csnap/include/linux/dm-ioctl.h 2004-07-08 19:41:44.000000000 +0000 @@ -198,6 +198,7 @@ enum { /* Added later */ DM_LIST_VERSIONS_CMD, + DM_MESSAGE_CMD, }; #define DM_IOCTL 0xfd @@ -219,6 +220,7 @@ enum { #define DM_TABLE_STATUS _IOWR(DM_IOCTL, DM_TABLE_STATUS_CMD, struct dm_ioctl) #define DM_LIST_VERSIONS _IOWR(DM_IOCTL, DM_LIST_VERSIONS_CMD, struct dm_ioctl) +#define DM_MESSAGE _IOW(DM_IOCTL, DM_MESSAGE_CMD, int[3]) #define DM_VERSION_MAJOR 4 #define DM_VERSION_MINOR 1