Folks, from time to time I update valgrind or qemu to work reasonably well with KVM. Now, newer KVMs have the ability to create subdevices of a KVM guest (e.g. an in kernel kvm interrupt controller) with the following ioctl: #define KVM_CREATE_DEVICE _IOWR(KVMIO, 0xe0, struct kvm_create_device) qemu can then work on these devices with the ioctls /* ioctls for fds returned by KVM_CREATE_DEVICE */ #define KVM_SET_DEVICE_ATTR _IOW(KVMIO, 0xe1, struct kvm_device_attr) #define KVM_GET_DEVICE_ATTR _IOW(KVMIO, 0xe2, struct kvm_device_attr) #define KVM_HAS_DEVICE_ATTR _IOW(KVMIO, 0xe3, struct kvm_device_attr) struct kvm_device_attr { __u32 flags; /* no flags currently defined */ __u32 group; /* device-defined */ __u64 attr; /* group-defined */ __u64 addr; /* userspace address of attr data */ }; to communicate with the in-kvm devices to exchange data. There is an interesting problem here: Properly defined ioctls allow to deduct the written or read area of a ioctl. This is useful, e.g. for valgrind. For unknown ioctls, valgrind will decode the ioctl number according to the _IOxx prefixes and deducts the memory area that the kernels reads/writes. This is very helpful for definedness checkins. For specific or more complex ioctls valgrind has special handling. The problem is now, that looking at the current implementations of kvm device attr, all use 0,1,2, etc for group/attr. So by inspecting the ioctl, valgrind cannot find out what device it is talking to without tracking the original create ioctl. So it seems that by using a multiplexing ioctls we actually make the interface less well defined and more error prone than individual ioctls. Another hint to look at are system calls. Older archs (like i386) have an IPC system call that multiplexes several things. But time has shown that having a system for each and every subfunctions is better that a multiplexer. (so newer archs like amd64 dont have an ipc system call they have semop, semget etc. This also makes tools like strace easier to implement. Whats even more puzzling to me: The main complaint about ioctls is the possibility to create arbitrary interfaces into the kernel. Now with the IORW family, we actually had some limited form of control. With an additional uncoordinated group/attr parameter we dropped all of that. So how to process from here? a) leave it as is and accept false positives from valgrind and undecoded straces b) We could enhance the device_addr group/attr values with size macros, e.g the same as _IOWR (do we need direction?) for future users c) Avoid the device api for new code and go back to individual ioctls in KVM The reason for KVM_CREATE_DEVICE seems to be that in qemu we want to model everything as a device. Having an in-kernel KVM device seemed logical. The problem is, that we should really have a 2nd look at the Linux system call ABI. Does it have to follow the device model? especially if the interface has obvious draw backs. Christian -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html