On Wed, Oct 14, 2015 at 10:50:40PM +0800, Xiao Guangrong wrote: > On 10/14/2015 05:40 PM, Stefan Hajnoczi wrote: > >On Sun, Oct 11, 2015 at 11:52:59AM +0800, Xiao Guangrong wrote: > >>+ out = (dsm_out *)in; > >>+ > >>+ revision = in->arg1; > >>+ function = in->arg2; > >>+ handle = in->handle; > >>+ le32_to_cpus(&revision); > >>+ le32_to_cpus(&function); > >>+ le32_to_cpus(&handle); > >>+ > >>+ nvdebug("UUID " UUID_FMT ".\n", in->arg0[0], in->arg0[1], in->arg0[2], > >>+ in->arg0[3], in->arg0[4], in->arg0[5], in->arg0[6], > >>+ in->arg0[7], in->arg0[8], in->arg0[9], in->arg0[10], > >>+ in->arg0[11], in->arg0[12], in->arg0[13], in->arg0[14], > >>+ in->arg0[15]); > >>+ nvdebug("Revision %#x Function %#x Handler %#x.\n", revision, function, > >>+ handle); > >>+ > >>+ if (revision != DSM_REVISION) { > >>+ nvdebug("Revision %#x is not supported, expect %#x.\n", > >>+ revision, DSM_REVISION); > >>+ goto exit; > >>+ } > >>+ > >>+ if (!handle) { > >>+ if (!dsm_is_root_uuid(in->arg0)) { > > > >Please don't dereference 'in' or pass it to other functions. Avoid race > >conditions with guest vcpus by coping in the entire dsm_in struct. > > > >This is like a system call - the kernel cannot trust userspace memory > >and must copy in before accessing data. The same rules apply. > > > > It's little different for QEMU: > - the memory address is always valid to QEMU, it's not always true for Kernel > due to context-switch > > - we have checked the header before use it's data, for example, when we get > data from GET_NAMESPACE_DATA, we have got the @offset and @length from the > memory, then copy memory based on these values, that means the userspace > has no chance to cause buffer overflow by increasing these values at runtime. > > The scenario for our case is simple but Kernel is difficult to do > check_all_before_use as many paths may be involved. > > - guest changes some data is okay, the worst case is that the label data is > corrupted. This is caused by guest itself. Kernel also supports this kind > of behaviour, e,g. network TX zero copy, the userspace page is being > transferred while userspace can still access it. > > - it's 4K size on x86, full copy wastes CPU time too much. This isn't performance-critical code and I don't want to review it keeping the race conditions in mind the whole time. Also, if the code is modified in the future, the chance of introducing a race is high. I see this as premature optimization, please just copy in data. Stefan -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html