On 6/19/23 8:40 AM, Stefan Metzmacher wrote: > Am 19.06.23 um 16:38 schrieb Jens Axboe: >> On 6/19/23 7:27?AM, Stefan Metzmacher wrote: >>> Am 19.06.23 um 15:09 schrieb Stefan Metzmacher: >>>> Am 19.06.23 um 15:05 schrieb Jens Axboe: >>>>> On 6/19/23 3:57?AM, Stefan Metzmacher wrote: >>>>>> Hi Jens, >>>>>> >>>>>>> If the application sets ->msg_control and we have to later retry this >>>>>>> command, or if it got queued with IOSQE_ASYNC to begin with, then we >>>>>>> need to retain the original msg_control value. This is due to the net >>>>>>> stack overwriting this field with an in-kernel pointer, to copy it >>>>>>> in. Hitting that path for the second time will now fail the copy from >>>>>>> user, as it's attempting to copy from a non-user address. >>>>>> >>>>>> I'm not 100% sure about the impact of this change. >>>>>> >>>>>> But I think the logic we need is that only the >>>>>> first __sys_sendmsg_sock() that returns > 0 should >>>>>> see msg_control. A retry because of MSG_WAITALL should >>>>>> clear msg_control[len] for a follow up __sys_sendmsg_sock(). >>>>>> And I fear the patch below would not clear it... >>>>>> >>>>>> Otherwise the receiver/socket-layer will get the same msg_control twice, >>>>>> which is unexpected. >>>>> >>>>> Yes agree, if we do transfer some (but not all) data and WAITALL is set, >>>>> it should get cleared. I'll post a patch for that. >>>> >>>> Thanks! >>>> >>>>> Note that it was also broken before, just differently broken. The most >>>>> likely outcome here was a full retry and now getting -EFAULT. >>>> >>>> Yes, I can see that it was broken before... >>> >>> I haven't checked myself, but I'm wondering about the recvmsg case, >>> I guess we would need to advance the msg_control buffer after each >>> iteration, in order to avoid overwritting the already received messages >>> on retry. >>> >>> This all gets complicated with things like MSG_CTRUNC. >>> >>> I guess it's too late to reject MSG_WAITALL together with msg_control >>> for io_recvmsg() because of compat reasons, >>> but as MSG_WAITALL is also processed in the socket layer, we could keep it >>> simple for now and skip the this retry logic: >>> >>> if (flags & MSG_WAITALL) >>> min_ret = iov_iter_count(&kmsg->msg.msg_iter); >>> >>> This might become something similar to this, >>> but likely more complex, as would need to record kmsg->controllen == 0 >>> condition already in io_recvmsg_prep: >>> >>> if (flags & MSG_WAITALL && kmsg->controllen == 0) >>> min_ret = iov_iter_count(&kmsg->msg.msg_iter); >> >> Yep agree, I think this is the best way - ensure that once we transfer >> data with cmsg, it's a one-shot kind of deal. >> >> Do you want to cut a patch for that one? > > No, sorry I'm busy with other stuff and not able to to do any testing... OK that's fine, I'll post both. -- Jens Axboe