On Thu, Sep 08, 2011 at 11:33:09AM +0200, Patrick McHardy wrote:
> On 07.09.2011 22:03, Michał Mirosław wrote:
> > On Wed, Sep 07, 2011 at 05:22:00PM +0200, Patrick McHardy wrote:
> >> On 04.09.2011 18:18, Michał Mirosław wrote:
> >>> On Sat, Sep 03, 2011 at 07:26:08PM +0200, kaber@xxxxxxxxx wrote:
> >>>> From: Patrick McHardy <kaber@xxxxxxxxx>
> >>>>
> >>>> Add support for memory mapped sendmsg() to netlink. Userspace queues
> >>>> frames to be processed into the TX ring and invokes sendmsg with
> >>>> msg.iov.iov_base = NULL to trigger processing of all pending messages.
> >>>>
> >>>> Since the kernel usually performs full message validation before beginning
> >>>> processing, userspace must be prevented from modifying the message
> >>>> contents while the kernel is processing them. In order to do so, the
> >>>> frame contents are copied to an allocated skb in case the ring is
> >>>> mapped more than once or the file descriptor is shared (e.g. through
> >>>> AF_UNIX file descriptor passing).
> >>>>
> >>>> Otherwise an skb without a data area is allocated, the data pointer set
> >>>> to point to the data area of the ring frame and the skb is processed.
> >>>> Once the skb is freed, the destructor releases the frame back to userspace
> >>>> by setting the status to NL_MMAP_STATUS_UNUSED.
> >>>
> >>> Is this protected from threads? Like: one thread waits on sendmsg() and
> >>> another (same process) changes the buffer.
> >> Yes, if the ring is mapped multiple times (or the file descriptor
> >> is changed), the contents are copied to an allocated skb.
> >
> > I mean:
> >
> > [1] mmap()
> > [1] fill buffers
> > [1] pthread_create() [creates: 2]
> > [1] sendmsg() starts
> > [2] modify buffers
> > [1] sendmsg() returns
> >
> > So: no multiple mmaps, and no touching of the fd. I haven't dug into
> > filesystem layer to see if threads affect file->f_count, but there
> > sure are no multiple mappings here.
> If CLONE_VM is given to clone(), the mapping is visible in both
> threads and thus we have multiple mappings (vma_ops->open() is
> invoked through clone()). Without CLONE_VM, the second thread
> can't access the ring unless it mmap()s it itself, in which case we'd
> also have multiple mappings.

I took a quick look at kernel/fork.c, and it looks to me as if vma->open()
is actually avoided when CLONE_VM is set: it is called via dup_mm() and
dup_mmap() only if CLONE_VM is not set and the VMA needs to be copied.

Best Regards,
Michał Mirosław
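P.S. The branch in kernel/fork.c that I am describing can be modeled
roughly as below. This is a toy userspace model of my reading, not kernel
code: every toy_* name is hypothetical, the "mapped" counter is only an
assumption about how the ring could track its mappings, and TOY_CLONE_VM
merely mirrors the value of the real flag.

```c
#include <stdlib.h>

#define TOY_CLONE_VM 0x00000100 /* mirrors the value of the real CLONE_VM */

/* Hypothetical per-socket state: how many times the ring is mapped. */
struct toy_ring {
    int mapped;
};

struct toy_vma {
    struct toy_ring *ring;
};

struct toy_mm {
    int mm_users;        /* tasks sharing this address space */
    struct toy_vma vma;  /* the single VMA that maps the ring */
};

/* Stand-in for the VMA ->open() callback: one more mapping of the ring. */
static void toy_vma_open(struct toy_vma *vma)
{
    vma->ring->mapped++;
}

/* Stand-in for dup_mm()/dup_mmap(): copy the VMA, then open the copy. */
static struct toy_mm *toy_dup_mm(const struct toy_mm *oldmm)
{
    struct toy_mm *mm = malloc(sizeof(*mm));

    if (!mm)
        return NULL;
    mm->mm_users = 1;
    mm->vma = oldmm->vma;
    toy_vma_open(&mm->vma);
    return mm;
}

/* Stand-in for the CLONE_VM decision made at fork/clone time. */
static struct toy_mm *toy_copy_mm(unsigned long clone_flags,
                                  struct toy_mm *oldmm)
{
    if (clone_flags & TOY_CLONE_VM) {
        oldmm->mm_users++; /* share the mm: no VMA copy, no ->open() */
        return oldmm;
    }
    return toy_dup_mm(oldmm); /* separate mm: the VMA is copied and opened */
}
```

In this model, a thread created with TOY_CLONE_VM leaves "mapped" at 1, so
a check for multiple mappings would not notice it, while a fork-style copy
bumps it to 2.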