On Wed, May 27, 2020 at 02:43:49PM -0700, Kees Cook wrote:
> (While I'm here -- why can there be only one listener per task? The
> notifications are filter-specific, not task-specific?)

Not sure what you mean here?

> > To fix this, we introduce a new "live" reference counter that tracks the
> > live tasks making use of a given filter and when a notifier is
> > registered waiting tasks will be notified that the filter is now empty
> > by receiving a (E)POLLHUP event.
> > The concept this patch introduces is the same as for signal_struct,
> > i.e. reference counting for life-cycle management is decoupled from
> > reference counting live tasks using the object.
>
> I will need convincing that life-cycle ref-counting needs to be decoupled
> from usage ref-counting.

I think it does, since the refcount is no longer 1:1 with the number
of tasks that have it (a notification fd's struct file has a reference
too).

We could also do it the reverse way, and keep track of how many
notification fds point to a particular file. But somehow we need two
counts.

Maybe it's best to decouple them entirely, and have usage go back to
just being the number of tasks, and introduce a new counter for
notification fds.

> I see what you're saying here and in the other
> reply about where the notification is coming from (release vs put, etc),
> but I think it'd be better if the EPOLLHUP was handled internally to the
> VFS due to the kernel end of the file being closed.
>
> > There's probably some trickery possible but the second counter is just
> > the correct way of doing this imho and has precedent. The patch also
> > lifts the waitqueue from struct notification into struct
> > seccomp_filter. This is cleaner overall and lets us avoid having to
> > take the notifier mutex since we neither need to read nor modify the
> > notifier specific aspects of the seccomp filter.
> > In the exit path I'd
> > very much like to avoid having to take the notifier mutex for each
> > filter in the task's filter hierarchy.
>
> I guess this is a minor size/speed trade-off (every seccomp_filter
> struct grows by 1 pointer regardless of the presence of USER_NOTIF
> rules attached...). But I think this is an optimization detail, and I
> need to understand why we can't just close the file on filter free.

That seems nicest, agreed.

Tycho
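
For reference, the two-counter idea discussed above can be sketched in
userspace roughly as follows. This is only an illustration under
assumptions: struct filter_model and every function name here are
hypothetical and do not match the kernel's seccomp code, a plain bool
stands in for waking pollers with EPOLLHUP, and a global counter stands
in for observing the free. "refs" is the life-cycle count (tasks plus
notification fds), "users" is the usage count (live tasks only).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model; names do not match the kernel's seccomp code. */
struct filter_model {
	atomic_int refs;   /* life-cycle count: tasks plus notification fds */
	atomic_int users;  /* usage count: live tasks only */
	bool hup;          /* stands in for waking pollers with EPOLLHUP */
};

static int filters_freed; /* lets a caller observe when free() ran */

static struct filter_model *filter_new(void)
{
	struct filter_model *f = calloc(1, sizeof(*f));

	if (!f)
		return NULL;
	atomic_init(&f->refs, 0);
	atomic_init(&f->users, 0);
	return f;
}

/* Drop one life-cycle reference; free the object when the last one goes. */
static void filter_put(struct filter_model *f)
{
	int old = atomic_fetch_sub(&f->refs, 1);

	assert(old > 0);
	if (old == 1) {
		free(f);
		filters_freed++;
	}
}

/* A task starts using the filter: it holds both kinds of reference. */
static void task_attach(struct filter_model *f)
{
	atomic_fetch_add(&f->refs, 1);
	atomic_fetch_add(&f->users, 1);
}

/* A notification fd pins the memory but is not a user of the filter. */
static void listener_open(struct filter_model *f)
{
	atomic_fetch_add(&f->refs, 1);
}

static void listener_close(struct filter_model *f)
{
	filter_put(f);
}

/* Task exit: the last user going away is what triggers the hangup. */
static void task_exit(struct filter_model *f)
{
	if (atomic_fetch_sub(&f->users, 1) == 1)
		f->hup = true;
	filter_put(f);
}
```

With two tasks attached and one listener open, the hangup flag is set
when the second task exits, while the memory stays around until the
listener is closed. That is the case where the two counts diverge.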