Re: [PATCH 13/18] io_uring: add file set registration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 07, 2019 at 10:22:53AM +0100, Miklos Szeredi wrote:
> On Thu, Feb 07, 2019 at 04:00:59AM +0000, Al Viro wrote:
> 
> > So in theory it would be possible to have
> > 	* thread A: sendmsg() has SCM_RIGHTS created and populated,
> > complete with file refcount and ->inflight increments implied,
> > at which point it gets preempted and loses the timeslice.
> > 	* thread B: gets to run and removes all references
> > from descriptor table it shares with thread A.
> > 	* on another CPU we have garbage collector triggered;
> > it determines the set of potentially unreachable unix_sock and
> > everything in our SCM_RIGHTS _is_ in that set, now that no
> > other references remain.
> > 	* on the first CPU, thread A regains the timeslice
> > and inserts its SCM_RIGHTS into queue.  And it does contain
> > references to sockets from the candidate set of running
> > garbage collector, confusing the hell out of it.
> 
> Reminds me: long time ago there was a bug report, and based on that I found a
> bug in MSG_PEEK handling (not confirmed to have fixed the reported bug).  This
> fix, although pretty simple, got lost somehow.  While unix gc code is in your
> head, can you please review and I'll resend through davem?

Umm...  I think the bug is real (something that looks like an eviction
candidate, but actually is referenced from the reachable queue might
get peeked via that queue, then have _its_ queue modified via new
external reference, all between two passes over that queue, confusing the
fuck out of unix_gc()), but I think the fix is an overkill...

Am I right assuming that this queue-modifying operation is accept(), removing
an embryo unix_sock from the queue of listener and thus hiding SCM_RIGHTS in
_its_ queue from scan_children()?

Let me think of it a bit, OK?  While we are at it, some questions from digging
through the current net/unix/garbage.c:
	1) is there any need for ->inflight to be atomic?  All accesses are under
unix_gc_lock, after all...
	2) pumping unix_gc on each sodding reference in SCM_RIGHTS (within
unix_notinflight()/unix_inflight()) looks atrocious... wouldn't it be better to
hold it over that loop?
	3) unix_get_socket() probably ought to be static nowadays...
	4) I wonder if in scan_inflight()/scan_children() we would be better
off with explicit switch (by enum argument) instead of an indirect call.
	5) do we really need UNIX_GC_MAYBE_CYCLE?  Looks like the only
real use is in inc_inflight_move_tail(), and AFAICS it could bloody well
have been
	u->inflight++;
	if (u->inflight == 1)	// just got from zero to non-zero
		list_move_tail(&u->link, &gc_candidates);
The logics there is "we'd found a reference to something that still was
a candidate for eviction in a reachable SCM_RIGHTS, so it's actually
reachable and needs to be scanned (and removed from the set of candidates);
move to the end of list, so that the main loop gets around to it".
If it *was* past the cursor in the list, there's no need to move it; if
we got past it, it must've had zero ->inflight (or we would've removed
it from the set back when we got past it).  Note that it's only called if
UNIX_GC_CANDIDATE is set (i.e. if it's in the initial candidate set),
so for this one ->inflight is guaranteed to mean the number of SCM_RIGHTS
refs from outside of the current candidate set...
	6) unix_get_socket() looks like it might benefit from another
FMODE bit; not sure where it's best set, though - the obvious way would
be SOCK_GC_CARES in sock->flags, set by e.g. unix_create1(), with
sock_alloc_file() propagating it into file->f_mode.  Then unix_get_socket()
would be able to bugger off with NULL for most of the references in
SCM_RIGHTS, without looking into inode...

Comments?  IIRC, you'd done the last serious round of rewriting unix_gc()
and friends...



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux