Re: About the try to remove cross-release feature entirely by Ingo

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Mon, 1 Jan 2018 02:18:55 -0800

On Sat, Dec 30, 2017 at 06:00:57PM -0500, Theodore Ts'o wrote:
> On Sat, Dec 30, 2017 at 05:40:28PM -0500, Theodore Ts'o wrote:
> > On Sat, Dec 30, 2017 at 12:44:17PM -0800, Matthew Wilcox wrote:
> > > 
> > > I'm not sure I agree with this part.  What if we add a new TCP lock class
> > > for connections which are used for filesystems/network block devices/...?
> > > Yes, it'll be up to each user to set the lockdep classification correctly,
> > > but that's a relatively small number of places to add annotations,
> > > and I don't see why it wouldn't work.
> > 
> > I was exagerrating a bit for effect, I admit.  (but only a bit).

I feel like there's been rather too much of that recently.  Can we stick
to facts as far as possible, please?

> > It can probably be for all TCP connections that are used by kernel
> > code (as opposed to userspace-only TCP connections).  But it would
> > probably have to be each and every device-mapper instance, each and
> > every block device, each and every mounted file system, each and every
> > bdi object, etc.
> 
> Clarification: all TCP connections that are used by kernel code would
> need to be in their own separate lock class.  All TCP connections used
> only by userspace could be in their own shared lock class.  You can't
> use a one lock class for all kernel-used TCP connections, because of
> the Network Block Device mounted on a local file system which is then
> exported via NFS and squirted out yet another TCP connection problem.

So the false positive you're concerned about is write-comes-in-over-NFS
(with socket lock held), NFS sends a write request to local filesystem,
local filesystem sends write to block device, block device sends a
packet to a socket which takes that socket lock.

I don't think we need to be as drastic as giving each socket its own lock
class to solve this.  All NFS sockets can be in lock class A; all NBD
sockets can be in lock class B; all user sockets can be in lock class
C; etc.

> Also, what to do with TCP connections which are created in userspace
> (with some authentication exchanges happening in userspace), and then
> passed into kernel space for use in kernel space, is an interesting
> question.

Yes!  I'd love to have a lockdep expert weigh in here.  I believe it's
legitimate to change a lock's class after it's been used, essentially
destroying it and reinitialising it.  If not, it should be because it's
a reasonable design for an object to need different lock classes for
different phases of its existance.

> So "all you have to do is classify the locks 'properly'" is much like
> the apocrophal, "all you have to do is bell the cat"[1].  Or like the
> saying, "colonizing the stars is *easy*; all you have to do is figure
> out faster than light travel."

This is only computer programming, not rocket surgery :-)