On Thu, Mar 27, 2014 at 1:08 PM, Benjamin LaHaise <bcrl@xxxxxxxxx> wrote:
>
> The patch below is lightly tested -- my migration test case is able to
> successfully move the aio ring around multiple times.  It still needs to
> be run through some more thorough tests (like Trinity).  See below for
> the differences relative to your patch.

Ok, from a quick glance-through this fixes my big complaints (not
surprisingly, similarly to my patch), and seems to fix a few of the
smaller ones that I didn't bother with.

However, I think you missed the mutex_unlock() in the error paths of
ioctx_alloc().

> What I did instead is to just hold mapping->private_lock over the entire
> operation of aio_migratepage.  That gets rid of the probably broken attempt
> to take a reference count on the kioctx within aio_migratepage(), and makes
> it completely clear that migration won't touch a kioctx after
> put_aio_ring_file() returns.  It also cleans up much of the error handling
> in aio_migratepage() since things cannot change unexpectedly.

Yes, that looks simpler. I don't know what the latency implications are,
but the expensive part (the actual page migration) was and continues to
be under the completion lock with interrupts disabled, so I guess it's
not any worse.

It would be good to try to get rid of the completion-lock-with-irqs-off
thing, but that looks complex. It would likely require a two-phase
migration model, where phase one is "unmap page from user space and copy
it to the new page", and phase two is "insert new page into the mapping".
Then we could have just an "update the tail pointer and the kernel
mapping under the completion lock" step with interrupts disabled in
between.

But that's a much bigger change and would require help from the
migration people. Maybe we don't care about latency here.

> I also added a few comments to help explain the locking.
>
> How does this version look?

Looks ok, except for the error handling wrt mutex_unlock(). Did I miss it?

               Linus
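
To make the mutex_unlock() concern concrete, here is a minimal sketch of
the usual error-path pattern (simplified and hypothetical; the mutex and
the setup_ring()/register_ctx() helpers are placeholders, not the actual
fs/aio.c identifiers):

	/*
	 * Minimal sketch only: a simplified ioctx_alloc()-like function
	 * showing the pattern in question -- every error path taken after
	 * the mutex is acquired has to drop it again.  The mutex and the
	 * setup_ring()/register_ctx() helpers are placeholders, not the
	 * real fs/aio.c code.
	 */
	static DEFINE_MUTEX(aio_setup_mutex);

	static struct kioctx *ioctx_alloc_sketch(unsigned int nr_events)
	{
		struct kioctx *ctx;
		int err;

		ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
		if (!ctx)
			return ERR_PTR(-ENOMEM);

		mutex_lock(&aio_setup_mutex);

		err = setup_ring(ctx, nr_events);
		if (err)
			goto err_unlock;	/* must not leak the mutex */

		err = register_ctx(ctx);
		if (err)
			goto err_unlock;	/* same for every later failure */

		mutex_unlock(&aio_setup_mutex);
		return ctx;

	err_unlock:
		mutex_unlock(&aio_setup_mutex);
		kfree(ctx);
		return ERR_PTR(err);
	}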
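
A rough sketch of the locking shape Ben describes for aio_migratepage()
-- mapping->private_lock held across the whole operation, with the page
swap done under completion_lock with interrupts off (simplified;
do_migrate_ring_page() is a placeholder, this is not the real patch):

	/*
	 * Sketch of the locking structure described above, not the actual
	 * patch: mapping->private_lock is held over the entire migration so
	 * the kioctx cannot be torn down underneath us (put_aio_ring_file()
	 * would take the same lock before clearing ->private_data), and the
	 * page swap itself happens under completion_lock with interrupts
	 * off.  do_migrate_ring_page() stands in for the real migration
	 * work.
	 */
	static int aio_migratepage_sketch(struct address_space *mapping,
					  struct page *new, struct page *old)
	{
		struct kioctx *ctx;
		unsigned long flags;
		pgoff_t idx = old->index;
		int rc = -EINVAL;

		spin_lock(&mapping->private_lock);
		ctx = mapping->private_data;
		if (!ctx)
			goto out;

		/* Still the expensive part, still with irqs disabled. */
		spin_lock_irqsave(&ctx->completion_lock, flags);
		rc = do_migrate_ring_page(ctx, new, old, idx);
		if (!rc)
			ctx->ring_pages[idx] = new;
		spin_unlock_irqrestore(&ctx->completion_lock, flags);
	out:
		spin_unlock(&mapping->private_lock);
		return rc;
	}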
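
And, purely to illustrate the two-phase idea above, a hypothetical
sketch -- the unmap/insert helpers do not exist and this is not how page
migration is structured today; it only shows how small the irq-disabled
section could become:

	/*
	 * Hypothetical two-phase migration, as speculated above.  The
	 * unmap_ring_page_from_userspace() and insert_new_page_into_mapping()
	 * helpers are imaginary; only the copy and the pointer swap are
	 * meant literally.
	 */
	static void two_phase_migrate_sketch(struct address_space *mapping,
					     struct kioctx *ctx,
					     struct page *new, struct page *old)
	{
		unsigned long flags;

		/* Phase one: unmap from user space and copy, no irq-off section. */
		unmap_ring_page_from_userspace(ctx, old);
		copy_highpage(new, old);

		/* Only the tail pointer / kernel mapping update under the lock. */
		spin_lock_irqsave(&ctx->completion_lock, flags);
		ctx->ring_pages[old->index] = new;
		spin_unlock_irqrestore(&ctx->completion_lock, flags);

		/* Phase two: make the new page visible in the mapping. */
		insert_new_page_into_mapping(mapping, new, old);
	}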