Hi Benjamin, Mel,
Please see below.
On 05/14/2013 09:58 PM, Benjamin LaHaise wrote:
On Tue, May 14, 2013 at 09:24:58AM +0800, Tang Chen wrote:
Hi Mel, Benjamin, Jeff,
On 05/13/2013 11:01 PM, Benjamin LaHaise wrote:
On Mon, May 13, 2013 at 10:54:03AM -0400, Jeff Moyer wrote:
How do you propose to move the ring pages?
It's the same problem as doing a TLB shootdown: flush the old pages from
userspace's mapping, copy any existing data to the new pages, then
repopulate the page tables. It will likely require the addition of
address_space_operations for the mapping, but that's not too hard to do.
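
As a rough illustration of the three steps above (remove the old page from the mapping, copy the data, repopulate the entry), here is a toy userspace model. It is only a sketch: struct toy_mm, struct pte as laid out here, and toy_migrate_slot() are made-up names, not kernel interfaces, and a real implementation would also have to flush TLBs and deal with concurrent access.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096
#define TOY_NR_SLOTS  4

/* Toy page-table entry (hypothetical): a backing page plus a present bit. */
struct pte {
    unsigned char *page;
    int present;
};

/* Toy address space (hypothetical): one entry per virtual page slot. */
struct toy_mm {
    struct pte slots[TOY_NR_SLOTS];
};

/*
 * Move the page behind `slot` to a freshly allocated page.
 * Returns 0 on success, -1 if the slot is invalid, empty, or
 * the new page cannot be allocated.
 */
static int toy_migrate_slot(struct toy_mm *mm, int slot)
{
    struct pte *pte;
    unsigned char *old_page, *new_page;

    assert(mm != NULL);
    if (slot < 0 || slot >= TOY_NR_SLOTS)
        return -1;

    pte = &mm->slots[slot];
    if (!pte->present)
        return -1;

    new_page = malloc(TOY_PAGE_SIZE);
    if (!new_page)
        return -1;
    old_page = pte->page;

    /* Step 1: take the old page out of the mapping. */
    pte->present = 0;

    /* Step 2: copy the existing data to the new page. */
    memcpy(new_page, old_page, TOY_PAGE_SIZE);

    /* Step 3: repopulate the entry so it points at the new page. */
    pte->page = new_page;
    pte->present = 1;

    free(old_page);
    return 0;
}
```
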
I think we can add a migrate_unpin() callback to decrease page->count if
necessary, migrate the page to a new page, and add a migrate_pin() callback
to pin the new page again.
You can't just decrease the page count for this to work. The pages are
pinned because aio_complete() can occur at any time and needs to have a
place to write the completion events. When swapping one page for another,
aio has to take the appropriate lock.
In aio_complete():

aio_complete() {
        ......
        spin_lock_irqsave(&ctx->completion_lock, flags);
        /* write the completion event */
        spin_unlock_irqrestore(&ctx->completion_lock, flags);
        ......
}
So for this problem, I think we can hold ctx->completion_lock in the aio
callbacks to prevent the aio subsystem from accessing pages that are being
migrated. The migration procedure will work just as before: we use the
callbacks to decrease page->count before migration starts, and increase it
once the migration is done.
And migrate_pin() and migrate_unpin() callbacks will be added to
struct address_space_operations.
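
To make the idea concrete, here is a small userspace sketch of it, under several assumptions. A pthread mutex stands in for ctx->completion_lock (in the kernel it is a spinlock taken with spin_lock_irqsave()), struct ring_page and struct ring_ctx are toy stand-ins for the real structures, and ring_migrate_unpin() / ring_migrate_pin() are hypothetical versions of the proposed callbacks. It is not the actual aio code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define RING_PAGE_SIZE 4096

/* Toy stand-in for a ring page; `count` models page->count. */
struct ring_page {
    int count;
    unsigned char data[RING_PAGE_SIZE];
};

/* Toy stand-in for the aio context. */
struct ring_ctx {
    pthread_mutex_t completion_lock; /* models ctx->completion_lock */
    struct ring_page *page;          /* the pinned ring page */
    size_t tail;                     /* index of the next event slot */
};

static int ring_ctx_init(struct ring_ctx *ctx)
{
    ctx->page = calloc(1, sizeof(*ctx->page));
    if (!ctx->page)
        return -1;
    ctx->page->count = 1;            /* pinned by the ring */
    ctx->tail = 0;
    return pthread_mutex_init(&ctx->completion_lock, NULL);
}

/* Models aio_complete(): write one event while holding the lock. */
static void ring_complete(struct ring_ctx *ctx, unsigned char event)
{
    pthread_mutex_lock(&ctx->completion_lock);
    ctx->page->data[ctx->tail % RING_PAGE_SIZE] = event;
    ctx->tail++;
    pthread_mutex_unlock(&ctx->completion_lock);
}

/* Hypothetical callbacks; the caller must hold completion_lock. */
static void ring_migrate_unpin(struct ring_page *page) { page->count--; }
static void ring_migrate_pin(struct ring_page *page)   { page->count++; }

/* Swap the ring page for a new one while completions are locked out. */
static int ring_migrate(struct ring_ctx *ctx)
{
    struct ring_page *new_page = calloc(1, sizeof(*new_page));
    struct ring_page *old_page;

    if (!new_page)
        return -1;

    pthread_mutex_lock(&ctx->completion_lock);
    old_page = ctx->page;
    ring_migrate_unpin(old_page);                    /* drop the old pin */
    memcpy(new_page->data, old_page->data, RING_PAGE_SIZE);
    ctx->page = new_page;                            /* publish new page */
    ring_migrate_pin(new_page);                      /* pin it again */
    pthread_mutex_unlock(&ctx->completion_lock);

    assert(old_page->count == 0);                    /* nobody else holds it */
    free(old_page);
    return 0;
}
```

Because ring_complete() and ring_migrate() take the same lock, a completion either lands in the old page before the copy or in the new page after the swap, never in a page that has already been given up.
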
I think the existing migratepage operation in address_space_operations can
be used. Does it get called when hot unplug occurs? That is: is testing
with the migrate_pages syscall similar enough to the memory removal case?
But as I said, anonymous pages such as the aio ring buffer don't have
address_space_operations. So where should we put the callback pointers?
Should we add something like address_space_operations to struct anon_vma?
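
For discussion, one possible shape for such a table is sketched below as a userspace toy. Every name in it (struct toy_anon_migrate_ops, struct toy_anon_owner, toy_migrate_with_ops()) is hypothetical, and nothing like this exists in struct anon_vma today. It only shows the usual ops-table pattern: an optional pointer to a table of callbacks, checked for NULL before use.

```c
#include <assert.h>
#include <stddef.h>

/* Toy page; `count` models page->count. */
struct toy_page {
    int count;
};

/* Hypothetical ops table, loosely modeled on address_space_operations. */
struct toy_anon_migrate_ops {
    void (*migrate_unpin)(struct toy_page *page);
    void (*migrate_pin)(struct toy_page *page);
};

/* Toy stand-in for whatever object would own the anonymous pages. */
struct toy_anon_owner {
    const struct toy_anon_migrate_ops *ops; /* NULL: no subsystem pin */
};

static void toy_unpin(struct toy_page *page) { page->count--; }
static void toy_pin(struct toy_page *page)   { page->count++; }

/* Ops a subsystem such as aio might register for its ring pages. */
static const struct toy_anon_migrate_ops toy_ring_ops = {
    .migrate_unpin = toy_unpin,
    .migrate_pin   = toy_pin,
};

/*
 * Run the callbacks around a (simulated) migration.
 * Returns 1 if the callbacks ran, 0 if none were registered.
 */
static int toy_migrate_with_ops(const struct toy_anon_owner *owner,
                                struct toy_page *old_page,
                                struct toy_page *new_page)
{
    const struct toy_anon_migrate_ops *ops;

    assert(owner != NULL);
    ops = owner->ops;
    if (!ops || !ops->migrate_unpin || !ops->migrate_pin)
        return 0;

    ops->migrate_unpin(old_page);
    /* The page contents would be copied here. */
    ops->migrate_pin(new_page);
    return 1;
}
```
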
Thanks. :)