Re: [PATCH v6 02/15] memory-hotplug: check whether all memory blocks are offlined or not when removing memory

Tang Chen <tangchen@xxxxxxxxxxxxxx> · Thu, 10 Jan 2013 13:56:07 +0800

Hi Andrew,

On 01/10/2013 07:11 AM, Andrew Morton wrote:
> On Wed, 9 Jan 2013 17:32:26 +0800
> Tang Chen <tangchen@xxxxxxxxxxxxxx> wrote:
>
>> We remove the memory like this:
>> 1. lock memory hotplug
>> 2. offline a memory block
>> 3. unlock memory hotplug
>> 4. repeat 1-3 to offline all memory blocks
>> 5. lock memory hotplug
>> 6. remove memory (TODO)
>> 7. unlock memory hotplug
>>
>> All memory blocks must be offlined before removing memory. But we don't
>> hold the lock across the whole operation. So we should check whether all
>> memory blocks are offlined before step 6. Otherwise, the kernel may panic.
>
> Well, the obvious question is: why don't we hold lock_memory_hotplug()
> for all of steps 1-4?  Please send the reasons for this in a form which
> I can paste into the changelog.

In changelog form:

Offlining a memory block and removing a memory device can be two
different operations. Users can offline some memory blocks without
removing the memory device. To support this, the kernel already takes
lock_memory_hotplug() in __offline_pages(). To reuse that code for
memory hot-remove, we repeat steps 1-3 to offline all the memory
blocks, locking and unlocking memory hotplug each time, rather than
holding the memory hotplug lock across the whole operation.
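
To make the window concrete, here is a rough userspace model of the
sequence. It is only a sketch, not the patch code: every model_* name,
the blk_state enum and the "between" hook are made up for illustration,
and a plain flag stands in for the memory hotplug lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these are the kernel's real definitions. */
enum blk_state { BLK_ONLINE, BLK_OFFLINE };

struct mem_block {
	enum blk_state state;
};

static bool hotplug_locked;	/* stands in for the memory hotplug lock */

static void model_lock_hotplug(void)
{
	assert(!hotplug_locked);	/* single-threaded model: no nesting */
	hotplug_locked = true;
}

static void model_unlock_hotplug(void)
{
	assert(hotplug_locked);
	hotplug_locked = false;
}

/* Steps 1-3: offline one block, taking and dropping the lock around it. */
static void model_offline_block(struct mem_block *blk)
{
	model_lock_hotplug();
	blk->state = BLK_OFFLINE;
	model_unlock_hotplug();
}

/*
 * Steps 4-7. Returns 0 if every block is still offline at step 5,
 * or -1 if one was found online again, in which case nothing is removed.
 * "between" is a hypothetical hook that runs while the lock is not held.
 */
static int model_remove_memory(struct mem_block *blks, size_t n,
			       void (*between)(struct mem_block *, size_t))
{
	size_t i;
	int ret = 0;

	for (i = 0; i < n; i++)		/* step 4: repeat steps 1-3 */
		model_offline_block(&blks[i]);

	if (between)			/* window with the lock dropped */
		between(blks, n);

	model_lock_hotplug();		/* step 5 */
	for (i = 0; i < n; i++) {
		if (blks[i].state != BLK_OFFLINE) {
			ret = -1;	/* refuse to remove */
			break;
		}
	}
	/* step 6 (the actual removal) would go here when ret == 0 */
	model_unlock_hotplug();		/* step 7 */
	return ret;
}

/* Hypothetical interloper: brings block 0 back online in the window. */
static void model_reonline_first(struct mem_block *blks, size_t n)
{
	if (n > 0) {
		model_lock_hotplug();
		blks[0].state = BLK_ONLINE;
		model_unlock_hotplug();
	}
}
```

Passing model_reonline_first as the hook makes model_remove_memory()
return -1, which is the case the check before step 6 is meant to catch.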

> Actually, I wonder if doing this would fix a race in the current
> remove_memory() repeat: loop.  That code does a
> find_memory_block_hinted() followed by offline_memory_block(), but
> afaict find_memory_block_hinted() only does a get_device().  Is the
> get_device() sufficiently strong to prevent problems if another thread
> concurrently offlines or otherwise alters this memory_block's state?

I think we already have memory_block->state_mutex to protect against
concurrent changes to the memory_block's state.
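
Roughly, the idea is that a state change is done under the per-block
mutex and is rejected if the block is not in the state the caller
expects. Below is a userspace sketch of that pattern only; the
model_* names are made up here, a pthread mutex stands in for
state_mutex, and this is not the code in drivers/base/memory.c.

```c
#include <errno.h>
#include <pthread.h>

/* Userspace model only; not the kernel's struct memory_block. */
enum model_state { MODEL_MEM_ONLINE, MODEL_MEM_OFFLINE };

struct model_memory_block {
	pthread_mutex_t state_mutex;	/* serializes state changes */
	enum model_state state;
};

/*
 * Move the block from from_req to to. The check and the update happen
 * under state_mutex, so a second caller that lost the race sees the new
 * state and gets -EINVAL instead of changing the state again.
 */
static int model_change_state(struct model_memory_block *mem,
			      enum model_state to, enum model_state from_req)
{
	int ret = 0;

	pthread_mutex_lock(&mem->state_mutex);
	if (mem->state != from_req)
		ret = -EINVAL;
	else
		mem->state = to;
	pthread_mutex_unlock(&mem->state_mutex);

	return ret;
}
```

In this model, offlining a block that is already offline fails with
-EINVAL rather than touching its state a second time.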

The find_memory_block_hinted() here is to find the memory_block
corresponding to the memory section we are dealing with.

Thanks. :)
