Re: [PATCH V2] MCE: fix an error of mce_bad_pages statistics

>>> Hi Simon,

>>>
>>> If we use "/sys/devices/system/memory/soft_offline_page" to offline a
>>> free page, mce_bad_pages is incremented. The page is then marked
>>> HWPoison, but it is still managed by the page buddy allocator.
>>>
>>> So if we offline it again, mce_bad_pages is incremented a second time,
>>> assuming the page is not allocated in the short time in between.
>>>
>>> soft_offline_page()
>>> 	get_any_page()
>>> 		"else if (is_free_buddy_page(p))" branch return 0
>>> 			"goto done";
>>> 				"atomic_long_add(1, &mce_bad_pages);"
>>>
>>> I think it would be better to move "if (PageHWPoison(page))" to the beginning
>>> of soft_offline_page(). However, I don't know what these words mean:
>>> "Synchronized using the page lock with memory_failure()"
> 
> Hi Xishi,
> 
> Unpoison clears the PG_hwpoison flag only after taking the page lock;
> memory_failure() and soft_offline_page() take the same lock so that unpoison
> cannot clear the flag behind their backs.
> 
> Regards,
> Wanpeng Li 
> 

Hi Wanpeng,

So you mean that we must take the page lock every time before we check the
HWPoison flag, in order to avoid conflicts, right?
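
To check my understanding, this is how I picture it, as a user-space sketch
only. The per-page lock is modelled with a C11 atomic_flag spinlock, and
struct locked_page, locked_bad_pages and the model_*() helpers are all
hypothetical names, not the real kernel interfaces.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: one lock per page, as with the kernel's page lock. */
struct locked_page {
	atomic_flag lock; /* stands in for the page lock */
	bool hwpoison;    /* stands in for PG_hwpoison */
};

static atomic_long locked_bad_pages; /* stands in for mce_bad_pages */

static void model_lock_page(struct locked_page *p)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		;
}

static void model_unlock_page(struct locked_page *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Poison path: the flag is tested and set while the lock is held. */
static bool model_poison(struct locked_page *p)
{
	bool newly_poisoned = false;

	model_lock_page(p);
	if (!p->hwpoison) {
		p->hwpoison = true;
		atomic_fetch_add(&locked_bad_pages, 1);
		newly_poisoned = true;
	}
	model_unlock_page(p);
	return newly_poisoned;
}

/* Unpoison path: takes the same lock, so it cannot race with the check. */
static bool model_unpoison(struct locked_page *p)
{
	bool cleared = false;

	model_lock_page(p);
	if (p->hwpoison) {
		p->hwpoison = false;
		atomic_fetch_sub(&locked_bad_pages, 1);
		cleared = true;
	}
	model_unlock_page(p);
	return cleared;
}
```

A page would be set up as "struct locked_page p = { ATOMIC_FLAG_INIT, false };".
Because both paths hold the same per-page lock around the flag test, a second
model_poison() on that page returns false and the counter stays at 1.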

Then why not use a global lock here? For example, lock_memory_hotplug() is
used in online_pages() and offline_pages().

Thanks,
Xishi Qiu
