When "/sys/devices/system/memory/soft_offline_page" is used to offline a
*free* page, mce_bad_pages is incremented and the page has its HWPoison flag
set, but the page is still managed by the buddy allocator.

The value is shown by:

$ cat /proc/meminfo | grep HardwareCorrupted

If the same page is offlined again, mce_bad_pages is incremented *again*,
which means the reported value is now incorrect. This assumes the page is
still free during this short time. The path taken is:

soft_offline_page()
	get_any_page()
		"else if (is_free_buddy_page(p))" branch, return 0
	"goto done";
	"atomic_long_add(1, &mce_bad_pages);"

Changelog:
V4:
	-use num_poisoned_pages instead of mce_bad_pages
	-remove page lock
V3:
	-add page lock when set HWPoison flag
	-adjust the function structure
V2 and V1:
	-fix the error

Xishi Qiu (3):
  move-poisoned-page-check-at-the-beginning-of-the-function
  fix-function-structure
  use-num_poisoned_pages-instead-of-mce_bad_pages

 fs/proc/meminfo.c   |    2 +-
 include/linux/mm.h  |    2 +-
 mm/memory-failure.c |   76 ++++++++++++++++++++++++++-------------------------
 3 files changed, 41 insertions(+), 39 deletions(-)
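The double counting described above can be illustrated with a small userspace model. This is only a sketch, not the kernel code: struct page_model, soft_offline_unchecked(), soft_offline_checked() and count_after_two_offlines() are hypothetical names made up for the illustration, a C11 atomic_long stands in for the kernel counter, and returning -EBUSY for an already poisoned page is an assumption rather than something stated in this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the poison flag is modelled. */
struct page_model {
	bool hwpoison;
};

/* Userspace stand-in for the kernel's poisoned-page counter. */
static atomic_long num_poisoned_pages;

/*
 * Models the path described in the cover letter: a free page is flagged
 * and the counter is bumped on every call, even if the page was already
 * poisoned by an earlier call.
 */
int soft_offline_unchecked(struct page_model *p)
{
	p->hwpoison = true;
	atomic_fetch_add(&num_poisoned_pages, 1);
	return 0;
}

/*
 * Models the idea of checking for an already poisoned page at the
 * beginning of the function. The -EBUSY return value is an assumption.
 */
int soft_offline_checked(struct page_model *p)
{
	if (p->hwpoison)
		return -EBUSY;

	p->hwpoison = true;
	atomic_fetch_add(&num_poisoned_pages, 1);
	return 0;
}

/* Offline the same (still free) page twice and report the counter. */
long count_after_two_offlines(int (*offline)(struct page_model *))
{
	struct page_model page = { .hwpoison = false };

	assert(offline != NULL);
	atomic_store(&num_poisoned_pages, 0);
	offline(&page);
	offline(&page);
	return atomic_load(&num_poisoned_pages);
}
```

In this model, count_after_two_offlines(soft_offline_unchecked) returns 2 for a single bad page, while count_after_two_offlines(soft_offline_checked) returns 1.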