[PATCH V4 0/3] MCE: fix an error of mce_bad_pages statistics

When "/sys/devices/system/memory/soft_offline_page" is used to offline a
*free* page, mce_bad_pages is incremented and the HWPoison flag is set on
the page, but the page is still managed by the buddy allocator.

The value is reported as HardwareCorrupted in /proc/meminfo:

$ grep HardwareCorrupted /proc/meminfo

If the same page is offlined a second time while it is still free,
mce_bad_pages is incremented *again*, so the reported value is now
incorrect. The code path is:

soft_offline_page()
	get_any_page()
		"else if (is_free_buddy_page(p))" branch return 0
			"goto done";
				"atomic_long_add(1, &mce_bad_pages);"
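
The double count can be illustrated with a small userspace model. This is
only a sketch and not the kernel code: struct model_page, model_bad_pages
and both model_soft_offline_*() functions are hypothetical names made up
for this example, and the -1 return value is an arbitrary placeholder. The
"checked" variant assumes the approach named by patch 1/3, i.e. testing
for an already poisoned page at the beginning of the function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_page {
	bool hwpoison;		/* stands in for the HWPoison page flag */
	bool free_buddy;	/* page is still held by the buddy allocator */
};

/* Stands in for the mce_bad_pages counter. */
static long model_bad_pages;

/*
 * Behaviour described above: for a free page the flag is set and the
 * counter is bumped on every call, even if the page is already poisoned.
 */
static int model_soft_offline_buggy(struct model_page *p)
{
	if (p->free_buddy) {
		p->hwpoison = true;
		model_bad_pages++;
		return 0;
	}
	return -1;	/* other cases are not modelled */
}

/*
 * Assumed shape of the fix: bail out early when the page is already
 * poisoned, so the counter changes at most once per page.
 */
static int model_soft_offline_checked(struct model_page *p)
{
	if (p->hwpoison)
		return -1;	/* placeholder error: already poisoned */

	if (p->free_buddy) {
		p->hwpoison = true;
		model_bad_pages++;
		return 0;
	}
	return -1;	/* other cases are not modelled */
}

/* Offline the same free page twice with both variants and compare. */
static void model_self_check(void)
{
	struct model_page a = { .hwpoison = false, .free_buddy = true };
	struct model_page b = { .hwpoison = false, .free_buddy = true };

	model_bad_pages = 0;
	model_soft_offline_buggy(&a);
	model_soft_offline_buggy(&a);
	assert(model_bad_pages == 2);	/* one page, counted twice */

	model_bad_pages = 0;
	model_soft_offline_checked(&b);
	model_soft_offline_checked(&b);
	assert(model_bad_pages == 1);	/* one page, counted once */
}
```

Calling model_self_check() from a test program exercises both variants:
the first ends with the counter at 2 for a single page, the second with
the counter at 1.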

Changelog:
V4:
	-use num_poisoned_pages instead of mce_bad_pages
	-remove page lock
V3:
	-add page lock when set HWPoison flag
	-adjust the function structure
V2 and V1:
	-fix the error

Xishi Qiu (3):
  move-poisoned-page-check-at-the-beginning-of-the-function
  fix-function-structure
  use-num_poisoned_pages-instead-of-mce_bad_pages

 fs/proc/meminfo.c   |    2 +-
 include/linux/mm.h  |    2 +-
 mm/memory-failure.c |   76 ++++++++++++++++++++++++++-------------------------
 3 files changed, 41 insertions(+), 39 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .