On Tue, Jun 08, 2021 at 10:24:50AM +0800, wangbin wrote:
> > What specific problem are you trying to solve?  Are trying to see how
> > many huge pages were hit by memory errors?
>
> Yes, I'd like to know how many huge pages are not available because of
> the memory errors. Just like HardwareCorrupted in the /proc/meminfo.
> But the HardwareCorrupted only adds one page size when a huge page is
> hit by memory errors, and mixes with normal pages. So I think we should
> add a new counts to track the memory errors on hugetlbfs.

If you can use root privilege in your use case, an easy way to get the
number of corrupted hugepages is to use page-types.c (which reads
/proc/kpageflags) like below:

    $ page-types -b huge,hwpoison=huge,hwpoison
                 flags  page-count       MB  symbolic-flags                               long-symbolic-flags
    0x00000000000a8000           1        0  _______________H_G_X_______________________  compound_head,huge,hwpoison
                 total           1        0

But I guess that many use cases do not permit access to this interface,
so some new accounting interface for corrupted hugepages could be
helpful, as you suggest.

Thanks,
Naoya Horiguchi
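
For reference, the page-types filter above can be approximated by reading
/proc/kpageflags directly. The following is only a minimal sketch, not how
page-types.c itself is written: the function name count_poisoned_huge is
hypothetical, and the bit positions (KPF_HUGE = 17, KPF_HWPOISON = 19) are
taken from include/uapi/linux/kernel-page-flags.h, which agrees with the
0xa8000 flags value shown in the output above. It assumes one native-endian
64-bit flags word per page frame.

```python
import struct

# Bit positions as defined in include/uapi/linux/kernel-page-flags.h.
KPF_HUGE = 17
KPF_HWPOISON = 19
MASK = (1 << KPF_HUGE) | (1 << KPF_HWPOISON)


def count_poisoned_huge(stream, chunk_entries=65536):
    """Count page frames whose flags have both huge and hwpoison set.

    `stream` is any binary file-like object laid out like
    /proc/kpageflags: one 64-bit flags word per page frame number.
    """
    count = 0
    while True:
        buf = stream.read(chunk_entries * 8)
        if not buf:
            break
        # Ignore a trailing partial word, if any.
        usable = len(buf) - len(buf) % 8
        for (flags,) in struct.iter_unpack("=Q", buf[:usable]):
            # Same condition as "-b huge,hwpoison=huge,hwpoison".
            if flags & MASK == MASK:
                count += 1
    return count
```

To use it on a live system, open /proc/kpageflags in binary mode as root and
pass the file object to count_poisoned_huge(). Scanning every page frame
this way is slow on large machines, and it needs the same privilege as
page-types, so it does not help the unprivileged use cases mentioned above.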