> On 2024/11/21 12:55, Tomohiro Misono wrote: > > commit 44b8f8bf2438 ("mm: memory-failure: add memory failure stats > > Sorry for late, I've been swamped recently. Hi, Thanks for your comments. > > > to sysfs") introduces per NUMA memory error stats which show > > breakdown of HardwareCorrupted of /proc/meminfo in > > /sys/devices/system/node/nodeX/memory_failure. > > Thanks for your patch. > > > > > However, HardwareCorrupted also counts soft-offline pages. So, add > > soft-offline stats in mf_stats too to represent more accurate status. > > Adding soft-offline stats makes sense to me. Thanks for confirming. > > > > > This updates total count as: > > total = recovered + ignored + failed + delayed + soft_offline> > > Test example: > > 1) # grep HardwareCorrupted /proc/meminfo > > HardwareCorrupted: 0 kB > > 2) soft-offline 1 page by madvise(MADV_SOFT_OFFLINE) > > 3) # grep HardwareCorrupted /proc/meminfo > > HardwareCorrupted: 4 kB > > # grep -r "" /sys/devices/system/node/node0/memory_failure > > /sys/devices/system/node/node0/memory_failure/total:1 > > /sys/devices/system/node/node0/memory_failure/soft_offline:1 > > /sys/devices/system/node/node0/memory_failure/recovered:0 > > /sys/devices/system/node/node0/memory_failure/ignored:0 > > /sys/devices/system/node/node0/memory_failure/failed:0 > > /sys/devices/system/node/node0/memory_failure/delayed:0 > > > > Signed-off-by: Tomohiro Misono <misono.tomohiro@xxxxxxxxxxx> > > --- > > Hello > > > > This is RFC because I'm not sure adding SOFT_OFFLINE in enum > > mf_result is a right approach. Also, maybe is it better to move > > update_per_node_mf_stats() into num_poisoned_pages_inc()? > > > > I omitted some cleanups and sysfs doc update in this version to > > highlight changes. I'd appreciate any suggestions. > > > > Regards, > > Tomohiro Misono > > > > include/linux/mm.h | 2 ++ > > include/linux/mmzone.h | 4 +++- > > mm/memory-failure.c | 9 +++++++++ > > 3 files changed, 14 insertions(+), 1 deletion(-) > > > > diff --git a/include/linux/mm.h b/include/linux/mm.h > > index 5d6cd523c7c0..7f93f6883760 100644 > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -3991,6 +3991,8 @@ enum mf_result { > > MF_FAILED, /* Error: handling failed */ > > MF_DELAYED, /* Will be handled later */ > > MF_RECOVERED, /* Successfully recovered */ > > + > > + MF_RES_SOFT_OFFLINE, /* Soft-offline */ > > It might not be a good idea to add MF_RES_SOFT_OFFLINE here. 'mf_result' is used to record > the result of memory failure handler. So it might be inappropriate to add MF_RES_SOFT_OFFLINE here. Understood. As I don't see other suitable place to put ENUM value, how about changing like below? Or, do you prefer adding another ENUM type instead of this? ``` static void update_per_node_mf_stats(unsigned long pfn, - enum mf_result result) + enum mf_result result, bool is_soft_offline) { int nid = MAX_NUMNODES; struct memory_failure_stats *mf_stats = NULL; @@ -1299,6 +1299,12 @@ static void update_per_node_mf_stats(unsigned long pfn, } mf_stats = &NODE_DATA(nid)->mf_stats; + if (is_soft_offline) { + ++mf->stats->soft_offlined; + ++mf_stats->total; + return; + } + switch (result) { case MF_IGNORED: ++mf_stats->ignored; ``` Regards, Tomohiro Misono > > > Thanks. > .