Hi Ross, On Tue, 23 May 2017 15:25:57 -0600 Ross Zwisler wrote: > One of the primary motivations for adding tracepoints to the DAX PMD path > was to allow the user to diagnose whether their system was actually using > PMDs, and if not to help them understand why. For me at least this has > worked okay in some situations, but many times I find myself adding more > debugging to diagnose fallback reasons that aren't immediately obvious, or > situations where the current tracepoints are simply insufficient because > they don't give you enough information. > > This series adds short fallback reason strings to the tracepoints in the > PMD path with the intention of giving the user better information about why > their system is falling back to PTEs. > > So, for example, the recent case on my system was that I forgot to move my > namespace from "raw" mode to "memory" mode, which resulted in this output: > > big-1046 [000] .... 103.930950: dax_pmd_fault: dev 259:0 ino 0xc shared > WRITE|ALLOW_RETRY|KILLABLE|USER address 0x10505000 vm_start 0x10200000 > vm_end 0x10700000 pgoff 0x305 max_pgoff 0x1400 > > big-1046 [000] .... 103.931220: dax_pmd_insert_mapping_fallback: dev > 259:0 ino 0xc shared write address 0x10505000 length 0x200000 pfn 0x24a200 > DEV radix_entry 0x0 > > big-1046 [000] .... 103.931222: dax_pmd_fault_done: dev 259:0 ino 0xc > shared WRITE|ALLOW_RETRY|KILLABLE|USER address 0x10505000 vm_start > 0x10200000 vm_end 0x10700000 pgoff 0x305 max_pgoff 0x1400 FALLBACK > > The issue is that the PFN_MAP flag isn't set because we're lacking struct > page for our PMEM namespace. It's not immediately obvious why this > fallback happened, and actually you can't even diagnose it because we mask > off the pfn_t flags before printing the PFN. :( > > This new output: > > big-1011 [000] .... 36.164708: dax_pmd_fault: dev 259:0 ino 0xc shared > WRITE|ALLOW_RETRY|KILLABLE|USER address 0x10505000 vm_start 0x10200000 > vm_end 0x10700000 pgoff 0x305 max_pgoff 0x1400 > > big-1011 [000] .... 36.165521: dax_pmd_insert_mapping_fallback: dev > 259:0 ino 0xc shared write address 0x10505000 length 0x200000 pfn > 0x2000000000249200 DEV radix_entry 0x0 pfn_t not devmap > > big-1011 [000] .... 36.165524: dax_pmd_fault_done: dev 259:0 ino 0xc > shared WRITE|ALLOW_RETRY|KILLABLE|USER address 0x10505000 vm_start > 0x10200000 vm_end 0x10700000 pgoff 0x305 max_pgoff 0x1400 FALLBACK > > adds the "pfn_t not devmap" string to the second line, telling the user > exactly what's going on. I also stopped masking off the pfn_t flags so the > output was more complete. I think this idea is good. How about adding a suffix to be easy for grep? For example the suffix, DAX_FAULT_. And, to simplify the code, how about moving the string of fault reason to include/trace/events/fs_dax.h, like as include/trace/events/kvm.h? Regards, Masayoshi Mizuma > > My only concern is that somehow adding strings like this to tracepoint > output, brief and useful though they may be, is somehow breaking what > tracepoints are supposed to be doing. If anyone feels strongly about this > I guess I can just keep these changes locally and try to keep enhancing the > existing output without adding strings. > > Ross Zwisler (3): > dax: add fallback reason to dax_iomap_pmd_fault() > dax: add fallback reason to dax_pmd_insert_mapping() > dax: add fallback reason to dax_pmd_load_hole() > > fs/dax.c | 76 +++++++++++++++++++++++++++++++------------ > include/trace/events/fs_dax.h | 50 +++++++++++++++++----------- > 2 files changed, 86 insertions(+), 40 deletions(-) > > -- > 2.9.4 > > _______________________________________________ > Linux-nvdimm mailing list > Linux-nvdimm@xxxxxxxxxxxx > https://lists.01.org/mailman/listinfo/linux-nvdimm