> On Jan 9, 2018, at 5:35 PM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: > > > >> On Jan 4, 2018, at 2:36 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: >> >> On Thu, 4 Jan 2018 14:10:21 -0500 >> Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: >> >>>>> #define nfs_show_stable(stable) \ >>>>> __print_symbolic(stable, \ >>>>> { NFS_UNSTABLE, " (UNSTABLE)" }, \ >>>>> { NFS_DATA_SYNC, " (DATA_SYNC)" }, \ >>>>> { NFS_FILE_SYNC, " (FILE_SYNC)" }) >>>>> >>>>> When the field contains 0 it displays "(UNSTABLE)", and when the field >>>>> contains a non-zero value the displayed symbol is blank. >>>> nfs_initiate_write, >>>> Can you show my what is in >> >> Are you sure that it isn't another number there? >> >>>> >>>> /sys/kernel/debug/tracing/events/nfs/nfs_initiate_write/format >>> >>> This is with current v4.15-rc6 code base: >>> >>> name: nfs_initiate_write >>> ID: 1818 >>> format: >>> field:unsigned short common_type; offset:0; size:2; signed:0; >>> field:unsigned char common_flags; offset:2; size:1; signed:0; >>> field:unsigned char common_preempt_count; offset:3; size:1; signed:0; >>> field:int common_pid; offset:4; size:4; signed:1; >>> >>> field:loff_t offset; offset:8; size:8; signed:1; >>> field:unsigned long count; offset:16; size:8; signed:0; >>> field:enum nfs3_stable_how stable; offset:24; size:4; signed:1; >>> field:dev_t dev; offset:28; size:4; signed:0; >>> field:u32 fhandle; offset:32; size:4; signed:0; >>> field:u64 fileid; offset:40; size:8; signed:0; >>> >>> print fmt: "fileid=%02x:%02x:%llu fhandle=0x%08x offset=%lld count=%lu stable=%d%s", ((unsigned int) ((REC->dev) >> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), (unsigned long long)REC->fileid, REC->fhandle, REC->offset, REC->count, REC->stable, __print_symbolic(REC->stable, { 0, " (UNSTABLE)" }, { 1, " (DATA_SYNC)" }, { 2, " (FILE_SYNC)" }) >> >> This looks like it should work fine. >> >> Can you try this: >> >> trace-cmd record -e nfs_initiate_write >> >> do stuff to trigger the event, then hit Ctrl^C >> >> See if "trace-cmd report" shows it. If not do: >> >> trace-cmd report -R >> >> Which will suppress the translation of the output and show you the raw >> values. That way, you will see what is in the stable field. > > This is with: > > TRACE_DEFINE_ENUM(NFS_UNSTABLE); > TRACE_DEFINE_ENUM(NFS_DATA_SYNC); > TRACE_DEFINE_ENUM(NFS_FILE_SYNC); > > #define nfs_show_stable(stable) \ > __print_symbolic(stable, \ > { NFS_UNSTABLE, "NFS_UNSTABLE" }, \ > { NFS_DATA_SYNC, "NFS_DATA_SYNC" }, \ > { NFS_FILE_SYNC, "NFS_FILE_SYNC" }) > > Here's "cat /sys/kernel/debug/events/tracing/nfs/nfs_initiate_write/format" : > > print_fmt: "fileid=%02x:%02x:%llu fhandle=0x%08x offset=%lld count=%lu stable=%d (%s)", ((unsigned int) ((REC->dev) >> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), (unsigned long long)REC->fileid, REC->fhandle, REC->offset, REC->count, REC->stable, __print_symbolic(REC->stable, { 0, "NFS_UNSTABLE" }, { 1, "NFS_DATA_SYNC" }, { NFS_FILE_SYNC, "NFS_FILE_SYNC" }) > > The last element in the __print_symbolic table is wrong. For comparison, > here's an excerpt of "strings nfs.ko | grep NFS_FILE_SYNC" : > > "fileid=%02x:%02x:%llu fhandle=0x%08x offset=%lld count=%lu stable=%d (%s)", ((unsigned int) ((REC->dev) >> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), (unsigned long long)REC->fileid, REC->fhandle, REC->offset, REC->count, REC->stable, __print_symbolic(REC->stable, { NFS_UNSTABLE, "NFS_UNSTABLE" }, { NFS_DATA_SYNC, "NFS_DATA_SYNC" }, { NFS_FILE_SYNC, "NFS_FILE_SYNC" }) > > Note this copy of the table contains three symbolic names. I haven't been > able to find the code that adjusts the print_fmt string to understand the > difference with the output from "cat /sys/kernel ..." . But that code > doesn't seem to adjust the last element in the __print_symbolic table. > > > trace-cmd report > > <...>-11353 [001] 6211.728494: nfs_initiate_write: fileid=00:2a:301409 fhandle=0x820b4b2e offset=0 count=601 stable=2 () > <...>-11353 [004] 6211.728783: nfs_initiate_write: fileid=00:2a:301410 fhandle=0x0c844ccd offset=0 count=738 stable=2 () > <...>-11353 [001] 6211.729023: nfs_initiate_write: fileid=00:2a:301411 fhandle=0xc02e4c53 offset=0 count=4015 stable=2 () > <...>-11353 [005] 6211.729351: nfs_initiate_write: fileid=00:2a:301412 fhandle=0xcaeb454a offset=0 count=282 stable=2 () > <...>-11353 [003] 6211.729651: nfs_initiate_write: fileid=00:2a:301413 fhandle=0x064145d4 offset=0 count=7688 stable=2 () > > trace-cmd report -R > > <...>-11353 [001] 6211.728494: nfs_initiate_write: offset=0 count=0x259 stable=2 dev=42 fhandle=2181778222 fileid=301409 > <...>-11353 [004] 6211.728783: nfs_initiate_write: offset=0 count=0x2e2 stable=2 dev=42 fhandle=209997005 fileid=301410 > <...>-11353 [001] 6211.729023: nfs_initiate_write: offset=0 count=0xfaf stable=2 dev=42 fhandle=3224259667 fileid=301411 > <...>-11353 [005] 6211.729351: nfs_initiate_write: offset=0 count=0x11a stable=2 dev=42 fhandle=3404416330 fileid=301412 > <...>-11353 [003] 6211.729651: nfs_initiate_write: offset=0 count=0x1e08 stable=2 dev=42 fhandle=104941012 fileid=301413 > > Thanks for your advice! The problem is in this function (kernel/trace/trace_events.c): 2212 void trace_event_eval_update(struct trace_eval_map **map, int len) 2213 { 2214 struct trace_event_call *call, *p; 2215 const char *last_system = NULL; 2216 int last_i; 2217 int i; 2218 2219 down_write(&trace_event_sem); 2220 list_for_each_entry_safe(call, p, &ftrace_events, list) { 2221 /* events are usually grouped together with systems */ 2222 if (!last_system || call->class->system != last_system) { 2223 last_i = 0; 2224 last_system = call->class->system; 2225 } 2226 2227 for (i = last_i; i < len; i++) { 2228 if (call->class->system == map[i]->system) { 2229 /* Save the first system if need be */ 2230 if (!last_i) 2231 last_i = i; 2232 update_event_printk(call, map[i]); 2233 } 2234 } 2235 } 2236 up_write(&trace_event_sem); 2237 } Loading the nfs.ko module adds an eval map consisting of three entries: entry 0 is NFS_FILE_SYNC with a value of 2 entry 1 is NFS_DATA_SYNC with a value of 1 entry 2 is NFS_UNSTABLE with a value of 0 The nfs.ko trace point print_fmt strings are updated by trace_event_eval_update. For the first trace point in nfs.ko, the first iteration of the inner loop visits all three of these map entries, but it's print_fmt does not contain any of these enum symbols, so nothing changes. Now thanks to the "last_i" logic above, entry 0 (NFS_FILE_SYNC) is skipped over for other trace points in the module. Thus when the print_fmt strings for "nfs_initiate_write" and "nfs_writeback_done" are updated, they are only partially converted, hence: > "fileid=%02x:%02x:%llu fhandle=0x%08x offset=%lld count=%lu stable=%d (%s)", ((unsigned int) ((REC->dev) >> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), (unsigned long long)REC->fileid, REC->fhandle, REC->offset, REC->count, REC->stable, __print_symbolic(REC->stable, { 0, "NFS_UNSTABLE" }, { 1, "NFS_DATA_SYNC" }, { NFS_FILE_SYNC, "NFS_FILE_SYNC" }) The updated string contains integers for the first two items in the symbol table, and has an unconverted third item, which cannot be used by "trace-cmd report". Adding another TRACE_DEFINE_ENUM in fs/nfs/nfstrace.h works around this issue. I added TRACE_DEFINE_ENUM(NFS_INVALID_STABLE_HOW); _after_ the first three, and that became entry 0 in the nfs system eval_map. That entry is unusable due to this bug, but I didn't add it to any trace points. The updated print_fmt strings are now 100% correct. "trace-cmd report" works exactly as intended. My inclination is to remove the last_i logic from this function, but I don't understand what the last_system/last_i logic is attempting accomplish. Do you have any suggestions? -- Chuck Lever -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html