On Tue, Nov 9, 2010 at 6:22 AM, Neil Horman <nhorman@xxxxxxxxxxxxx> wrote:
> On Mon, Nov 08, 2010 at 12:33:35PM -0800, Mike Waychison wrote:
>> We need to be able to gather information about the CPUs that caused the crash.
>>
>> This commit only handles x86, but it is desirable to come up with some new
>> packet format that can accommodate any architecture.
>>
>> Signed-off-by: Mike Waychison <mikew@xxxxxxxxxx>
>> ---
>> TODO: This should be made more general to other architectures.  As is, we are
>> probably okay exporting some value for the 'arch' field.  Different
>> architectures though will likely want to gather different data.
>> ---
>>  drivers/net/netoops.c |   27 +++++++++++++++++++++------
>>  1 files changed, 21 insertions(+), 6 deletions(-)
>>
> Not sure I see the value in encapsulating arch specific data in a netoops
> message.  Ostensibly this information can be inferred at the time of the crash
> by the name/ip of the system crashing (one presumes that the sysadmin knows what
> systems are what arch, or can look it up easily).

This actually becomes harder than it appears at first.  The distributed
nature of our systems means that we can never rely on a central data
source that describes our machines without having to worry about
network partitions and service downtime.  The alternative is to
post-process crashes, looking up machine information in various data
sources and hoping that the results are consistent.  This becomes yet
another job in the cluster, which seems a little silly when we could
just have the machine describe itself at the time of the crash.

>
> If thats not the case, why not just dump out the contents of /proc/cpuinfo in
> ascii form, so that no arch specific data is needed?

As a segment of the dump?  I'm okay with doing this, as long as it never
makes its way into log_buf.  log_buf is a real pain to parse given the
lack of transactions and the fact that many other cores may be
scribbling all over it.

A couple of years ago, we specced out a different wire protocol for
these packets, version 3 (yes, this has already had a version bump).
We came up with a design that used (key,length)->value fields.  Keys
were designed to be 16-bit integers, and clients could easily ignore
fields they don't understand.  We never implemented this, but it'd be
great if folks bought into it.  It'd allow us to ship things like file
contents side by side with other structured fields like pt_regs
snapshots, the log_buf and a user-defined buffer.

How do folks feel about something like that?
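
To make the (key,length)->value idea above concrete, here is a rough
userspace sketch of how such fields might be packed and walked.  It is
not the version 3 format, which was never implemented.  Only the 16-bit
key comes from the description; the 16-bit length, the big-endian byte
order, the key numbers and the names field_put/field_find are all
assumptions made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key numbers, for illustration only. */
enum {
	FIELD_LOG_BUF = 1,
	FIELD_CPUINFO = 2,
};

/* 16-bit key followed by a 16-bit length (the length width is assumed). */
#define FIELD_HDR_LEN 4

/*
 * Append one field at offset 'off' in 'buf' (capacity 'cap').
 * Returns the new offset, or 0 if the field does not fit.
 * Key and length are written big-endian (assumed byte order).
 */
static size_t field_put(uint8_t *buf, size_t cap, size_t off,
			uint16_t key, const void *val, uint16_t len)
{
	if (off > cap || cap - off < (size_t)FIELD_HDR_LEN + len)
		return 0;
	buf[off + 0] = (uint8_t)(key >> 8);
	buf[off + 1] = (uint8_t)(key & 0xff);
	buf[off + 2] = (uint8_t)(len >> 8);
	buf[off + 3] = (uint8_t)(len & 0xff);
	memcpy(buf + off + FIELD_HDR_LEN, val, len);
	return off + FIELD_HDR_LEN + len;
}

/*
 * Find the first field whose key is 'want' in a packet of 'size' bytes.
 * Fields with any other key are skipped using their length, so a reader
 * never needs to understand them.  Returns a pointer to the value and
 * stores its length in *len_out, or returns NULL if the key is absent
 * or the packet is truncated.
 */
static const uint8_t *field_find(const uint8_t *buf, size_t size,
				 uint16_t want, uint16_t *len_out)
{
	size_t off = 0;

	while (size - off >= FIELD_HDR_LEN) {
		uint16_t key = (uint16_t)((buf[off] << 8) | buf[off + 1]);
		uint16_t len = (uint16_t)((buf[off + 2] << 8) | buf[off + 3]);

		off += FIELD_HDR_LEN;
		if (size - off < len)
			return NULL;	/* value runs past the end of the packet */
		if (key == want) {
			*len_out = len;
			return buf + off;
		}
		off += len;		/* not the key we want: step over it */
	}
	return NULL;
}
```

A receiver that only knows FIELD_CPUINFO can call field_find() on a
packet that also carries keys it has never heard of; those fields are
stepped over by length alone, which is the property that lets old
clients keep working as new fields are added.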